Open juliusbierk opened 4 years ago
Btw., CUDA seems to get stuck on the `__exit__` part of `with Tape`, i.e. when calculating the gradients of `paint()`.
Aha, found the problem.
Apparently gradients do not support the "smart indexing" used in the for loops.
Replacing `paint` with

```python
@ti.kernel
def paint(t: ti.f32):
    for i in range(n * 2):
        for j in range(n):
            loss[None] += pixels[i, j] * pixels[i, j]
```

allows it to run on the GPU.
This is strange; the example in the documentation suggests that smart indexing is the way to go: https://taichi.readthedocs.io/en/stable/hello.html
Also, according to the documentation, the 2nd version of `paint` should be slower, because only the outermost scope (in your case the loop `for i in range(n * 2)`) would be parallelized: https://taichi.readthedocs.io/en/stable/hello.html#parallel-for-loops
A small observation: you are using `ti.var`, while the example uses `ti.field`. I cannot find anything about `ti.var` in the documentation. What is `ti.var`?
I am trying to get my head around the examples, so I cannot help much more, but I hope this points you in the right direction.
Hi @robertour. Thanks for your reply. I opened this issue back in February... I'm sure many things have changed since then (e.g. `ti.var` no longer being used). Perhaps it also just works now.
First of all: cool library. I am trying to familiarize myself with it.
I tried to make a simple example. The code makes an image with a black-to-white gradient and uses a loss function to darken the image. It runs fast on the CPU, but cannot even render the first frame on the GPU (an RTX 2080 Ti): it keeps the GPU at 100 % utilization, but nothing happens. I can run other examples just fine on the GPU.
Are there any glaring misunderstandings on my part?
Thank you in advance for your help.