Hi Artyom,
I've tried using dlprimitives with a diffusion model on the MNIST dataset, and my run hits the following bug:
(torch) C:\deps\pytorch_dlprim\Diffusion>python train_mnist.py --device=ocl:0
Accessing device #0:NVIDIA GeForce RTX 2060 on NVIDIA CUDA
C:\deps\pytorch_dlprim\Diffusion\model.py:63: UserWarning: The operator 'aten::gather.out' is not currently supported on the ocl backend. Please open an issue at for requesting support https://github.com/artyom-beilis/pytorch_dlprim/issues (Triggered internally at C:\deps\pytorch_dlprim\src\tensor_ops.cpp:313.)
return self.sqrt_alphas_cumprod.gather(-1,t).reshape(x_0.shape[0],1,1,1)*x_0+ \
C:\deps\pytorch_dlprim\Diffusion\model.py:64: UserWarning: The operator 'aten::gather.out' is not currently supported on the ocl backend. Please open an issue at for requesting support https://github.com/artyom-beilis/pytorch_dlprim/issues (Triggered internally at C:\deps\pytorch_dlprim\src\tensor_ops.cpp:313.)
self.sqrt_one_minus_alphas_cumprod.gather(-1,t).reshape(x_0.shape[0],1,1,1)*noise
C:\venv\torch\lib\site-packages\torch\nn\functional.py:2210: UserWarning: The operator 'aten::index_select' is not currently supported on the ocl backend. Please open an issue at for requesting support https://github.com/artyom-beilis/pytorch_dlprim/issues (Triggered internally at C:\deps\pytorch_dlprim\src\tensor_ops.cpp:313.)
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
C:\venv\torch\lib\site-packages\torch\nn\functional.py:3950: UserWarning: The operator 'aten::upsample_bilinear2d.out' is not currently supported on the ocl backend. Please open an issue at for requesting support https://github.com/artyom-beilis/pytorch_dlprim/issues (Triggered internally at C:\deps\pytorch_dlprim\src\tensor_ops.cpp:313.)
return torch._C._nn.upsample_bilinear2d(input, output_size, align_corners, scale_factors)
C:\venv\torch\lib\site-packages\torch\autograd\__init__.py:197: UserWarning: The operator 'aten::upsample_bilinear2d_backward.grad_input' is not currently supported on the ocl backend. Please open an issue at for requesting support https://github.com/artyom-beilis/pytorch_dlprim/issues (Triggered internally at C:\deps\pytorch_dlprim\src\tensor_ops.cpp:313.)
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
C:\venv\torch\lib\site-packages\torch\autograd\__init__.py:197: UserWarning: The operator 'aten::embedding_dense_backward' is not currently supported on the ocl backend. Please open an issue at for requesting support https://github.com/artyom-beilis/pytorch_dlprim/issues (Triggered internally at C:\deps\pytorch_dlprim\src\tensor_ops.cpp:313.)
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
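These warnings mean the listed ops fall back to the CPU rather than failing, so training still runs. A possible way to avoid the `aten::gather.out` round-trips is to keep the (small, 1-D) noise-schedule tensors on the CPU, gather there, and move only the gathered coefficients to the device. This is just a sketch, not verified against the ocl backend; the `extract` helper and the buffer names below are my own, chosen to mirror what model.py seems to do:

```python
import torch

def extract(schedule_cpu: torch.Tensor, t: torch.Tensor, shape) -> torch.Tensor:
    """Gather per-timestep coefficients on the CPU, then move the small
    result to t's device, so aten::gather never hits the ocl backend."""
    out = schedule_cpu.gather(-1, t.cpu())           # gather runs on CPU
    return out.reshape(shape[0], 1, 1, 1).to(t.device)

# usage sketch (hypothetical sizes):
sqrt_alphas_cumprod = torch.rand(1000)               # schedule kept on CPU
t = torch.randint(0, 1000, (8,))                     # per-sample timesteps
coeff = extract(sqrt_alphas_cumprod, t, (8, 1, 28, 28))
```

The gathered tensor is only `batch_size` elements, so the host-to-device copy should be negligible next to the conv layers.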
Epoch[1/100], Step[0/469], loss:1.13105, lr:0.00004
Epoch[1/100], Step[10/469], loss:1.10539, lr:0.00004
Epoch[1/100], Step[20/469], loss:1.08877, lr:0.00004
Epoch[1/100], Step[30/469], loss:1.06681, lr:0.00004
Epoch[1/100], Step[40/469], loss:1.04868, lr:0.00004
Epoch[1/100], Step[50/469], loss:1.03398, lr:0.00004
Epoch[1/100], Step[60/469], loss:1.02504, lr:0.00004
Epoch[1/100], Step[70/469], loss:1.01928, lr:0.00004
Epoch[1/100], Step[80/469], loss:0.99907, lr:0.00004
Epoch[1/100], Step[90/469], loss:0.98367, lr:0.00004
Epoch[1/100], Step[100/469], loss:0.96129, lr:0.00004
Epoch[1/100], Step[110/469], loss:0.95355, lr:0.00004
Epoch[1/100], Step[120/469], loss:0.94075, lr:0.00004
Epoch[1/100], Step[130/469], loss:0.92182, lr:0.00004
Epoch[1/100], Step[140/469], loss:0.89627, lr:0.00004
Epoch[1/100], Step[150/469], loss:0.89367, lr:0.00004
Epoch[1/100], Step[160/469], loss:0.86775, lr:0.00004
Epoch[1/100], Step[170/469], loss:0.86031, lr:0.00004
Epoch[1/100], Step[180/469], loss:0.83219, lr:0.00004
Epoch[1/100], Step[190/469], loss:0.81324, lr:0.00004
Epoch[1/100], Step[200/469], loss:0.79009, lr:0.00004
Epoch[1/100], Step[210/469], loss:0.78913, lr:0.00004
Epoch[1/100], Step[220/469], loss:0.76493, lr:0.00004
Epoch[1/100], Step[230/469], loss:0.75102, lr:0.00004
Epoch[1/100], Step[240/469], loss:0.74663, lr:0.00004
Epoch[1/100], Step[250/469], loss:0.71013, lr:0.00004
Epoch[1/100], Step[260/469], loss:0.70468, lr:0.00004
Epoch[1/100], Step[270/469], loss:0.69016, lr:0.00004
Epoch[1/100], Step[280/469], loss:0.69655, lr:0.00004
Epoch[1/100], Step[290/469], loss:0.67101, lr:0.00004
Epoch[1/100], Step[300/469], loss:0.68522, lr:0.00004
Epoch[1/100], Step[310/469], loss:0.65023, lr:0.00004
Epoch[1/100], Step[320/469], loss:0.65304, lr:0.00004
Epoch[1/100], Step[330/469], loss:0.64394, lr:0.00004
Epoch[1/100], Step[340/469], loss:0.62623, lr:0.00004
Epoch[1/100], Step[350/469], loss:0.61216, lr:0.00004
Epoch[1/100], Step[360/469], loss:0.61336, lr:0.00004
Epoch[1/100], Step[370/469], loss:0.59924, lr:0.00004
Epoch[1/100], Step[380/469], loss:0.58977, lr:0.00004
Epoch[1/100], Step[390/469], loss:0.59183, lr:0.00004
Epoch[1/100], Step[400/469], loss:0.57439, lr:0.00004
Epoch[1/100], Step[410/469], loss:0.56940, lr:0.00004
Epoch[1/100], Step[420/469], loss:0.56418, lr:0.00004
Epoch[1/100], Step[430/469], loss:0.55326, lr:0.00004
Epoch[1/100], Step[440/469], loss:0.55371, lr:0.00004
Epoch[1/100], Step[450/469], loss:0.53270, lr:0.00004
Epoch[1/100], Step[460/469], loss:0.53683, lr:0.00004
Epoch 1 completed in 405.32 seconds
Sampling: 0%| | 0/1000 [00:00<?, ?it/s]C:\deps\pytorch_dlprim\Diffusion\model.py:99: UserWarning: The operator 'aten::gather.out' is not currently supported on the ocl backend. Please open an issue at for requesting support https://github.com/artyom-beilis/pytorch_dlprim/issues (Triggered internally at C:\deps\pytorch_dlprim\src\tensor_ops.cpp:313.)
alpha_t=self.alphas.gather(-1,t).reshape(x_t.shape[0],1,1,1)
C:\deps\pytorch_dlprim\Diffusion\model.py:100: UserWarning: The operator 'aten::gather.out' is not currently supported on the ocl backend. Please open an issue at for requesting support https://github.com/artyom-beilis/pytorch_dlprim/issues (Triggered internally at C:\deps\pytorch_dlprim\src\tensor_ops.cpp:313.)
alpha_t_cumprod=self.alphas_cumprod.gather(-1,t).reshape(x_t.shape[0],1,1,1)
C:\deps\pytorch_dlprim\Diffusion\model.py:101: UserWarning: The operator 'aten::gather.out' is not currently supported on the ocl backend. Please open an issue at for requesting support https://github.com/artyom-beilis/pytorch_dlprim/issues (Triggered internally at C:\deps\pytorch_dlprim\src\tensor_ops.cpp:313.)
beta_t=self.betas.gather(-1,t).reshape(x_t.shape[0],1,1,1)
1 error generated.
Failed Program Code:
Sampling: 0%| | 0/1000 [00:00<?, ?it/s]
Traceback (most recent call last):
File "C:\deps\pytorch_dlprim\Diffusion\train_mnist.py", line 148, in <module>
main(args)
File "C:\deps\pytorch_dlprim\Diffusion\train_mnist.py", line 143, in main
samples = model_ema.module.sampling(args.n_samples, clipped_reverse_diffusion=not args.no_clip, device=device)
File "C:\venv\torch\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "C:\deps\pytorch_dlprim\Diffusion\model.py", line 45, in sampling
x_t=self._reverse_diffusion_with_clip(x_t,t,noise)
File "C:\venv\torch\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "C:\deps\pytorch_dlprim\Diffusion\model.py", line 106, in _reverse_diffusion_with_clip
if t.min()>0:
RuntimeError: Failed to build program source pointwise_broadcast_reduce with parameters -cl-std=CL2.0 -DREDUCE_DIMS=1 -DSMALL_REDUCTION=1 -DDIMS=1 -DWG_SIZE=0 -DITEMS_PER_WI=36 -DTWO_STAGE_REDUCTION=0 log:
For device: NVIDIA GeForce RTX 2060
:322:5: error: use of undeclared identifier 'LONG'
REDUCE_INIT_ALL
^
:7:13: note: expanded from macro 'REDUCE_INIT_ALL'
reduce_y0 = LONG;
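If I read the log right, the crash is triggered by `t.min()` in `_reverse_diffusion_with_clip`: reducing the int64 timestep tensor makes dlprimitives emit a `pointwise_broadcast_reduce` kernel whose `REDUCE_INIT_ALL` macro expands to the undeclared identifier `LONG`. As a temporary workaround (a sketch only, not verified against the ocl backend), the reduction can be done on a CPU copy of `t`, which is tiny:

```python
import torch

t = torch.randint(0, 1000, (8,))   # in the real model this lives on ocl:0

# instead of: if t.min() > 0:      # int64 reduction compiled for the device
if t.cpu().min() > 0:              # batch-sized copy, so this is cheap
    pass  # ...reverse diffusion step that adds noise...
```

An alternative would be casting before reducing, e.g. `t.int().min()`, if int32 reductions compile; I haven't tried that.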
Do you know how to fix this?
Here is my code: MNIST Diffusion.zip
Best wishes, Haozhe
@tangjinchuan