Closed: nvnvashisth closed this issue 3 years ago
It seems like your image is RGBA. Can you convert it to RGB, or do you have to use RGBA? If the latter, try writing your own dataloader. Can you provide data and code to reproduce the error?
OK, I converted everything to RGB. My labels are in the range [0-9] and the image size is 256x256, but I ran into this CUDA error. One more thing: I am running this in Colab.
The code is taken directly from here: https://www.kaggle.com/abhishek/tez-faster-and-easier-training-for-leaf-detection ;)
Loaded pretrained weights for efficientnet-b4
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-29-de0853739e51> in <module>()
11 epochs=10,
12 callbacks=[es],
---> 13 fp16=True
14 )
15 model.save("model.bin")
6 frames
/usr/local/lib/python3.6/dist-packages/tez/model/model.py in fit(self, train_dataset, valid_dataset, train_sampler, valid_sampler, device, epochs, train_bs, valid_bs, n_jobs, callbacks, fp16)
289 n_jobs=n_jobs,
290 callbacks=callbacks,
--> 291 fp16=fp16,
292 )
293
/usr/local/lib/python3.6/dist-packages/tez/model/model.py in _init_model(self, device, train_dataset, valid_dataset, train_sampler, valid_sampler, train_bs, valid_bs, n_jobs, callbacks, fp16)
81
82 if next(self.parameters()).device != device:
---> 83 self.to(device)
84
85 if self.train_loader is None:
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in to(self, *args, **kwargs)
610 return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
611
--> 612 return self._apply(convert)
613
614 def register_backward_hook(
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
357 def _apply(self, fn):
358 for module in self.children():
--> 359 module._apply(fn)
360
361 def compute_should_use_set_data(tensor, tensor_applied):
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
357 def _apply(self, fn):
358 for module in self.children():
--> 359 module._apply(fn)
360
361 def compute_should_use_set_data(tensor, tensor_applied):
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
379 # `with torch.no_grad():`
380 with torch.no_grad():
--> 381 param_applied = fn(param)
382 should_use_set_data = compute_should_use_set_data(param, param_applied)
383 if should_use_set_data:
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in convert(t)
608 if convert_to_format is not None and t.dim() == 4:
609 return t.to(device, dtype if t.is_floating_point() else None, non_blocking, memory_format=convert_to_format)
--> 610 return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
611
612 return self._apply(convert)
RuntimeError: CUDA error: device-side assert triggered
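A note for readers hitting the same message: in classification pipelines, `device-side assert triggered` is often raised when a target label falls outside `[0, num_classes - 1]`, so it can be worth checking the labels on the CPU before training. Below is a minimal sketch, assuming plain integer class labels. `find_invalid_labels` is a hypothetical helper, not part of tez or PyTorch, and nothing in this thread confirms that labels were the cause here.

```python
def find_invalid_labels(labels, num_classes):
    """Return (index, label) pairs whose label is outside [0, num_classes - 1]."""
    return [
        (i, label)
        for i, label in enumerate(labels)
        if not isinstance(label, int) or not 0 <= label < num_classes
    ]


# Hypothetical labels for a 10-class problem; the value 10 is out of range.
example_labels = [0, 3, 9, 10, 7]
bad = find_invalid_labels(example_labels, num_classes=10)
print(bad)
```

An empty result means every label can be used as a class index for a model with `num_classes` outputs.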
The code provided in the examples works quite well. This seems like a problem with the model. I can't say more without the data and the full code to reproduce the error :)
@nvnvashisth I just added a multi-class classification example (flower classification with 104 classes). It might be useful for you: https://github.com/abhishekkrthakur/tez/blob/main/examples/image_classification/flower_classification.py
Let me know if it still doesn't work.
> The code provided in the examples works quite well. This seems like a problem with the model. I can't say more without the data and the full code to reproduce the error :)
I have sent you the code privately on Twitter (DM). That was the only way I could find to reach you privately.
> @nvnvashisth I just added a multi-class classification example (flower classification with 104 classes). It might be useful for you: https://github.com/abhishekkrthakur/tez/blob/main/examples/image_classification/flower_classification.py
> Let me know if it still doesn't work.
I'll give it a try. Thanks
@abhishekkrthakur It is so weird. I didn't really change anything and it started working. No more CUDA error. Thanks for the quick support.
Wow. Maybe you updated torch?
Not really. I was running in Colab and using the default version.
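One possible explanation, offered as an assumption rather than something confirmed in this thread: CUDA errors are reported asynchronously, and after a device-side assert the later CUDA calls in the same process keep failing. That would fit the traceback above ending at `self.to(device)` rather than at a loss or indexing operation, and a fresh Colab runtime would clear the state. To get a traceback that points at the operation that actually failed, a common approach is to make kernel launches synchronous via the `CUDA_LAUNCH_BLOCKING` environment variable. A minimal sketch:

```python
import os

# Set this before CUDA is initialized (safest: before importing torch),
# so kernel launches run synchronously and errors surface at the failing call.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

print(os.environ["CUDA_LAUNCH_BLOCKING"])
```

In a notebook this belongs in the first cell after restarting the runtime. Running a single batch on the CPU is another way to get a more readable error message.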
> It seems like your image is RGBA. Can you convert it to RGB, or do you have to use RGBA? If the latter, try writing your own dataloader.
@abhishekkrthakur Thank you! This helped me solve the above error.
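For reference, the RGBA to RGB conversion suggested above can be sketched with Pillow. This assumes Pillow is installed and that dropping the alpha channel is acceptable for the dataset. `to_rgb` is a hypothetical helper name, and the in-memory image stands in for a real file.

```python
from PIL import Image


def to_rgb(image):
    """Return the image in RGB mode, discarding any alpha channel."""
    return image if image.mode == "RGB" else image.convert("RGB")


# Stand-in for a real 256x256 RGBA image loaded from disk.
rgba = Image.new("RGBA", (256, 256), (255, 0, 0, 128))
rgb = to_rgb(rgba)
print(rgb.mode, rgb.size)
```

With real files, the same helper can be applied to the result of `Image.open(path)` before the image is handed to the dataset.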
I am trying to use this package, and it is throwing the error below. I am using the same pipeline as the cassava leaf detection problem, but on a different dataset where the image size is (256, 256).
Could you please help?