Deci-AI / super-gradients

Easily train or fine-tune SOTA computer vision models with one open source training library. The home of Yolo-NAS.
https://www.supergradients.com
Apache License 2.0
4.54k stars, 496 forks

RuntimeError: Input type (torch.cuda.ByteTensor) and weight type (torch.cuda.HalfTensor) should be the same #1239

Closed adnankarimjs closed 1 year ago

adnankarimjs commented 1 year ago

### 🐛 Describe the bug

python3 app.py
The console stream is logged into /home/sacramentos/sg_logs/console.log
[2023-07-03 02:48:24] INFO - crash_tips_setup.py - Crash tips is enabled. You can set your environment variable to CRASH_HANDLER=FALSE to disable it
[2023-07-03 02:48:25] WARNING - __init__.py - Failed to import pytorch_quantization
[2023-07-03 02:48:25] WARNING - calibrator.py - Failed to import pytorch_quantization
[2023-07-03 02:48:25] WARNING - export.py - Failed to import pytorch_quantization
[2023-07-03 02:48:25] WARNING - selective_quantization_utils.py - Failed to import pytorch_quantization
Predicting Video: 0%| | 0/1502 [00:00<?, ?it/s][2023-07-03 02:48:28] INFO - pipelines.py - Fusing some of the model's layers. If this takes too much memory, you can deactivate it by setting fuse_model=False
Predicting Video: 0%| | 0/1502 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "/home/sacramentos/Desktop/TensorRT/app.py", line 14, in <module>
    yolo_nas_l.predict("hello.mp4").show()
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/models/detection_models/pp_yolo_e/pp_yolo_e.py", line 99, in predict
    return pipeline(images)  # type: ignore
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 94, in __call__
    return self.predict_video(inputs, batch_size)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 122, in predict_video
    return self._combine_image_prediction_to_video(result_generator, fps=fps, n_images=len(video_frames))
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 299, in _combine_image_prediction_to_video
    images_predictions = [image_predictions for image_predictions in tqdm(images_predictions, total=n_images, desc="Predicting Video")]
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 299, in <listcomp>
    images_predictions = [image_predictions for image_predictions in tqdm(images_predictions, total=n_images, desc="Predicting Video")]
  File "/home/sacramentos/.local/lib/python3.10/site-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 149, in _generate_prediction_result
    yield from self._generate_prediction_result_single_batch(batch_images)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 176, in _generate_prediction_result_single_batch
    model_output = self.model(torch_inputs)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/models/detection_models/pp_yolo_e/pp_yolo_e.py", line 120, in forward
    return self.head(features)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/models/detection_models/pp_yolo_e/pp_yolo_head.py", line 284, in forward
    return self.forward_eval(feats)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/models/detection_models/pp_yolo_e/pp_yolo_head.py", line 241, in forward_eval
    pred_bboxes = batch_distance2bbox(anchor_points_inference, reg_dist_reduced_list) * stride_tensor  # [B, Anchors, 4]
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/utils/bbox_utils.py", line 19, in batch_distance2bbox
    x1y1 = -lt + points
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

sacramentos@sacramentos-System-Product-Name:~/Desktop/TensorRT$ python3 app.py
The console stream is logged into /home/sacramentos/sg_logs/console.log
[2023-07-03 02:51:27] INFO - crash_tips_setup.py - Crash tips is enabled. You can set your environment variable to CRASH_HANDLER=FALSE to disable it
[2023-07-03 02:51:29] WARNING - __init__.py - Failed to import pytorch_quantization
[2023-07-03 02:51:29] WARNING - calibrator.py - Failed to import pytorch_quantization
[2023-07-03 02:51:29] WARNING - export.py - Failed to import pytorch_quantization
[2023-07-03 02:51:29] WARNING - selective_quantization_utils.py - Failed to import pytorch_quantization
[2023-07-03 02:51:30] INFO - pipelines.py - Fusing some of the model's layers. If this takes too much memory, you can deactivate it by setting fuse_model=False
Traceback (most recent call last):
  File "/home/sacramentos/Desktop/TensorRT/app.py", line 14, in <module>
    yolo_nas_l.predict("test.png").show()
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/models/detection_models/pp_yolo_e/pp_yolo_e.py", line 99, in predict
    return pipeline(images)  # type: ignore
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 96, in __call__
    return self.predict_images(inputs, batch_size)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 111, in predict_images
    return self._combine_image_prediction_to_images(result_generator, n_images=len(images))
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 290, in _combine_image_prediction_to_images
    images_predictions = [next(iter(images_predictions))]
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 149, in _generate_prediction_result
    yield from self._generate_prediction_result_single_batch(batch_images)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 176, in _generate_prediction_result_single_batch
    model_output = self.model(torch_inputs)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/models/detection_models/pp_yolo_e/pp_yolo_e.py", line 120, in forward
    return self.head(features)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/models/detection_models/pp_yolo_e/pp_yolo_head.py", line 284, in forward
    return self.forward_eval(feats)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/models/detection_models/pp_yolo_e/pp_yolo_head.py", line 241, in forward_eval
    pred_bboxes = batch_distance2bbox(anchor_points_inference, reg_dist_reduced_list) * stride_tensor  # [B, Anchors, 4]
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/utils/bbox_utils.py", line 19, in batch_distance2bbox
    x1y1 = -lt + points
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

sacramentos@sacramentos-System-Product-Name:~/Desktop/TensorRT$ python3 app.py
The console stream is logged into /home/sacramentos/sg_logs/console.log
[2023-07-03 02:51:39] INFO - crash_tips_setup.py - Crash tips is enabled. You can set your environment variable to CRASH_HANDLER=FALSE to disable it
[2023-07-03 02:51:40] WARNING - __init__.py - Failed to import pytorch_quantization
[2023-07-03 02:51:40] WARNING - calibrator.py - Failed to import pytorch_quantization
[2023-07-03 02:51:40] WARNING - export.py - Failed to import pytorch_quantization
[2023-07-03 02:51:40] WARNING - selective_quantization_utils.py - Failed to import pytorch_quantization
[2023-07-03 02:51:41] INFO - pipelines.py - Fusing some of the model's layers. If this takes too much memory, you can deactivate it by setting fuse_model=False
Traceback (most recent call last):
  File "/home/sacramentos/Desktop/TensorRT/app.py", line 14, in <module>
    yolo_nas_l.predict("test.png").show()
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/models/detection_models/pp_yolo_e/pp_yolo_e.py", line 99, in predict
    return pipeline(images)  # type: ignore
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 96, in __call__
    return self.predict_images(inputs, batch_size)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 111, in predict_images
    return self._combine_image_prediction_to_images(result_generator, n_images=len(images))
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 290, in _combine_image_prediction_to_images
    images_predictions = [next(iter(images_predictions))]
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 149, in _generate_prediction_result
    yield from self._generate_prediction_result_single_batch(batch_images)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 176, in _generate_prediction_result_single_batch
    model_output = self.model(torch_inputs)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/models/detection_models/pp_yolo_e/pp_yolo_e.py", line 120, in forward
    return self.head(features)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/models/detection_models/pp_yolo_e/pp_yolo_head.py", line 284, in forward
    return self.forward_eval(feats)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/models/detection_models/pp_yolo_e/pp_yolo_head.py", line 241, in forward_eval
    pred_bboxes = batch_distance2bbox(anchor_points_inference, reg_dist_reduced_list) * stride_tensor  # [B, Anchors, 4]
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/utils/bbox_utils.py", line 19, in batch_distance2bbox
    x1y1 = -lt + points
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

sacramentos@sacramentos-System-Product-Name:~/Desktop/TensorRT$ python3 app.py
The console stream is logged into /home/sacramentos/sg_logs/console.log
[2023-07-03 02:52:07] INFO - crash_tips_setup.py - Crash tips is enabled. You can set your environment variable to CRASH_HANDLER=FALSE to disable it
[2023-07-03 02:52:09] WARNING - __init__.py - Failed to import pytorch_quantization
[2023-07-03 02:52:09] WARNING - calibrator.py - Failed to import pytorch_quantization
[2023-07-03 02:52:09] WARNING - export.py - Failed to import pytorch_quantization
[2023-07-03 02:52:09] WARNING - selective_quantization_utils.py - Failed to import pytorch_quantization
[ WARN:0@1.690] global cap_v4l.cpp:982 open VIDEOIO(V4L2:/dev/video0): can't open camera by index
[ERROR:0@1.690] global obsensor_uvc_stream_channel.cpp:156 getStreamChannelGroup Camera index out of range
Traceback (most recent call last):
  File "/home/sacramentos/Desktop/TensorRT/app.py", line 14, in <module>
    yolo_nas_l.predict_webcam()
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/models/detection_models/pp_yolo_e/pp_yolo_e.py", line 110, in predict_webcam
    pipeline.predict_webcam()
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 132, in predict_webcam
    video_streaming = WebcamStreaming(frame_processing_fn=_draw_predictions, fps_update_frequency=1)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/utils/media/stream.py", line 33, in __init__
    raise ValueError("Could not open video capture device")
ValueError: Could not open video capture device

sacramentos@sacramentos-System-Product-Name:~/Desktop/TensorRT$ python3 app.py
The console stream is logged into /home/sacramentos/sg_logs/console.log
[2023-07-03 02:52:29] INFO - crash_tips_setup.py - Crash tips is enabled. You can set your environment variable to CRASH_HANDLER=FALSE to disable it
[2023-07-03 02:52:30] WARNING - __init__.py - Failed to import pytorch_quantization
[2023-07-03 02:52:30] WARNING - calibrator.py - Failed to import pytorch_quantization
[2023-07-03 02:52:30] WARNING - export.py - Failed to import pytorch_quantization
[2023-07-03 02:52:30] WARNING - selective_quantization_utils.py - Failed to import pytorch_quantization
[ WARN:0@1.689] global cap_v4l.cpp:982 open VIDEOIO(V4L2:/dev/video0): can't open camera by index
[ERROR:0@1.689] global obsensor_uvc_stream_channel.cpp:156 getStreamChannelGroup Camera index out of range
Traceback (most recent call last):
  File "/home/sacramentos/Desktop/TensorRT/app.py", line 14, in <module>
    yolo_nas_l.predict_webcam().show()
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/models/detection_models/pp_yolo_e/pp_yolo_e.py", line 110, in predict_webcam
    pipeline.predict_webcam()
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 132, in predict_webcam
    video_streaming = WebcamStreaming(frame_processing_fn=_draw_predictions, fps_update_frequency=1)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/utils/media/stream.py", line 33, in __init__
    raise ValueError("Could not open video capture device")
ValueError: Could not open video capture device

sacramentos@sacramentos-System-Product-Name:~/Desktop/TensorRT$ python3 app.py
The console stream is logged into /home/sacramentos/sg_logs/console.log
[2023-07-03 02:52:42] INFO - crash_tips_setup.py - Crash tips is enabled. You can set your environment variable to CRASH_HANDLER=FALSE to disable it
[2023-07-03 02:52:44] WARNING - __init__.py - Failed to import pytorch_quantization
[2023-07-03 02:52:44] WARNING - calibrator.py - Failed to import pytorch_quantization
[2023-07-03 02:52:44] WARNING - export.py - Failed to import pytorch_quantization
[2023-07-03 02:52:44] WARNING - selective_quantization_utils.py - Failed to import pytorch_quantization
[ WARN:0@1.697] global cap_v4l.cpp:982 open VIDEOIO(V4L2:/dev/video0): can't open camera by index
[ERROR:0@1.697] global obsensor_uvc_stream_channel.cpp:156 getStreamChannelGroup Camera index out of range
Traceback (most recent call last):
  File "/home/sacramentos/Desktop/TensorRT/app.py", line 13, in <module>
    yolo_nas_l.predict_webcam().show()
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/models/detection_models/pp_yolo_e/pp_yolo_e.py", line 110, in predict_webcam
    pipeline.predict_webcam()
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 132, in predict_webcam
    video_streaming = WebcamStreaming(frame_processing_fn=_draw_predictions, fps_update_frequency=1)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/utils/media/stream.py", line 33, in __init__
    raise ValueError("Could not open video capture device")
ValueError: Could not open video capture device

sacramentos@sacramentos-System-Product-Name:~/Desktop/TensorRT$ python3 app.py
The console stream is logged into /home/sacramentos/sg_logs/console.log
[2023-07-03 02:55:25] INFO - crash_tips_setup.py - Crash tips is enabled. You can set your environment variable to CRASH_HANDLER=FALSE to disable it
[2023-07-03 02:55:27] WARNING - __init__.py - Failed to import pytorch_quantization
[2023-07-03 02:55:27] WARNING - calibrator.py - Failed to import pytorch_quantization
[2023-07-03 02:55:27] WARNING - export.py - Failed to import pytorch_quantization
[2023-07-03 02:55:27] WARNING - selective_quantization_utils.py - Failed to import pytorch_quantization
Downloading: "https://sghub.deci.ai/models/yolox_t_coco.pth" to /home/sacramentos/.cache/torch/hub/checkpoints/yolox_t_coco.pth
100%|████████████████████| 58.4M/58.4M [00:36<00:00, 1.70MB/s]
[2023-07-03 02:56:05] INFO - pipelines.py - Fusing some of the model's layers. If this takes too much memory, you can deactivate it by setting fuse_model=False
Traceback (most recent call last):
  File "/home/sacramentos/Desktop/TensorRT/app.py", line 14, in <module>
    yolo_nas_l.predict("test.png").show()
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/models/detection_models/yolo_base.py", line 488, in predict
    return pipeline(images)  # type: ignore
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 96, in __call__
    return self.predict_images(inputs, batch_size)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 111, in predict_images
    return self._combine_image_prediction_to_images(result_generator, n_images=len(images))
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 290, in _combine_image_prediction_to_images
    images_predictions = [next(iter(images_predictions))]
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 149, in _generate_prediction_result
    yield from self._generate_prediction_result_single_batch(batch_images)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/pipelines/pipelines.py", line 176, in _generate_prediction_result_single_batch
    model_output = self.model(torch_inputs)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/models/detection_models/yolo_base.py", line 507, in forward
    out = self._backbone(x)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/models/detection_models/yolo_base.py", line 258, in forward
    return AbstractYoloBackbone.forward(self, x)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/training/models/detection_models/yolo_base.py", line 239, in forward
    x = layer_module(x)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/super_gradients/modules/conv_bn_act_block.py", line 84, in forward
    return self.act(self.bn(self.conv(x)))
  File "/home/sacramentos/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/sacramentos/.local/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (torch.cuda.ByteTensor) and weight type (torch.cuda.HalfTensor) should be the same


```python
import torch
from super_gradients.training import models

# Use the GPU when one is available, otherwise fall back to the CPU
device = 'cuda' if torch.cuda.is_available() else 'cpu'

if device == 'cuda':
    torch.cuda.set_device(0)  # Set the GPU device you want to use, e.g., 0

# Note: the variable is named yolo_nas_l, but the model loaded here is "yolox_t"
yolo_nas_l = models.get("yolox_t", pretrained_weights="coco")
yolo_nas_l = yolo_nas_l.to(device)

# Assuming "test.png" is the correct path to your image file
yolo_nas_l.predict("test.png").show()
```
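The last error in the log says that a convolution received a `uint8` (Byte) input while its weights were half precision. The sketch below reproduces the same class of mismatch in plain PyTorch, on the CPU and with `float32` weights so that it runs without a GPU. It is only an illustration of the error type and of casting the input to the weight dtype. It is not the fix that was applied inside super-gradients, and the layer sizes and variable names are arbitrary assumptions.

```python
import torch

# A small convolution whose weights are floating point (float32 by default).
conv = torch.nn.Conv2d(in_channels=3, out_channels=4, kernel_size=3)

# An image-like batch stored as uint8, as raw decoded images usually are.
image = torch.randint(0, 256, (1, 3, 8, 8), dtype=torch.uint8)

# Feeding the uint8 tensor directly raises a RuntimeError about mismatched types.
try:
    conv(image)
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print(f"dtype mismatch: {err}")

# Casting the input to the dtype of the weights removes the mismatch.
fixed_input = image.to(dtype=conv.weight.dtype)
output = conv(fixed_input)
print(output.dtype, tuple(output.shape))
```

Comparing `tensor.dtype` and `tensor.device` of the input against those of `next(model.parameters())` is a quick way to see which side of a mismatch like this is unexpected.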

### Versions

Collecting environment information...
PyTorch version: 2.0.1+cu118
Is debug build: False
CUDA used to build PyTorch: 11.8
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.2 LTS (x86_64)
GCC version: (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0
Clang version: Could not collect
CMake version: version 3.25.0
Libc version: glibc-2.35

Python version: 3.10.6 (main, May 29 2023, 11:10:38) [GCC 11.3.0] (64-bit runtime)
Python platform: Linux-5.19.0-45-generic-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3080
Nvidia driver version: 530.41.03
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.9.2
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.9.2
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.9.2
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.9.2
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.9.2
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.9.2
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.9.2
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libcudnn.so.8
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libcudnn_adv_infer.so.8
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libcudnn_adv_train.so.8
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libcudnn_cnn_train.so.8
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libcudnn_ops_infer.so.8
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libcudnn_ops_train.so.8
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Address sizes:                   48 bits physical, 48 bits virtual
Byte Order:                      Little Endian
CPU(s):                          16
On-line CPU(s) list:             0-15
Vendor ID:                       AuthenticAMD
Model name:                      AMD Ryzen 7 5800X 8-Core Processor
CPU family:                      25
Model:                           33
Thread(s) per core:              2
Core(s) per socket:              8
Socket(s):                       1
Stepping:                        0
Frequency boost:                 enabled
CPU max MHz:                     4850.1948
CPU min MHz:                     2200.0000
BogoMIPS:                        7585.88
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm
Virtualization:                  AMD-V
L1d cache:                       256 KiB (8 instances)
L1i cache:                       256 KiB (8 instances)
L2 cache:                        4 MiB (8 instances)
L3 cache:                        32 MiB (1 instance)
NUMA node(s):                    1
NUMA node0 CPU(s):               0-15
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Mmio stale data:   Not affected
Vulnerability Retbleed:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP always-on, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected

Versions of relevant libraries:
[pip3] numpy==1.23.0
[pip3] torch==2.0.1+cu118
[pip3] torchaudio==2.0.2+cu118
[pip3] torchmetrics==0.8.0
[pip3] torchvision==0.15.2+cu118
[pip3] triton==2.0.0
[conda] Could not collect

mlampros commented 1 year ago

I received the same error:

RuntimeError: Input type (torch.cuda.ByteTensor) and weight type (torch.cuda.HalfTensor) should be the same

By installing version 3.1.1 I was able to run my super-gradients code snippet:

pip install super-gradients==3.1.1

I assume it is related to one of the latest changes in version 3.1.2.
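As a rough aid, the version range discussed in this thread can be written as a small check. This is a sketch under assumptions: the lower bound (3.1.2) is the commenter's guess above, the upper bound (3.2.0) comes from the maintainer's closing comment, and other commenters report results that do not fit this range neatly. The helper names `parse_version` and `is_in_reported_range` are hypothetical and not part of super-gradients.

```python
import re

# Bounds taken from comments in this thread; treat them as assumptions.
FIRST_REPORTED_AFFECTED = (3, 1, 2)
FIRST_REPORTED_FIXED = (3, 2, 0)


def parse_version(text: str) -> tuple:
    """Turn a string such as '3.1.2' or '2.0.1+cu118' into (major, minor, patch)."""
    core = text.split("+")[0]  # drop a local suffix such as '+cu118'
    numbers = re.findall(r"\d+", core)[:3]
    return tuple(int(n) for n in numbers)


def is_in_reported_range(version_text: str) -> bool:
    """True if the version lies between the reported first-bad and first-fixed releases."""
    version = parse_version(version_text)
    return FIRST_REPORTED_AFFECTED <= version < FIRST_REPORTED_FIXED


for candidate in ("3.1.1", "3.1.2", "3.1.3", "3.2.0"):
    print(candidate, is_in_reported_range(candidate))
```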

Phyrokar commented 1 year ago

I have the same issue; hopefully it gets fixed soon.

harvestingmoon commented 1 year ago

As @mlampros said, try downgrading to

super-gradients==3.1.1

This should be a temporary workaround until the bug gets fixed.

Phyrokar commented 1 year ago

Thanks, but there you have the slow iteration bug.

Anyway, I found a solution: the weights are upwards compatible, but if you want to run inference with e.g. 3.1.3, you have to train with the same version as well.

harvestingmoon commented 1 year ago

That is to say, I have to train the model on

super-gradients==3.1.3

as well? I trained the model and ran inference with the same version, and I still encountered the same error.

Phyrokar commented 1 year ago

Hmm, OK, in my case it worked. But I think there may be some problems anyway, since a few other related bugs have not been solved yet.

omaiyiwa commented 1 year ago

I changed super-gradients from 3.1.2 to 3.1.1, but it doesn't seem to take effect.
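When a downgrade does not seem to take effect, one thing worth ruling out is that the interpreter running the script sees a different installation than the one `pip` modified. The sketch below uses only the standard library's `importlib.metadata` to report the version visible to the current interpreter. The helper name `installed_version` is hypothetical, and this is a diagnostic aid under that assumption, not a confirmed explanation for the report above.

```python
import sys
from importlib import metadata


def installed_version(dist_name: str = "super-gradients"):
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Show which interpreter is running and which version it can see.
print("interpreter:", sys.executable)
print("super-gradients:", installed_version())
```

Running this with the same `python3` that launches `app.py` shows whether the version matches the one requested with `pip install super-gradients==3.1.1`.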

mlampros commented 1 year ago

There were commits addressing this issue in the last few days. Using the GitHub version, I was able to resolve my issue, i.e.

git clone https://github.com/Deci-AI/super-gradients.git
cd super-gradients
pip3 install .

BloodAxe commented 1 year ago

Fixed in https://github.com/Deci-AI/super-gradients/pull/1281 and will be available in 3.2.0.