Deci-AI / super-gradients

Easily train or fine-tune SOTA computer vision models with one open source training library. The home of Yolo-NAS.
https://www.supergradients.com
Apache License 2.0
4.58k stars 506 forks

Regarding the disparity observed between the results shown in the 'yolo_nas_custom_dataset_fine_tuning_with_qat' Colab notebook and the output derived during training on the local system #1723

Open Sumeshbaba opened 10 months ago

Sumeshbaba commented 10 months ago

💡 Your Question

First off, thanks for this project, works great for general object detection problems!

My question is regarding the results shown on the getting started Google Colab notebook titled 'Quantization Aware Training YOLONAS on Custom Dataset'.

I downloaded this notebook as an .ipynb file and ran it on my local system without changing any parameters.

The results for the regular yolo_nas_s after training are almost identical to those in the Colab notebook.

But after QAT, the results differ significantly from those in the Colab notebook. Could you let me know where I am going wrong? Thank you.

I've attached screenshots of the same.

[Screenshots: MytrainingYOLO_NAS_S_comparison, Colab_YOLO_NAS_S_comparison, MytrainingYOLO_NAS_S_qat, Colab_YOLO_NAS_S_qat, Colab_YOLO_NAS_S, MytrainingYOLO_NAS_S]

Versions

PyTorch version: 2.0.1+cu118
Is debug build: False
CUDA used to build PyTorch: 11.8
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 11 Home
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: N/A

Python version: 3.8.2 (default, May 6 2020, 09:02:42) [MSC v.1916 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.22621-SP0
Is CUDA available: True
CUDA runtime version: 12.3.103
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3060
Nvidia driver version: 546.12
cuDNN version: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin\cudnn_ops_train64_8.dll
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture=9
CurrentClockSpeed=2500
DeviceID=CPU0
Family=205
L2CacheSize=7680
L2CacheSpeed=
Manufacturer=GenuineIntel
MaxClockSpeed=2500
Name=12th Gen Intel(R) Core(TM) i5-12400
ProcessorType=3
Revision=

Versions of relevant libraries:
[pip3] numpy==1.23.0
[pip3] pytorch-quantization==2.1.2
[pip3] torch==2.0.1+cu118
[pip3] torchaudio==2.0.2+cu118
[pip3] torchmetrics==0.8.0
[pip3] torchvision==0.15.2+cu118
[conda] libblas 3.9.0 20_win64_mkl conda-forge
[conda] libcblas 3.9.0 20_win64_mkl conda-forge
[conda] liblapack 3.9.0 20_win64_mkl conda-forge
[conda] mkl 2023.2.0 h6a75c08_50497 conda-forge
[conda] numpy 1.23.0 pypi_0 pypi
[conda] pytorch-quantization 2.1.2 pypi_0 pypi
[conda] torch 2.0.1+cu118 pypi_0 pypi
[conda] torchaudio 2.0.2+cu118 pypi_0 pypi
[conda] torchmetrics 0.8.0 pypi_0 pypi
[conda] torchvision 0.15.2+cu118 pypi_0 pypi
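
When a local run and a Colab run of the same notebook disagree, one possible first check is whether the two environments have the same package versions installed. The sketch below is only an illustration of that check, not a diagnosis of this issue: it diffs two `pip freeze`-style listings using the standard library. The helper names `parse_freeze` and `diff_versions` are hypothetical, the `LOCAL` listing is copied from the versions above, and the `OTHER` listing holds placeholder values because the Colab environment's versions are not given in this thread.

```python
def parse_freeze(text):
    """Parse 'name==version' lines (pip freeze style) into a dict."""
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not a pinned requirement.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        versions[name.strip().lower()] = version.strip()
    return versions


def diff_versions(local, other):
    """Return {package: (local_version, other_version)} for every mismatch.

    A package missing from one side is reported with None on that side.
    """
    mismatches = {}
    for name in sorted(set(local) | set(other)):
        a, b = local.get(name), other.get(name)
        if a != b:
            mismatches[name] = (a, b)
    return mismatches


# Local versions as reported in this issue.
LOCAL = """
numpy==1.23.0
pytorch-quantization==2.1.2
torch==2.0.1+cu118
torchmetrics==0.8.0
torchvision==0.15.2+cu118
"""

# PLACEHOLDER values only: replace with the output of `pip freeze`
# from the other environment. These are NOT the real Colab versions.
OTHER = """
numpy==1.23.0
pytorch-quantization==0.0.0
torch==0.0.0
"""

if __name__ == "__main__":
    result = diff_versions(parse_freeze(LOCAL), parse_freeze(OTHER))
    for package, (local_v, other_v) in result.items():
        print(f"{package}: local={local_v} other={other_v}")
```

To use it, paste the output of `pip freeze` from each environment into the two strings. Matching versions do not rule out other sources of variation, such as random seeds or GPU differences.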

Sumeshbaba commented 10 months ago

@BloodAxe, any update or help regarding this issue?

Thank you.

BloodAxe commented 9 months ago

Nope, I haven't had time to investigate yet.

Sumeshbaba commented 9 months ago

I see. It would be great if you could look into it soon and update me!

Thanks, and have a great day.