Open ichu-apl opened 8 months ago
This can be fixed by just running the required function, but that requirement is undocumented. Either the requirement should be dropped, or the note about the pretrained_weights + checkpoint conflict should be removed (it doesn't look like it results in undefined behavior; the code comments seem to expect and account for it).
processing_params = get_pretrained_processing_params(model_type, "coco")
self.net.set_dataset_processing_params(**processing_params)
Wait, there isn't a conflict after all: running models.get() with both checkpoint_path and pretrained_weights solves the issue. This note in https://github.com/Deci-AI/super-gradients/blob/master/src/super_gradients/training/models/model_factory.py#L225
should be removed:
NOTE: Passing pretrained_weights and checkpoint_path is ill-defined and will raise an error.
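The workaround described above can be sketched like this (the architecture name, class count, and path are placeholders; the import is guarded so the sketch is safe to run without super-gradients installed):

```python
# Hedged sketch of the workaround: pass BOTH checkpoint_path and
# pretrained_weights so models.get() loads the fine-tuned weights while
# still attaching the COCO processing params that predict() needs.
try:
    from super_gradients.training import models
except ImportError:  # super-gradients not installed in this environment
    models = None

def load_finetuned(best_weights_path: str):
    if models is None:
        return None
    return models.get(
        "yolo_nas_s",                       # placeholder architecture name
        num_classes=80,                     # placeholder; match your training
        checkpoint_path=best_weights_path,  # your fine-tuned weights
        pretrained_weights="coco",          # supplies the processing metadata
    )
```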
@BloodAxe this is the weirdest API I've seen for a predict method. It's not even finetuning-friendly.
When exporting with preprocessing=False
, it doesn't just remove the preprocessing, it also skips NMS, which is a really weird choice since NMS is postprocessing.
ModelHasNoPreprocessingParamsException
Moreover, you get this error asking for preprocessing when the documentation doesn't point to anything.
RuntimeError: You must set the dataset processing parameters before calling predict.
Please call `model.set_dataset_processing_params(...)` first.
So...
checkpoint_path=best_weights_path, pretrained_weights='coco'
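Alternatively, the processing params can be set explicitly, as the error message suggests. A hedged sketch follows; the helper's import path is an assumption and may vary between super-gradients versions, so it is guarded:

```python
# Hedged sketch: attach COCO processing params manually so predict() works
# on a fine-tuned model. The import path below is an assumption.
try:
    from super_gradients.training.models.model_factory import (
        get_pretrained_processing_params,
    )
except ImportError:
    get_pretrained_processing_params = None

def attach_coco_params(model, model_type: str = "yolo_nas_s"):
    if get_pretrained_processing_params is None:
        return model
    params = get_pretrained_processing_params(model_type, "coco")
    model.set_dataset_processing_params(**params)
    return model
```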
@james-imi I'm not sure I'm following your complaints.
Indeed, to use model.predict
the model must know which image preprocessing steps to perform in order to apply the same image resizing/padding/normalization as during training. How does the model know that? It tries to pull this meta-information from the model checkpoint. With this information missing, predict()
cannot work. I hope it's all clear up to this point.
How does the preprocessing meta-information appear in the checkpoint? During model training. It is extracted from the validation dataset's transforms and saved as additional metadata next to the model weights. model.set_dataset_processing_params(...)
is part of the internal API and does exactly that.
Under normal circumstances this works under the hood and one does not need any additional actions apart from the normal model train / model.get
step.
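The flow described above can be illustrated with a minimal pure-Python sketch. The key names below are hypothetical and not super-gradients' actual checkpoint schema; the point is only that the processing params ride along with the weights and are restored before predict():

```python
# Hypothetical checkpoint layout: processing params saved next to weights
# during training, pulled back out at load time so predict() can apply the
# same resizing/normalization the validation transforms used.
checkpoint = {
    "net": {"...": "model weights would go here"},
    "processing_params": {  # hypothetical: extracted from val transforms
        "image_processor": "DetectionLongestMaxSizeRescale",
        "class_names": ["person", "car"],
        "iou": 0.65,
        "conf": 0.5,
    },
}

def restore_processing(ckpt):
    # predict() can only work if this metadata is present in the checkpoint;
    # returns None when it is missing (the situation this issue is about).
    return ckpt.get("processing_params")

params = restore_processing(checkpoint)
```

When training on a custom dataset, the claim in this thread is that this metadata ends up missing, which is exactly the `restore_processing({}) is None` case.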
Throwing in random pieces of code rarely helps to address your issues. It is important to provide as much information as possible, including the SG version you are using and a code snippet that we can use to reproduce the issue.
You can check best practices on training model, export it and using predict() by checking our example notebooks:
It tries to pull this meta-information from the model checkpoint. With this information missing, predict() cannot work. I hope it's all clear to this point.
I'm pretty sure the earlier complaint by another user is the SAME point. Fine-tuned models do not have this information if you are using a custom dataset, hence it cannot be pulled out.
During model training. It is extracted from the validation dataset's transforms and saved as additional metadata next to the model weights. model.set_dataset_processing_params(...) is part of the internal API and does exactly that.
Yes. And that was the same point. With custom datasets, it is not getting SAVED as additional metadata, hence the issue was raised. Maybe there's a BUG in the code?
With that in mind, your documentation says that using preprocessing=False also removes NMS, which is very confusing since NMS is a postprocessing technique. So I am not sure whether preprocessing=False removes only the preprocessing steps (resizing, etc.) or also NMS when exporting to ONNX.
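For reference, a hedged sketch of the export call under discussion. The assumption here (to be verified against your installed SG version) is that export() exposes separate preprocessing and postprocessing switches, so NMS could be kept while baked-in preprocessing is dropped; the import is guarded so the sketch runs without the library:

```python
# Hedged sketch: keep NMS (postprocessing) while dropping the baked-in
# preprocessing when exporting to ONNX. Flag semantics are an assumption.
try:
    from super_gradients.training import models
except ImportError:
    models = None

def export_without_preprocessing(onnx_path: str = "model.onnx"):
    if models is None:
        return None
    model = models.get("yolo_nas_s", pretrained_weights="coco")
    # preprocessing would control input resizing/normalization in the graph;
    # postprocessing would control NMS.
    return model.export(onnx_path, preprocessing=False, postprocessing=True)
```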
🐛 Describe the bug
I tried to load from a checkpoint using the file initially downloaded by the models.get() call. It looks like there's a leftover TODO in the repo code that's a reminder to remove the set_dataset_processing_params() requirement for models loaded from checkpoint.
Minimal Example
Error Traceback:
Versions
Collecting environment information...
PyTorch version: 2.2.1+cpu
Is debug build: False
CUDA used to build PyTorch: Could not collect
ROCM used to build PyTorch: N/A
OS: Microsoft Windows 10 Enterprise
GCC version: (Rev10, Built by MSYS2 project) 12.2.0
Clang version: Could not collect
CMake version: version 3.24.0-rc3
Libc version: N/A
Python version: 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.19045-SP0
Is CUDA available: False
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: GPU 0: NVIDIA GeForce MX450
Nvidia driver version: 511.99
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU: Architecture=9, CurrentClockSpeed=2496, DeviceID=CPU0, Family=198, L2CacheSize=10240, L2CacheSpeed=, Manufacturer=GenuineIntel, MaxClockSpeed=2496, Name=11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz, ProcessorType=3, Revision=
Versions of relevant libraries:
[pip3] numpy==1.23.0
[pip3] onnx==1.13.0
[pip3] onnxruntime==1.13.1
[pip3] onnxruntime-gpu==1.17.1
[pip3] onnxsim==0.4.36
[pip3] pytorch-lightning==2.2.1
[pip3] torch==2.2.1
[pip3] torchdata==0.7.1
[pip3] torchmetrics==0.8.0
[pip3] torchtext==0.17.1
[pip3] torchvision==0.17.1
[conda] Could not collect