Deci-AI / super-gradients

Easily train or fine-tune SOTA computer vision models with one open source training library. The home of Yolo-NAS.
https://www.supergradients.com
Apache License 2.0
4.59k stars 510 forks

Is it possible to increase the input size of the architecture? #966

Closed MehmetOKUYAR closed 1 year ago

MehmetOKUYAR commented 1 year ago

Hi, I am training Yolo NAS model. However, the input size of this architecture is 640 x 640, which causes pixel loss while working with high-resolution images. Therefore, I want to increase the input size of the architecture to 1280 x 1280. Is this possible, and how can I integrate it into the training process? Is there anyone who can help me with this? Thank you in advance.

NatanBagrov commented 1 year ago

Hello @MehmetOKUYAR, this is easy to achieve by modifying the dataset parameters. You can see a reference in the default COCO dataset params. Please note that you should also change the input size for validation, which is found a bit further down in the same file.
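For reference, the relevant entries in the COCO detection dataset params YAML look roughly like this (a sketch only; exact key and transform names may differ between super-gradients versions, so check the file the maintainer linked):

```yaml
# Sketch of dataset_params/coco_detection_dataset_params.yaml
# (key names may vary across super-gradients versions)
train_dataset_params:
  input_dim: [1280, 1280]          # was [640, 640]
  transforms:
    - DetectionMosaic:
        input_dim: [1280, 1280]    # augmentations carry their own size
    - DetectionPaddedRescale:
        input_dim: [1280, 1280]

val_dataset_params:
  input_dim: [1280, 1280]          # remember to change validation too
```

Note that several transforms take their own `input_dim`, so changing only the top-level size while leaving the transforms at 640x640 will silently undo the change.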

See more info in the official docs.

Let me know if that helps!

I would also add that you should tune the augmentations as well, and some heavier lifting may be needed on the model architecture itself (e.g. start detecting from stride 16 rather than 8) if you want to regain some speed, since inference will be slower at the higher input size.
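Since the size appears in the train params, the val params, and each augmentation, it is easy to miss one. A minimal pure-Python sketch of keeping them in sync with a single override (the helper name and the dict layout are mine, not super-gradients API):

```python
def override_input_dim(params, new_dim):
    """Recursively replace every 'input_dim' entry in a nested
    dataset-params structure (dicts and lists) with new_dim."""
    if isinstance(params, dict):
        return {
            k: list(new_dim) if k == "input_dim" else override_input_dim(v, new_dim)
            for k, v in params.items()
        }
    if isinstance(params, list):
        return [override_input_dim(v, new_dim) for v in params]
    return params


# A toy params structure mirroring the YAML layout discussed above
dataset_params = {
    "train_dataset_params": {
        "input_dim": [640, 640],
        "transforms": [
            {"DetectionMosaic": {"input_dim": [640, 640]}},
            {"DetectionPaddedRescale": {"input_dim": [640, 640]}},
        ],
    },
    "val_dataset_params": {"input_dim": [640, 640]},
}

updated = override_input_dim(dataset_params, (1280, 1280))
print(updated["train_dataset_params"]["input_dim"])  # [1280, 1280]
print(updated["val_dataset_params"]["input_dim"])    # [1280, 1280]
```

The same idea works whether you edit the YAML recipe or pass a `dataset_params` override programmatically: the point is to change every occurrence of the size, not just the first one.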

MehmetOKUYAR commented 1 year ago

Hi @NatanBagrov, okay, I understand; that's why I asked. I have a dataset on which I achieved 98% accuracy using the yolov7-e6e architecture, but I couldn't surpass 85% with this model, probably due to the input size. Still, I'm excited, because I've seen significant improvements in detection with YOLO-NAS in comparisons shared online.

NatanBagrov commented 1 year ago

Great, please close this issue once you succeed in loading your custom-size data :)

hytel commented 1 year ago

I too am curious. I would like to increase the input from 640x640 to 768x768. I have changed the dataset parameters mentioned above and I'm training now, but it doesn't look like the model grows (e.g. training at 640x640 vs. 768x768 consumes the same amount of GPU memory with the same batch size). I'm not worried about the final runtime performance; I'm trying to increase the accuracy of the model by giving it a larger area to see for better context (my images are larger, and I will need to slide a "window" of inference over them). What file/params should I change so that the actual YOLO-NAS L model takes advantage of a larger input dimension for inference?
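An aside on the memory observation (my reading, not from the maintainers): YOLO-NAS, like most convolutional detectors, is fully convolutional, so its weight count does not depend on input resolution; only activation memory grows with it. A quick back-of-the-envelope sketch:

```python
def conv_param_count(c_in, c_out, k):
    """Parameters in one conv layer (weights + biases).
    Independent of the input H x W."""
    return k * k * c_in * c_out + c_out


def activation_elements(c, h, w):
    """Elements in one feature map: scales with resolution."""
    return c * h * w


# Same layer, two input sizes: identical parameter count
p640 = conv_param_count(3, 64, 3)
p768 = conv_param_count(3, 64, 3)
print(p640 == p768)  # True: the model itself does not grow

# ...but activation memory scales with area: (768/640)**2 = 1.44x
ratio = activation_elements(64, 768, 768) / activation_elements(64, 640, 640)
print(ratio)  # 1.44
```

So the model's size staying the same is expected, but total GPU memory should still rise roughly with the area ratio (~1.44x here); if it is completely unchanged, one plausible explanation is that the new input size is not actually being applied to the dataloader.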

leandrotorrent commented 1 year ago

Hi everyone, any news on this? I still haven't been able to change the input image dimensions.