Deci-AI / super-gradients

Easily train or fine-tune SOTA computer vision models with one open source training library. The home of Yolo-NAS.
https://www.supergradients.com
Apache License 2.0
4.54k stars 496 forks source link

API inconsistency between image and video prediction #1059

Closed laszlovandenhoek closed 1 year ago

laszlovandenhoek commented 1 year ago

🚀 Feature Request

If I run predict() on an image, as per the YOLO-NAS quickstart, it yields an ImagesDetectionPrediction, which has a save() method that takes an output_folder argument:

https://github.com/Deci-AI/super-gradients/blob/a59449726e1d55f68c96b7be24f8a00a126b59e0/src/super_gradients/training/models/prediction_results.py#L182-L184

However, if I run that same predict() method on a video, it yields a VideoDetectionPrediction, whose save() method takes an output_path argument:

https://github.com/Deci-AI/super-gradients/blob/a59449726e1d55f68c96b7be24f8a00a126b59e0/src/super_gradients/training/models/prediction_results.py#L237

Considering that they mean the same thing, it was confusing to me that the method arguments are named differently, especially since it's a required parameter without a default value.

Proposed Solution (Optional)

The reason I would suggest replacing the save() method on VideoDetectionPrediction instead of the one on ImagesDetectionPrediction is that the quickstart documentation is based on an image and therefore uses the output_path form.

dagshub[bot] commented 1 year ago

Join the discussion on DagsHub!

Louis-Dupont commented 1 year ago

Hi @laszlovandenhoek , thanks for your feedback

To add some context, the reason why both have different name is because they refer to something slightly different.

Motivation for current implementation

The motivation is that when saving a single object (video or image), we can let the user set the name (with output_path) while when working with a set of images, each image needs to have a different name, and therefore we chose to let the use chose the folder only (with output_folder). Eventually, we do want to improve the logic of multiple images, because currently, the user has no control over the name of each image (i.e. f"pred_{i}.jpg).

Potential Solution

Concerning your solution, I am not sure if taking out the possibility of the user to choose the output video/single image name is a good thing. That being said, I totally agree that homogenizing the API would be great.

Maybe the right approach would be something similar to this:

ImagesDetectionPrediction(...).save(output_folder: str, ...)
ImageDetectionPrediction(...).save(output_folder: str, output_name: str = "predicted_image.jpeg", ...)
VideoDetectionPrediction(...).save(output_folder: str, output_name: str = "predicted_video.mp4", ...)

What do you think?

laszlovandenhoek commented 1 year ago

Fair enough. While we're changing method signatures, I think it would make sense to use this opportunity to also add the possibility to customize the image name in the case of ImagesDetectionPrediction. I'm no Python wizard, but perhaps something like this could work:

from typing import Callable
...
ImagesDetectionPrediction(...).save(output_folder: str, file_naming_strategy: Callable[[int], str] = lambda i: f"pred_{i}.jpg", ...)

That would maintain backwards compatibility while providing customizability going forward.

Louis-Dupont commented 1 year ago

Update; If you want to save each image with a clear and custom name, the recommended way is to simply iterate over the predictions, and save each with whatever name you want

predictions = model.predict(IMAGES)

for i, prediction in enumerate(predictions):
    name = ...  # Define the name the way you want. e.g. `name = f"my_custom_name_{i}"`
    prediction.save(output_path=name)

(As opposed to the quick way)

predictions = model.predict(IMAGES)
predictions.show()
predictions.save(output_folder="")  # Save in working directory