pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

add typing to torchvision.models.detection.faster_rcnn #8600

Open scm-aiml opened 4 weeks ago

scm-aiml commented 4 weeks ago

🚀 The feature

In support of #2025, add type hinting to torchvision/models/detection/faster_rcnn

Motivation, pitch

In an effort to get type hinting throughout torchvision, I wanted to start contributing small where I could.

Alternatives

Not needed

Additional context

No response

scm-aiml commented 4 weeks ago

I'm more than happy to follow the contribution guide (already started), but wasn't sure if I should open an issue or not.

One issue I did find is an inconsistency between the types used for the image mean and standard deviation. Specifically, in FasterRCNN the typing uses Tuple[float, float, float], but when the values are not provided, the code falls back to hard-coded defaults and assigns a list:

        if image_mean is None:
            image_mean = [0.485, 0.456, 0.406]
        if image_std is None:
            image_std = [0.229, 0.224, 0.225]
        transform = GeneralizedRCNNTransform(min_size, max_size, image_mean, image_std, **kwargs)

GeneralizedRCNNTransform expects a List.

My recommendation is to use Tuple[float, float, float] here: it is more precise when exactly 3 channel values are expected, and there is no apparent reason to modify those values inside the function.
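
To make the mismatch concrete, here is a minimal sketch of one way the two types could be reconciled. It is not torchvision code: the `RGB` alias, the `DEFAULT_MEAN` / `DEFAULT_STD` constants, and the `resolve_normalization` helper are hypothetical names introduced only for illustration. It assumes the public arguments stay typed as 3-tuples and are converted to lists only at the point where a List-typed consumer such as GeneralizedRCNNTransform needs them.

```python
from typing import List, Optional, Tuple

# Hypothetical alias for a per-channel triple; not part of torchvision.
RGB = Tuple[float, float, float]

# Default values copied from the snippet above, stored as immutable tuples.
DEFAULT_MEAN: RGB = (0.485, 0.456, 0.406)
DEFAULT_STD: RGB = (0.229, 0.224, 0.225)


def resolve_normalization(
    image_mean: Optional[RGB] = None,
    image_std: Optional[RGB] = None,
) -> Tuple[List[float], List[float]]:
    """Hypothetical helper: fill in defaults, then convert to lists.

    The caller-facing type stays a 3-tuple, and the conversion to List[float]
    happens once, right before the values are handed to a List-typed consumer.
    """
    mean = DEFAULT_MEAN if image_mean is None else image_mean
    std = DEFAULT_STD if image_std is None else image_std
    return list(mean), list(std)


# Example: defaults are applied when nothing is passed.
mean, std = resolve_normalization()
```

With this shape a type checker sees a Tuple on the way in and a List on the way out, so neither the defaults nor the downstream call conflicts with its annotation. The alternative is to widen the annotation on the consumer side (for example to a read-only sequence type), which would need a change in the transform itself.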

NicolasHug commented 3 weeks ago

Hi @scm-aiml, thank you for opening this issue. As I replied on https://github.com/pytorch/vision/issues/2025, we're not planning on adding more type annotations to torchvision, sorry.