Open WilliamRoyNelson opened 1 year ago
Hi @WilliamRoyNelson, thanks for the report.
We'll be happy to use https links once the original source (http://phototour.cs.washington.edu/patches/default.htm) supports them. In the meantime, perhaps we can at least document in the docstring that http is being used. Happy to consider a PR.
🚀 The feature
Remove phototour.py to eliminate dependency on datasets hosted using HTTP instead of HTTPS.
https://github.com/pytorch/vision/blob/70a8e05a98ea8e32b98e5a09d22ab81dd3062234/torchvision/datasets/phototour.py#L37-L60
This vulnerability has been publicly known since 2020: https://github.com/418sec/huntr/pull/702
Motivation, pitch
phototour.py downloads its datasets over HTTP (not HTTPS), so the downloads are vulnerable to man-in-the-middle (MITM) attacks.
It may seem like a minor issue, but as tools like PyTorch become widely adopted in industry, they become subject to strict security and regulatory policies. It's hard to justify allowing an easily exploitable vulnerability within a highly regulated environment.
As far back as 2018, the Chrome browser began marking websites that do not use https as "Not Secure": https://blog.chromium.org/2018/02/a-secure-web-is-here-to-stay.html. This includes the website referenced in phototour.py.
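To make the concern concrete, here is a minimal sketch of two checks a downloader could apply: refusing non-HTTPS URLs, and comparing a downloaded payload against a pinned SHA-256 digest. The helper names `is_https` and `sha256_matches` are hypothetical and are not part of torchvision; the payload is a stand-in, and nothing here touches the network. A pinned digest only helps if it was itself obtained over a trusted channel.

```python
import hashlib
import hmac
from urllib.parse import urlparse


def is_https(url: str) -> bool:
    """Return True only if the URL uses the https scheme (hypothetical helper)."""
    return urlparse(url).scheme.lower() == "https"


def sha256_matches(payload: bytes, expected_hex: str) -> bool:
    """Compare the SHA-256 digest of payload with a pinned hex digest (hypothetical helper)."""
    actual = hashlib.sha256(payload).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual, expected_hex.lower())


if __name__ == "__main__":
    # URL taken from the issue; it uses plain http, so the scheme check rejects it.
    source = "http://phototour.cs.washington.edu/patches/default.htm"
    print("https source:", is_https(source))

    # Stand-in bytes for a downloaded archive; the digest is computed locally
    # here purely for illustration and would normally be pinned in the code.
    payload = b"stand-in archive bytes"
    pinned = hashlib.sha256(payload).hexdigest()
    print("intact payload accepted:", sha256_matches(payload, pinned))
    print("tampered payload accepted:", sha256_matches(payload + b"x", pinned))
```

A scheme check like this would reject the current source outright, which is why the maintainers' reply ties any switch to HTTPS to the upstream host supporting it; a strong pinned digest is the usual mitigation when the transport cannot be changed.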
Alternatives
Additional context
No response