lukemelas / PyTorch-Pretrained-ViT

Vision Transformer (ViT) in PyTorch
770 stars 124 forks

General changes and adding of support for more functionality #7

Open arkel23 opened 3 years ago

arkel23 commented 3 years ago

This PR makes the following changes:

  • Added support for the 'H-14' and 'L-16' ViT models.
  • Added support for downloading the models directly from Google's cloud storage.
  • Corrected the JAX-to-PyTorch weight conversion. The previous procedure produced .pth state_dict files without the representation layer, so ViT(..., load_repr_layer=True) raised an error. For inference alone the representation layer is unnecessary, as discussed in the original Vision Transformer paper, but it can be useful for other applications and experiments, so I added download_convert_models.py to first download the required models, convert them with all of their weights, and then let you tune the parameters freely.
  • Added support for visualizing attention by returning the score values from the multi-head self-attention layers. The visualization script was mostly taken from the jeonsworld/ViT-pytorch repository.
  • Added examples for inference (single image) and for fine-tuning/training (on CIFAR-10).
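To illustrate the loading-logic change described above (loading a checkpoint whose classifier head or representation layer does not match the target model), here is a minimal framework-free sketch. The key prefixes `pre_logits` and `fc` are assumptions for illustration, not necessarily the repository's exact identifiers:

```python
# Hypothetical sketch of state_dict key filtering when the checkpoint's
# representation layer ("pre_logits") or classifier head ("fc") should be
# skipped. Prefixes are illustrative assumptions, not the actual API.

def filter_state_dict(checkpoint, model_keys, load_repr_layer=True, load_fc=True):
    """Keep only checkpoint entries that the target model can accept."""
    kept = {}
    for key, value in checkpoint.items():
        if not load_repr_layer and key.startswith("pre_logits"):
            continue  # skip the representation layer (inference-only use)
        if not load_fc and key.startswith("fc"):
            continue  # skip the head (e.g. different number of classes)
        if key in model_keys:
            kept[key] = value
    return kept

checkpoint = {
    "patch_embedding.weight": 1,
    "pre_logits.weight": 2,
    "fc.weight": 3,
}
model_keys = {"patch_embedding.weight", "fc.weight"}
filtered = filter_state_dict(checkpoint, model_keys, load_repr_layer=False)
print(sorted(filtered))  # ['fc.weight', 'patch_embedding.weight']
```

In real use the filtered dict would be passed to `model.load_state_dict(filtered, strict=False)`.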

lukemelas commented 3 years ago

Thank you! This PR looks brilliant. I am excited to review it and merge it -- it might take a bit longer than usual due to the holidays, but I'll get to it soon.

On Tue, Dec 22, 2020 at 7:11 AM Edwin Arkel Rios notifications@github.com wrote:


You can view, comment on, or merge this pull request online at:

https://github.com/lukemelas/PyTorch-Pretrained-ViT/pull/7

Commit Summary

  • changed convert.py, added explore-conversion_21k.py script, added logs for the conversion, added download links to download.sh and configs.py for models that were missing
  • restructured directory and made it so that instead of downloading pth beforehand it directly downloads them to torchhub and then converts them on the fly
  • restructured, deleted jax_to_pytorch and moved to utils.py and made sure that it loads the representation layer
  • deleted jax_to_pytorch and added the py to download the models
  • deleted jax_to_pytorch and combined the relevant files into pytorch_pretrained_vit's utils.py
  • added some inference scripts and some annotations in transformer.py
  • added an example for cifar-10 dataset
  • added files and an example to visualize attention; modified the transformer to return head scores when the parameter visualize=True is given, otherwise functionality stays the same
  • changes to allow for visualization and compatibility with torchsummary, plus an example with CIFAR-10; changed the loading logic so that all layers load correctly regardless of whether the fc layers have a different number of classes and/or a representation layer is loaded, and verified that they load properly
  • Update README.md
  • Update README.md
  • Update README.md
  • Update README.md
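The attention-visualization commits above work by returning the per-head score matrices from self-attention. As a framework-free illustration of the underlying computation (names, shapes, and the single-head simplification here are illustrative, not the repository's actual API):

```python
# Minimal sketch (plain Python, single head) of scaled dot-product
# attention that returns the score matrix alongside the output -- the
# general idea behind exposing attention maps for visualization.
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_with_scores(q, k, v):
    """q, k, v: lists of token vectors. Returns (output, scores)."""
    d = len(q[0])
    scores = []
    for qi in q:
        logits = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        scores.append(softmax(logits))
    out = []
    for row in scores:
        out.append([sum(w * vj[t] for w, vj in zip(row, v))
                    for t in range(len(v[0]))])
    return out, scores

q = [[1.0, 0.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
out, scores = attention_with_scores(q, k, v)
# each row of scores sums to 1 and can be rendered as an attention map
```

Returning `scores` alongside the output, rather than only the output, is what lets a caller overlay the attention weights on the input image patches.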


huananerban commented 2 months ago

@arkel23 I would like to ask why my L-16 pre-trained model still can't be trained; I get the error "Missing keys when loading pretrained weights: []".
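Note that an empty list in that message suggests no keys were actually missing, so the training failure may lie elsewhere. One way to check what really differs is to diff the checkpoint keys against the model keys; a hedged sketch (in real use the two sets would come from `checkpoint.keys()` and `model.state_dict().keys()`):

```python
# Diagnostic sketch: compare checkpoint keys against model keys to see
# which are missing and which are unexpected. Key names are illustrative.

def diff_keys(checkpoint_keys, model_keys):
    missing = sorted(set(model_keys) - set(checkpoint_keys))
    unexpected = sorted(set(checkpoint_keys) - set(model_keys))
    return missing, unexpected

missing, unexpected = diff_keys(
    {"patch_embedding.weight", "fc.weight"},
    {"patch_embedding.weight", "fc.weight", "pre_logits.weight"},
)
print(missing)     # ['pre_logits.weight']
print(unexpected)  # []
```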