facebookresearch / mmf

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
https://mmf.sh/

Checksum of downloaded file does not match the expected checksum. #261

Closed: tonytang731 closed this issue 4 years ago

tonytang731 commented 4 years ago

❓ Questions and Help

We are running our code on Google Colab and hit this issue when trying to load the Hateful Memes data. Have you encountered a similar problem before? Thank you!

!mmf_convert_hm --zip_file=./hateful_memes.zip --password=KexZs4tn8hujn1nK

apsdehal commented 4 years ago

Let us confirm the checksum on our side once again.

tonytang731 commented 4 years ago

Thanks! An image is also included for your reference.

image

apsdehal commented 4 years ago

Hey @tonytang731, I tested on my end and the script seems to work fine. Can you provide the output of two commands so I can debug it further:

If the second command doesn't work, please run pip install openssl first.

tonytang731 commented 4 years ago

The outputs are included as follows. Thank you very much for your time!

image

apsdehal commented 4 years ago

Your checksum is indeed different. Can you re-download the latest file and retry? The SHA256 checksum should be a424c003b7d4ea3f3b089168b5f5ea73b90a3ff043df4b8ff4d7ed87c51cb572.
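For anyone who wants to check a download without openssl, here is a minimal Python sketch that computes a file's SHA-256 and compares it with the value quoted above. The function names `sha256_of_file` and `matches_expected` are illustrative assumptions, not part of MMF, and the path `./hateful_memes.zip` in the usage note is taken from the command in the original report.

```python
import hashlib

# Expected SHA-256 quoted in this thread for the latest hateful_memes.zip.
EXPECTED_SHA256 = "a424c003b7d4ea3f3b089168b5f5ea73b90a3ff043df4b8ff4d7ed87c51cb572"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so large archives do not have to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's digest equals `expected` (case-insensitive)."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_expected("./hateful_memes.zip")` returns `False` for a stale copy of the archive, which is the same condition the conversion script reports as a checksum mismatch.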

lauzadis commented 4 years ago

I am having a similar issue with downloading the visual_bert_pretrained_coco model.

AssertionError: [ Checksum for visual_bert.pretrained.coco.tar.gz from 
https://dl.fbaipublicfiles.com/mmf/data/models/visual_bert/visual_bert.pretrained.coco.tar.gz
does not match the expected checksum. Please try again. ]

SHA256 of the file is as follows:

openssl dgst -sha256 visual_bert.pretrained.coco.tar.gz 

SHA256(visual_bert.pretrained.coco.tar.gz)= 9d809f7aedd7eb596951e32eef1d45f2d25adc5624af52cd828daf103a33b203

apsdehal commented 4 years ago

@mataslauzadis The model was updated on our end. The latest PR #262 should fix this.

ironbar commented 4 years ago

I get the same hash on my computer: 84f15777d9b07f3d4885303e0964e08421aa21335fea2914f45c5b4d0ae40116. The file was downloaded from DrivenData.

Downloading the file again yields a424c003b7d4ea3f3b089168b5f5ea73b90a3ff043df4b8ff4d7ed87c51cb572. My guess is that the dataset has been updated, since the other file was downloaded a few days ago.

apsdehal commented 4 years ago

Not sure how exactly you are downloading it, but I see the same hash every time. I will add a bypass_checksum arg soon. The checksum is there for the user's own protection, so users should be allowed to bypass it.
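To illustrate the idea of an opt-out, here is a rough sketch of how a checksum check with a bypass switch could behave. This is not MMF's actual implementation: `verify_download` and its signature are hypothetical, and only the name `bypass_checksum` comes from this thread.

```python
import hashlib


def verify_download(path, expected_sha256, bypass_checksum=False):
    """Hypothetical helper, not MMF's real code.

    Returns True if the file was verified, False if verification was skipped.
    Raises AssertionError when the digest does not match and no bypass is set.
    """
    if bypass_checksum:
        # The user explicitly opted out, so the file is accepted unverified.
        return False

    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    actual = digest.hexdigest()

    if actual != expected_sha256.lower():
        raise AssertionError(
            f"Checksum of {path} is {actual}, expected {expected_sha256}."
        )
    return True
```

Keeping the check on by default and making the bypass explicit preserves the protection for most users while letting someone with a known-good but differently hashed file proceed.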

Abhiruchi commented 3 years ago

I was also facing the same issue. Adding --bypass_checksum 1 worked for now.