When we run training_pl.py inside a Docker container, the cached dataset files do not work, since they depend on the local path:
File "/usr/local/lib/python3.9/site-packages/datasets/utils/info_utils.py", line 40, in verify_checksums
    raise NonMatchingChecksumError(error_msg + str(bad_urls))
datasets.utils.info_utils.NonMatchingChecksumError: Checksums didn't match for dataset source files:
['https://drive.google.com/u/0/uc?id=0Bz8a_Dbh9QhbaW12WVVZS2drcnM&export=download']
I would suggest saving the train, val, and test splits in the processed folder with torch.save() (which pickles them) and loading them with torch.load().
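A minimal sketch of that suggestion, assuming the splits are picklable objects; the function names, the `processed` directory, and the `.pt` filenames are placeholders, not the repo's actual layout:

```python
import os
import torch

def save_splits(train_ds, val_ds, test_ds, processed_dir="processed"):
    # Persist each split once, so later (containerized) runs can skip
    # the download-and-checksum step entirely.
    os.makedirs(processed_dir, exist_ok=True)
    for name, ds in (("train", train_ds), ("val", val_ds), ("test", test_ds)):
        torch.save(ds, os.path.join(processed_dir, f"{name}.pt"))

def load_splits(processed_dir="processed"):
    # Load the splits back from the processed folder, path-independently.
    return tuple(
        torch.load(os.path.join(processed_dir, f"{name}.pt"))
        for name in ("train", "val", "test")
    )
```

The processed folder could then be mounted into the container as a volume, so the paths baked into the Hugging Face datasets cache never come into play.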