Open Raion-Shin opened 1 month ago
After downloading the .tar.gz part files, combine them into a single archive with the following command: sh -c 'cat mbeir_images.tar.gz.part-00 mbeir_images.tar.gz.part-01 mbeir_images.tar.gz.part-02 mbeir_images.tar.gz.part-03 > mbeir_images.tar.gz'
Next, extract the images from the combined file: tar -xzf mbeir_images.tar.gz
It should not take 2.5 days. I was able to complete the whole process in just 10 hours.
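If disk space is tight, the combine and extract steps above could probably be done in one pass by piping the parts straight into tar, so the combined mbeir_images.tar.gz is never written to disk. This is only a sketch, not something from the M-BEIR docs: extract_parts is a made-up helper name, and it assumes the part files share a common prefix and sort in the right order (part-00, part-01, ...).

```shell
#!/bin/sh
# Sketch: stream split archive parts directly into tar, skipping the
# intermediate combined file. extract_parts is a hypothetical helper name.
extract_parts() {
    prefix=$1        # common prefix of the parts, e.g. mbeir_images.tar.gz.part-
    dest=${2:-.}     # output directory; defaults to the current directory
    mkdir -p "$dest" || return 1
    # The shell expands the glob in sorted order, so part-00 ... part-03
    # are concatenated in the correct sequence before tar reads them.
    cat "$prefix"* | tar -xzf - -C "$dest"
}

# Usage (assumes the four parts are in the current directory):
# extract_parts mbeir_images.tar.gz.part- ./mbeir_images
```

This reads the same bytes as the two-step version, so the extraction itself is unlikely to be faster; it just avoids writing and re-reading one extra copy of the archive.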
I downloaded the .tar.gz file from https://huggingface.co/datasets/TIGER-Lab/M-BEIR, but it's really large, and the pv command shows that I need 2.5 days to extract the file! Can you provide smaller zip files that package each dataset into its own zip file? Thanks very much!