Closed ShaneeyS closed 11 months ago
I also encountered this problem. It was caused by insufficient hard disk space.
The dataset is very big and unpacking it requires even more space. We recommend using a drive with 2 TB of available space (the final product takes up about 1.6 TB).
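If it helps, here is a minimal sketch for checking free space before starting the unshard step. It is not part of the repository; the function name `has_enough_space` and the 2 TB threshold (taken from the recommendation above, in decimal terabytes) are my own assumptions, and it only uses `shutil.disk_usage` from the Python standard library.

```python
import shutil

# Assumption: 2 TB (decimal) of free space, following the recommendation above.
REQUIRED_BYTES = 2 * 10**12


def has_enough_space(path, required_bytes=REQUIRED_BYTES):
    """Return True if the filesystem holding `path` has at least `required_bytes` free.

    `path` must already exist; shutil.disk_usage raises an error otherwise.
    """
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_bytes


# Example: check the current directory. Point this at an existing directory
# on the drive that will hold the output instead.
free_tb = shutil.disk_usage(".").free / 10**12
if has_enough_space("."):
    print(f"OK: {free_tb:.2f} TB free")
else:
    print(f"Probably not enough space: only {free_tb:.2f} TB free")
```

Running this before the multi-hour unshard job should at least rule out the disk-space cause early.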
Hi, when I try to run the following command:
python utils/unshard_memmap.py --input_file ./pythia_deduped_pile_idxmaps/pile_0.87_deduped_text_document-00000-of-00082.bin --num_shards 83 --output_dir ./pythia_pile_idxmaps/
an error is always raised:
pythia_deduped_pile_idxmaps/pile_0.87_deduped_text_document-00023-of-00082.bin
29% | 24/83 [6:09:46<15:01:09, 916.43s/it]
pythia_deduped_pile_idxmaps/pile_0.87_deduped_text_document-00024-of-00082.bin
30% | 25/83 [6:25:14<14:49:06, 919.76s/it]
pythia_deduped_pile_idxmaps/pile_0.87_deduped_text_document-00025-of-00082.bin
31% | 26/83 [6:40:51<14:38:46, 925.03s/it]
pythia_deduped_pile_idxmaps/pile_0.87_deduped_text_document-00026-of-00082.bin
33% | 27/83 [6:56:36<14:28:56, 931.02s/it]
pythia_deduped_pile_idxmaps/pile_0.87_deduped_text_document-00027-of-00082.bin
34% | 28/83 [7:12:14<14:15:21, 933.12s/it]
pythia_deduped_pile_idxmaps/pile_0.87_deduped_text_document-00028-of-00082.bin
35% | 29/83 [7:28:12<14:06:25, 940.47s/it]
pythia_deduped_pile_idxmaps/pile_0.87_deduped_text_document-00029-of-00082.bin
36% | 30/83 [7:44:13<13:56:16, 946.72s/it]
pythia_deduped_pile_idxmaps/pile_0.87_deduped_text_document-00030-of-00082.bin
37% | 31/83 [8:00:12<13:43:41, 950.42s/it]
pythia_deduped_pile_idxmaps/pile_0.87_deduped_text_document-00031-of-00082.bin
Bus error (core dumped)
Could you please tell me how to solve this problem?