princeton-nlp / SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
MIT License
3.37k stars 511 forks

Training code error: OSError #250

Closed shyzzz521 closed 1 year ago

shyzzz521 commented 1 year ago

Supervised training: I always hit the error below when training my own SimCSE model on two-column data. How can I solve this problem?

    memory_mapped_stream = pa.memory_map(filename)
  File "pyarrow/io.pxi", line 1009, in pyarrow.lib.memory_map
  File "pyarrow/io.pxi", line 956, in pyarrow.lib.MemoryMappedFile._open
  File "pyarrow/error.pxi", line 144, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 115, in pyarrow.lib.check_status
OSError: error stat()ing file

Traceback (most recent call last):
  File "train.py", line 597, in <module>
    main()
  File "train.py", line 457, in main
    train_dataset = datasets["train"].map(
  File "/home/jovyan/zhangzhenzhong-zzz/condapy/envs/simcse_train/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 580, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/home/jovyan/zhangzhenzhong-zzz/condapy/envs/simcse_train/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 545, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/home/jovyan/zhangzhenzhong-zzz/condapy/envs/simcse_train/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 3087, in map
    for rank, done, content in Dataset._map_single(**dataset_kwargs):
  File "/home/jovyan/zhangzhenzhong-zzz/condapy/envs/simcse_train/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 3513, in _map_single
    yield rank, True, Dataset.from_file(cache_file_name, info=info, split=shard.split)
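The frames above end in `pa.memory_map(filename)` failing to stat() the cache file that `Dataset.from_file(cache_file_name, ...)` tries to open. As a rough diagnostic sketch (not a fix), the snippet below checks such a path by hand using only the standard library. `diagnose_cache_file` is a hypothetical helper, not part of `datasets`, `pyarrow`, or SimCSE, and it assumes you have already found the cache file path yourself, for example by printing it.

```python
import os
import tempfile


def diagnose_cache_file(path):
    """Report whether `path` can be stat()ed, as memory-mapping requires.

    Hypothetical helper for illustration only; it is not an API of
    datasets, pyarrow, or SimCSE.
    """
    try:
        info = os.stat(path)
    except FileNotFoundError:
        # The file (or one of its parent directories) does not exist.
        return "missing"
    except PermissionError:
        # The current user may not look inside a parent directory.
        return "permission denied"
    except OSError as exc:
        # Any other stat() failure, e.g. a stale network mount.
        return f"stat failed: {exc.strerror}"
    if info.st_size == 0:
        # The file exists but holds no data to map.
        return "empty"
    return "ok"


if __name__ == "__main__":
    # Demonstrate on a throwaway directory rather than a real cache.
    with tempfile.TemporaryDirectory() as tmp:
        good = os.path.join(tmp, "cache.arrow")
        with open(good, "wb") as fh:
            fh.write(b"placeholder bytes")
        print(diagnose_cache_file(good))
        print(diagnose_cache_file(os.path.join(tmp, "not-there.arrow")))
```

If the result is anything other than "ok", the next thing to check would be whether the cache directory is writable and visible to every training process, but that is an assumption about the setup rather than something the traceback confirms.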

github-actions[bot] commented 1 year ago

Stale issue message