princeton-nlp / SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
MIT License
3.36k stars 507 forks

No module named 'datasets' #181

Closed: yiyiyi3124 closed this issue 2 years ago

yiyiyi3124 commented 2 years ago

train.py: line 11: from datasets import load_dataset

Traceback (most recent call last):
  File "train.py", line 11, in <module>
    from datasets import load_dataset
ModuleNotFoundError: No module named 'datasets'

yiyiyi3124 commented 2 years ago

train.py line 307:

    if extension == "txt":
        extension = "text"
    if extension == "csv":
        datasets = load_dataset(extension, data_files=data_files, cache_dir="./data/", delimiter="\t" if "tsv" in data_args.train_file else ",")
    else:
        datasets = load_dataset(extension, data_files=data_files, cache_dir="./data/")

I suppose the second condition should be elif extension == "csv"?
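One way to check this suggestion is to trace which branch each file extension reaches. The sketch below is a stand-in and not the repository's code: `trace_original` and `trace_elif` are hypothetical helpers that mirror the quoted chain and the proposed `elif` variant, and they return the arguments that would be passed to `load_dataset` instead of calling it. Deriving `extension` from the part of the file name after the last dot is also an assumption.

```python
def trace_original(train_file):
    """Mirror the quoted if / if / else chain (hypothetical helper)."""
    extension = train_file.split(".")[-1]  # assumption: extension comes from the file name
    if extension == "txt":
        extension = "text"
    if extension == "csv":
        delimiter = "\t" if "tsv" in train_file else ","
        return (extension, {"delimiter": delimiter})
    else:
        return (extension, {})


def trace_elif(train_file):
    """Same chain with the second `if` changed to `elif` (hypothetical helper)."""
    extension = train_file.split(".")[-1]
    if extension == "txt":
        extension = "text"
    elif extension == "csv":
        delimiter = "\t" if "tsv" in train_file else ","
        return (extension, {"delimiter": delimiter})
    else:
        return (extension, {})
    return None  # reached for .txt files: no branch loads anything


for name in ("train.txt", "train.csv"):
    print(name, trace_original(name), trace_elif(name))
```

Under these assumptions, the original chain sends a `.txt` file through the `else` branch with the format renamed to "text", while the `elif` variant would skip loading for `.txt` files entirely.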

gaotianyu1350 commented 2 years ago

Hi,

The datasets package is one of our dependencies. You can also install it manually with pip install datasets.
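To confirm whether the package is visible to the interpreter that runs train.py, a small check like the following can help. This is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper name, not something from the repository.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("datasets"):
    print("datasets is available")
else:
    print("datasets is missing; install it with: pip install datasets")
```

Run it with the same Python executable used for training, since a package installed in a different environment will still be reported as missing.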