UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Sentence Transformers / datasets dependency. #2898

Open samrandall-blai opened 2 months ago

samrandall-blai commented 2 months ago

If I have a directory named `datasets` in my working directory and run `import sentence_transformers`, I get the following error:


```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/homebrew/Caskroom/miniforge/base/envs/test_embeddings_env/lib/python3.12/site-packages/sentence_transformers/__init__.py", line 7, in <module>
    from sentence_transformers.cross_encoder.CrossEncoder import CrossEncoder
  File "/opt/homebrew/Caskroom/miniforge/base/envs/test_embeddings_env/lib/python3.12/site-packages/sentence_transformers/cross_encoder/__init__.py", line 1, in <module>
    from .CrossEncoder import CrossEncoder
  File "/opt/homebrew/Caskroom/miniforge/base/envs/test_embeddings_env/lib/python3.12/site-packages/sentence_transformers/cross_encoder/CrossEncoder.py", line 18, in <module>
    from sentence_transformers.SentenceTransformer import SentenceTransformer
  File "/opt/homebrew/Caskroom/miniforge/base/envs/test_embeddings_env/lib/python3.12/site-packages/sentence_transformers/SentenceTransformer.py", line 27, in <module>
    from sentence_transformers.model_card import SentenceTransformerModelCardData, generate_model_card
  File "/opt/homebrew/Caskroom/miniforge/base/envs/test_embeddings_env/lib/python3.12/site-packages/sentence_transformers/model_card.py", line 33, in <module>
    from datasets import Dataset, DatasetDict, Value
ImportError: cannot import name 'Dataset' from 'datasets' (unknown location)
```

ir2718 commented 2 months ago

Hi,

This could be because sentence-transformers relies on the Hugging Face `datasets` library, so `from datasets import ...` is picking up your local `datasets` directory instead. Try renaming your directory to something else and see if that works.
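
To confirm which `datasets` Python would resolve before renaming anything, a small diagnostic like the one below may help. It is only a sketch: `describe_module_origin` is a hypothetical helper name, not part of sentence-transformers, and it relies on the standard-library `importlib.util.find_spec`. A plain directory without an `__init__.py` is treated as a namespace package with no origin, which would match the `(unknown location)` in the traceback.

```python
import importlib.util


def describe_module_origin(name: str) -> str:
    """Describe where `import name` would resolve, without importing it.

    Hypothetical helper for illustration only.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return f"{name}: not importable"
    if spec.origin is None:
        # Namespace package: a plain directory with no __init__.py,
        # e.g. a local data folder that happens to be called `datasets`.
        locations = list(spec.submodule_search_locations or [])
        return f"{name}: namespace package (plain directory) at {locations}"
    return f"{name}: regular module at {spec.origin}"


# Run this from the same working directory that triggers the error.
print(describe_module_origin("datasets"))
```

If the output reports a namespace package inside your working directory, the local folder is what the import found, and renaming it should avoid the collision.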

tomaarsen commented 2 months ago

@ir2718 is indeed right. #2858 and #2817 are related issues, and #2859 was created to resolve it (by checking if the datasets that can be imported is indeed the datasets published by Hugging Face).
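
For illustration, one way such a check could look is sketched below. This is not the code from #2859; `is_installed_distribution` is a hypothetical name, and the approach (requiring both a real module origin and installed package metadata) is an assumption about how one might tell the published library apart from a stray local directory.

```python
import importlib.metadata
import importlib.util


def is_installed_distribution(import_name: str, dist_name: str) -> bool:
    """Best-effort check that `import_name` comes from an installed package.

    Hypothetical helper, not the sentence-transformers implementation.
    """
    spec = importlib.util.find_spec(import_name)
    # No spec: nothing importable. No origin: a namespace package, such as
    # a plain local directory that merely shares the name.
    if spec is None or spec.origin is None:
        return False
    try:
        # Raises PackageNotFoundError when no distribution with this name
        # is installed in the current environment.
        importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        return False
    return True


# Guard optional imports on the result instead of on find_spec alone.
if is_installed_distribution("datasets", "datasets"):
    print("datasets looks like an installed package")
else:
    print("datasets is missing or is only a local directory")
```

One limitation of this sketch: a local directory that contains an `__init__.py` and sits next to a genuinely installed `datasets` would still pass, so it narrows the problem rather than ruling it out.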

samrandall-blai commented 2 months ago

Yes, #2859 is a solution to this issue. Thanks!