weaviate / t2v-transformers-models

This is the repo for the container that holds the models for the text2vec-transformers module
BSD 3-Clause "New" or "Revised" License

Investigate persistent `unknown kwargs` log message #24

Open parkerduckworth opened 2 years ago

parkerduckworth commented 2 years ago

During vectorization, the message `Ignored unknown kwarg option direction` is repeatedly logged, seemingly once per vector created.

Investigate this issue, and see if it can be resolved.


Example screenshot: [image: Screen Shot 2022-03-07 at 4 21 06 PM]

parkerduckworth commented 2 years ago

Found a related issue in the Hugging Face transformers repo. Some users reported that upgrading their tokenizers dependency to ~v0.11.5 cleared up this problem. This project relies on tokenizers==0.10.3.

However, others reported that they had to downgrade their transformers dependency to v4.15.0 to fix it. This project relies on transformers==4.16.2, so I'm not sure if we want to do the same.

@antas-marcin any thoughts?

nzaw96 commented 2 years ago

@parkerduckworth How do I upgrade my tokenizer or transformer?

parkerduckworth commented 2 years ago

@nzaw96 for a specific version you would need to use pip/pip3 and specify the target version. For example:

pip3 install tokenizers==0.11.5 

If you just want the latest release:

pip3 install tokenizers

You would likely want to update your requirements.txt as well to match the new version(s), if you have one.
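Before and after upgrading, it can help to confirm which versions are actually installed in the environment. Below is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+). The `PINNED` dictionary and the helper names `installed_version` and `compare_to_pins` are illustrative and not part of this repo; the pinned versions are the ones mentioned earlier in this thread.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions mentioned in this thread (assumption: adjust to whatever you target).
PINNED = {"tokenizers": "0.10.3", "transformers": "4.16.2"}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def compare_to_pins(pins, lookup=installed_version):
    """Map each package name to a tuple (installed, pinned, matches)."""
    report = {}
    for package, pinned in pins.items():
        installed = lookup(package)
        report[package] = (installed, pinned, installed == pinned)
    return report


if __name__ == "__main__":
    for package, (installed, pinned, matches) in compare_to_pins(PINNED).items():
        status = "matches" if matches else "differs from"
        print(f"{package}: installed={installed} {status} pinned={pinned}")
```

Running this in the same environment (or container) that serves the model shows whether the upgrade took effect there, as opposed to in a different Python installation.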

chris-aeviator commented 2 years ago

Does this have any real consequences? I noticed inconsistent ingests (ingestion works fine, but Get queries return only a few results out of many) and I also see this message.

etiennedi commented 2 years ago

The current assumption is that this has no real consequences.

byronvoorbach commented 2 years ago

@parkerduckworth @etiennedi Every post to /vectors/ results in this warning getting logged. Are we planning to upgrade to a new transformers version any time soon?