jina-ai / late-chunking

Code for explaining and evaluating late chunking (chunked pooling)
Apache License 2.0

I got a bug: AttributeError: 'BertModel' object has no attribute 'encode'. Where is the problem? #1

Closed NoobPythoner closed 2 months ago

NoobPythoner commented 2 months ago

AttributeError                            Traceback (most recent call last)
Cell In[13], line 2
      1 # chunk before
----> 2 embeddings_traditional_chunking = model.encode(chunks)
      4 # chunk afterwards (context-sensitive chunked pooling)
      5 inputs = tokenizer(input_text, return_tensors='pt')

File ~/anaconda3/envs/python3.12/lib/python3.12/site-packages/torch/nn/modules/module.py:1729, in Module.__getattr__(self, name)
   1727 if name in modules:
   1728     return modules[name]
-> 1729 raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")

AttributeError: 'BertModel' object has no attribute 'encode'

guenthermi commented 2 months ago

Most likely the error comes from how you load the model. If you load a Jina model with AutoModel.from_pretrained (don't forget trust_remote_code=True), it should have an encode function. Other models probably don't. Loading the model with SentenceTransformer should work as well. Either way, you need a model that uses mean pooling; models that use CLS pooling don't work.
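
For illustration, here is a minimal sketch of the loading pattern described above, with a guard that fails early if the loaded object has no encode method. The helper names has_encode and load_model_with_encode are hypothetical and not part of transformers or this repository; only AutoModel.from_pretrained and its trust_remote_code argument come from the library.

```python
from typing import Any


def has_encode(model: Any) -> bool:
    """Return True if the object exposes a callable `encode` attribute."""
    return callable(getattr(model, "encode", None))


def load_model_with_encode(model_name: str) -> Any:
    """Load a model via AutoModel and check that it provides `encode`.

    Hypothetical helper: `model_name` is whatever Hugging Face model id
    you intend to use; nothing here is specific to one model.
    """
    # Imported lazily so has_encode() can be used without transformers installed.
    from transformers import AutoModel

    # trust_remote_code=True allows transformers to use custom model code
    # shipped with the model repository (where an `encode` method may be defined).
    model = AutoModel.from_pretrained(model_name, trust_remote_code=True)

    if not has_encode(model):
        # Same situation as the reported error, but with a clearer message.
        raise TypeError(
            f"{type(model).__name__} has no encode(); check how the model "
            "is loaded or try loading it with SentenceTransformer instead."
        )
    return model
```

Usage would look like model = load_model_with_encode("jinaai/jina-embeddings-v2-small-en") followed by model.encode(chunks). The model id here is an assumption, since the thread does not say which model was loaded, and the call downloads weights on first use. The guard only checks that encode exists; it does not verify that the model uses mean pooling.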