Closed adri1wald closed 1 year ago
This is definitely not the expected behavior, but you have also swapped the document and query models. It should be:
model_type_or_dir = "naver/efficient-splade-V-large-doc"
q_model_type_or_dir = "naver/efficient-splade-V-large-query"
Here q_model_type_or_dir refers to the query encoder, and model_type_or_dir (the default) refers to the document encoder. Encoding the query with the query encoder should fix the problem.
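To catch this kind of swap early, a small guard along these lines could help. It is only a sketch: `check_splade_roles` is a hypothetical helper, not part of the SPLADE code, and it assumes the checkpoint names end in "-doc" and "-query" as the two efficient-splade names above do.

```python
def check_splade_roles(model_type_or_dir, q_model_type_or_dir):
    """Return a list of warnings if the two checkpoints look swapped.

    Assumption: checkpoint names end in "-doc" or "-query". Names that
    follow another convention pass through without a warning.
    """
    problems = []
    if model_type_or_dir.endswith("-query"):
        problems.append("model_type_or_dir looks like a query encoder")
    if q_model_type_or_dir.endswith("-doc"):
        problems.append("q_model_type_or_dir looks like a document encoder")
    return problems


# Correct assignment: document encoder first, query encoder second.
print(check_splade_roles(
    "naver/efficient-splade-V-large-doc",
    "naver/efficient-splade-V-large-query",
))

# Swapped assignment, as in the original report.
print(check_splade_roles(
    "naver/efficient-splade-V-large-query",
    "naver/efficient-splade-V-large-doc",
))
```

The first call returns an empty list and the second returns two warnings, one per misassigned variable.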
That being said, I would not expect the document encoder to remove all values from this sequence. SPLADE can prune a document down to nothing if it considers that the document has no significant content, but this is not an example I would expect it to empty out.
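For anyone who wants to detect this case in their own pipeline, here is a rough sketch of turning a vocabulary-sized weight vector into a bag-of-words dict and flagging an empty one. The toy vocabulary, the vectors, and the helper names `to_bow` and `is_empty` are made up for illustration; in practice the weights would come from the encoder output and the token strings from the tokenizer.

```python
def to_bow(weights, id_to_token, threshold=0.0):
    """Map each vocabulary entry whose weight exceeds threshold to that weight."""
    return {id_to_token[i]: w for i, w in enumerate(weights) if w > threshold}


def is_empty(weights, threshold=0.0):
    """True when no vocabulary entry has a weight above threshold."""
    return not any(w > threshold for w in weights)


# Toy data (hypothetical): a 4-term vocabulary and two weight vectors.
vocab = ["[PAD]", "search", "query", "quote"]
pruned_vec = [0.0, 0.0, 0.0, 0.0]   # everything pruned away
normal_vec = [0.0, 1.2, 0.7, 0.0]   # two active terms

print(to_bow(normal_vec, vocab))
print(is_empty(pruned_vec), is_empty(normal_vec))
```

Logging the inputs for which `is_empty` is true makes it easier to tell whether an empty representation comes from a misconfigured encoder or from the model genuinely pruning the text.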
Hey @cadurosar cheers for pointing that out! I haven't encountered any zero-dim embeddings since.
In the notebook, I made some modifications and I get back a zero-dimensional embedding. Specifically, I wanted to see the bag-of-words (BoW) representation of a quoted search query using the efficient-splade models. Is it expected for the model to sometimes return zero-dimensional embeddings? Without the quotes, it generates the expected representation.