[Open] RaymondUoE opened this issue 7 months ago
Yes, such errors may happen, as models can have arbitrary arguments. What you suggest here sounds like a good solution for the case where the calling side passes more parameters than the model accepts.
Moreover, there were plans to add a list of supported models to the documentation, which might also be useful here, so that someone who encounters such an error does not have to try model after model.
Bug description

Requires `query_strategy` to be a subclass of `EmbeddingBasedQueryStrategy`, such as `EmbeddingKMeans`; requires `transformer_model` to be a model that does not expect `token_type_ids` in its forward function, such as `distilbert-base-uncased`.
Steps to reproduce

When performing active learning, the model receives an unsupported input `token_type_ids` when creating embeddings.

Expected behavior

The keys of the model input should be adjusted according to the specific model.
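The failure mode can be illustrated in isolation with a hypothetical stand-in class (not the real DistilBERT model): passing a BERT-style tokenizer output that contains `token_type_ids` to a forward function that does not declare that parameter raises a `TypeError`:

```python
# Hypothetical stand-in: a DistilBERT-style forward() accepts input_ids and
# attention_mask but, unlike BERT's, has no token_type_ids parameter.
class DistilBertLikeModel:
    def forward(self, input_ids=None, attention_mask=None):
        return {"last_hidden_state": input_ids}

# A BERT-style tokenizer output includes token_type_ids.
encoded = {
    "input_ids": [[101, 2023, 102]],
    "attention_mask": [[1, 1, 1]],
    "token_type_ids": [[0, 0, 0]],
}

model = DistilBertLikeModel()
try:
    model.forward(**encoded)
except TypeError as exc:
    print(exc)  # unexpected keyword argument 'token_type_ids'
```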
Cause:

In file `small_text/integrations/transformers/classifiers/classification.py`, function `_create_embeddings`: the code needs to be changed to remove the `token_type_ids` field if the seed model does not expect `token_type_ids` in its forward function.

Environment:
Python version: 3.11.7
small-text version: 1.3.3
small-text integrations (e.g., transformers): transformers 4.36.2
PyTorch version: 2.1.2
PyTorch-cuda: 11.8
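The adjustment described under "Cause" can be sketched by inspecting the model's forward signature and dropping any encoded fields it does not accept. This is only a sketch, not the actual small-text patch; `filter_inputs_for_model` and the stand-in `forward` are hypothetical names:

```python
import inspect

def filter_inputs_for_model(forward_fn, encoded):
    """Hypothetical helper: keep only the keys that forward_fn accepts."""
    params = inspect.signature(forward_fn).parameters
    # Models whose forward() takes **kwargs accept everything as-is.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(encoded)
    return {k: v for k, v in encoded.items() if k in params}

# Stand-in for a DistilBERT-style forward() without token_type_ids.
def forward(input_ids=None, attention_mask=None):
    return input_ids

encoded = {
    "input_ids": [[101, 2023, 102]],
    "attention_mask": [[1, 1, 1]],
    "token_type_ids": [[0, 0, 0]],
}

filtered = filter_inputs_for_model(forward, encoded)
print(sorted(filtered))  # ['attention_mask', 'input_ids']
```

Filtering by signature (rather than hard-coding `token_type_ids`) also handles other model-specific inputs, while the `**kwargs` check keeps permissive models unaffected.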