jaketae / ensemble-transformers

Ensembling Hugging Face transformers made easy
MIT License

ValueError: Unable to auto-detect preprocessor class #1

Closed raquibaS closed 1 year ago

raquibaS commented 2 years ago

Describe the bug I am trying to use ensemble learning in my project, following your GitHub article. However, I am getting the following error:

To Reproduce The error is raised on the last line, when I call EnsembleModelForSequenceClassification.from_multiple_pretrained(). Here is my code:

from transformers import pipeline  # pipeline comes from Hugging Face transformers
import ensemble_transformers
from ensemble_transformers import EnsembleModelForSequenceClassification

sentiment_pipeline = pipeline("sentiment-analysis", model = "finiteautomata/bertweet-base-sentiment-analysis", max_length=128, padding=True)
sent_pipeline = pipeline("text-classification", model = "bhadresh-savani/distilbert-base-uncased-emotion", max_length=128, padding=True)
text_pipeline = pipeline("text-classification", model = "textattack/distilbert-base-uncased-CoLA", max_length=128, padding=True)
ensemble = EnsembleModelForSequenceClassification.from_multiple_pretrained(sentiment_pipeline, sent_pipeline, text_pipeline)

Error


ValueError                                Traceback (most recent call last)
<ipython-input-...> in <module>()
      5 sent_pipeline = pipeline("text-classification", model = "bhadresh-savani/distilbert-base-uncased-emotion", max_length=128, padding=True)
      6 text_pipeline = pipeline("text-classification", model = "textattack/distilbert-base-uncased-CoLA", max_length=128, padding=True)
----> 7 ensemble = EnsembleModelForSequenceClassification.from_multiple_pretrained(sentiment_pipeline, sent_pipeline, text_pipeline)

4 frames

/content/ensemble-transformers/ensemble_transformers/base.py in from_multiple_pretrained(cls, weights, *model_names, **kwargs)
     44         _, suffix = class_name.split("For")
     45         auto_class = f"AutoModelFor{suffix}"
---> 46         config = EnsembleConfig(auto_class, model_names, weights=weights)
     47         return cls(config, **kwargs)
     48

/content/ensemble-transformers/ensemble_transformers/config.py in __init__(self, auto_class, model_names, weights, *args, **kwargs)
     41         except AttributeError:
     42             raise ImportError(f"Failed to import {auto_class} from Hugging Face transformers.")
---> 43         preprocessor_classes = check_modalities(model_names)
     44         if len(preprocessor_classes) > 1:
     45             raise ValueError("Cannot ensemble models of different modalities.")

/content/ensemble-transformers/ensemble_transformers/config.py in check_modalities(model_names)
     19
     20 def check_modalities(model_names: List[str]) -> set:
---> 21     return set([detect_preprocessor_from_model_name(model_name) for model_name in model_names])
     22
     23

/content/ensemble-transformers/ensemble_transformers/config.py in <listcomp>(.0)
     19
     20 def check_modalities(model_names: List[str]) -> set:
---> 21     return set([detect_preprocessor_from_model_name(model_name) for model_name in model_names])
     22
     23

/content/ensemble-transformers/ensemble_transformers/config.py in detect_preprocessor_from_model_name(model_name)
     14             continue
     15         raise ValueError(
---> 16             "Unable to auto-detect preprocessor class. Please consider opening an issue at https://github.com/jaketae/ensemble-transformers/issues."
     17         )
     18

ValueError: Unable to auto-detect preprocessor class. Please consider opening an issue at https://github.com/jaketae/ensemble-transformers/issues.

Additional context Please kindly help me to resolve this issue.

jaketae commented 2 years ago

Hello @raquibaS, thank you for opening this issue, and apologies for getting back to you this late. I'll take a look at it and keep you updated on this issue. Thanks!

raquibaS commented 2 years ago

Thank you for your kind reply, @jaketae. Looking forward to hearing from you.

raquibaS commented 2 years ago

Hi @jaketae. Any update on this issue?

jaketae commented 1 year ago

Hello @raquibaS, apologies for the late reply. I've finally found some time to work on this package.

It appears that you were initializing pipeline objects and then passing those into Ensemble Transformers. This isn't quite the intended way of initializing an ensemble. Instead of ensembling pipelines, you can ensemble the models directly via their model names (strings). Here is an example.

>>> from ensemble_transformers import EnsembleModelForSequenceClassification
>>> ensemble = EnsembleModelForSequenceClassification.from_multiple_pretrained(
...     "finiteautomata/bertweet-base-sentiment-analysis",
...     "bhadresh-savani/distilbert-base-uncased-emotion",
...     "textattack/distilbert-base-uncased-CoLA",
... )
>>> ensemble(["hi", "there"])
EnsembleModelOutput(
        logits: [tensor([[-2.2536,  2.0728,  0.0350],
        [-1.9375,  2.1718, -0.5151]], grad_fn=<AddmmBackward0>), tensor([[-1.2181,  1.0694, -2.6189,  1.5066,  1.5004, -1.9101],
        [-0.6475,  0.7880, -2.0168,  1.5216,  0.6411, -1.5499]],
       grad_fn=<AddmmBackward0>), tensor([[-0.8730,  0.8156],
        [-0.8558,  0.8904]], grad_fn=<AddmmBackward0>)],
)
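
As a side note, here is a rough sketch of one way to work with that output. It assumes the logits field printed above is accessible as an attribute on the returned EnsembleModelOutput, and it handles each model's logits separately rather than averaging them, since the three models have different label spaces (3 sentiment labels, 6 emotion labels, and 2 CoLA labels).

import torch
from ensemble_transformers import EnsembleModelForSequenceClassification

# Same ensemble as above, built from model names (strings) rather than pipelines.
ensemble = EnsembleModelForSequenceClassification.from_multiple_pretrained(
    "finiteautomata/bertweet-base-sentiment-analysis",
    "bhadresh-savani/distilbert-base-uncased-emotion",
    "textattack/distilbert-base-uncased-CoLA",
)

output = ensemble(["hi", "there"])

# Assumption: `output.logits` is the list of per-model logit tensors shown above.
# Each model predicts over a different label set, so take per-model argmaxes
# instead of averaging logits across models.
for model_logits in output.logits:
    probs = torch.softmax(model_logits, dim=-1)  # class probabilities for this model
    print(probs.argmax(dim=-1).tolist())         # predicted class index per input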

I've also updated the package to 0.0.2 with some miscellaneous improvements, so if you'd like, please give it a try! Thanks, and happy holidays.
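
If you installed from PyPI, upgrading should just be the usual one-liner (assuming the PyPI package name matches the repository name):

pip install --upgrade ensemble-transformers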

jaketae commented 1 year ago

Closing this for now. Please feel free to reopen this issue or create a new one if you have further inquiries. Thank you!