ThilinaRajapakse / simpletransformers

Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
https://simpletransformers.ai/
Apache License 2.0
4.1k stars 728 forks

model.predict() fails with DeBERTa #1417

Open fsiano opened 2 years ago

fsiano commented 2 years ago

Describe the bug

I noticed surprising behavior in model.predict() with DeBERTa. I successfully fine-tuned a DeBERTa-base model for regression with sliding windows and then tried to use the fine-tuned model for prediction. I received the following error, which I suspect is related to the handling of sliding windows:

ValueError: could not broadcast input array from shape (6,) into shape (6,1)

Please note that I have utilized the same code with (similarly) fine-tuned BERT and RoBERTa models with no issues.
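For readers unfamiliar with this message, here is a minimal NumPy sketch that triggers the same kind of error outside simpletransformers. It rests on an assumption that the thread does not confirm: that the failure comes from writing a 1-D batch of regression outputs into a 2-D (batch, 1) buffer. The names `preds` and `batch_logits` are hypothetical and are not taken from the library source.

```python
import numpy as np

batch_size = 6

# Hypothetical stand-ins: a preallocated (batch, num_labels) buffer and a
# flattened batch of regression outputs with shape (batch,).
preds = np.empty((batch_size, 1))
batch_logits = np.arange(batch_size, dtype=float)

error_message = None
try:
    # A (6,) array cannot be broadcast into a (6, 1) destination.
    preds[0:batch_size] = batch_logits
except ValueError as err:
    error_message = str(err)
    print(error_message)

# Restoring the trailing axis makes the shapes compatible.
preds[0:batch_size] = batch_logits.reshape(-1, 1)
print(preds.shape)
```

If this assumption holds, the DeBERTa regression path returns outputs without the trailing axis that the BERT and RoBERTa paths keep, which would explain why only DeBERTa fails.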

To Reproduce

model = ClassificationModel('deberta', 'outputs/', num_labels=1, args={'regression': True, 'sliding_window': True, 'eval_batch_size': 6, 'train_batch_size': 6, 'max_seq_length': 512, 'learning_rate': 1e-6, 'num_train_epochs': 1, 'reprocess_input_data': False, 'overwrite_output_dir': False})

predictions, raw_outputs = model.predict(df_test['text'][e].reset_index(drop=True))


azizcu commented 2 years ago

ValueError: could not broadcast input array from shape (8,) into shape (8,1)

Yes, I get the same error. The same settings work for BERT and RoBERTa but not for DeBERTa. The problem occurs only for the regression task; the classification task works fine. Please help! @ThilinaRajapakse

yyyang-2019 commented 2 years ago

Same for me when calling eval_model with DeBERTa:


ValueError Traceback (most recent call last)

---> result, model_outputs, wrong_predictions = model.eval_model(valid_df)

ValueError: could not broadcast input array from shape (8,) into shape (8,1)

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.