ConvLab / ConvLab-3

[BUG] AttributeError: module '__main__' has no attribute '__spec__' #115

Closed. AtheerAlgherairy closed this issue 1 year ago.

AtheerAlgherairy commented 1 year ago

I trained a t5-small model on a dataset in the ConvLab unified format. But when it comes to testing (`--do_predict`), I got the following error:

(screenshots of the full traceback, ending in `AttributeError: module '__main__' has no attribute '__spec__'`)

zqwerty commented 1 year ago

What is the version of `datasets`?

AtheerAlgherairy commented 1 year ago

> What is the version of `datasets`?

It is a normalized version of SGD: normalized_sgd. First, I normalized the domain and slot names of the original version of SGD. Second, I converted it to the ConvLab unified format using preprocess.py (under the sgd dataset folder). I already used this normalized_sgd version with JointBERT, and now I am trying to use T5. The training is done and I have the model, but I was trying to run the evaluation code.
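(For context: the usual conversion step, assuming the standard ConvLab-3 layout, is to run the dataset's `preprocess.py` in place, e.g. `cd data/unified_datasets/sgd && python preprocess.py`, which should write the unified-format data archive next to the script.)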

zqwerty commented 1 year ago

I mean the version of the `datasets` Python package. Was the training process successful?
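(For reference, the installed version can be checked with `pip show datasets` or `python -c "import datasets; print(datasets.__version__)"`.)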

AtheerAlgherairy commented 1 year ago

> I mean the version of the `datasets` Python package. Was the training process successful?

I tried to repeat the training process with `speaker=user`, but I got the same error:

(screenshots of the traceback showing the same error)

AtheerAlgherairy commented 1 year ago

Name: datasets Version: 2.7.1

AtheerAlgherairy commented 1 year ago

I deleted `__spec__` in `main_mod_name = getattr(main_module.__spec__, "name", None)` (in the anaconda3\lib\site-packages\multiprocess\spawn.py file),

but I got another error when testing (`--do_predict`):

(screenshot of the new traceback)
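For anyone who hits the same `__spec__` error from a Jupyter kernel: instead of editing `multiprocess/spawn.py` inside site-packages, a less invasive workaround (a sketch only, untested here) is to give `__main__` a `__spec__` attribute before the spawning code runs, for example at the top of the notebook:

```python
# Notebook-level workaround sketch (not part of ConvLab-3): multiprocess's
# spawn helper reads main_module.__spec__, which a Jupyter kernel's __main__
# module does not define. Setting it to None makes the getattr(...) call
# return None instead of raising AttributeError.
import __main__

if not hasattr(__main__, "__spec__"):
    __main__.__spec__ = None
```

This keeps the installed packages untouched and is easy to revert.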

zqwerty commented 1 year ago

> I tried to repeat the training process with `speaker=user`, but I got the same error.

I think one possible reason is that your environment changed after your first training. Recently I trained a T5NLU model using the following setting:

I think you can re-install the packages with these versions. `ConvLabSeq2SeqTrainer` inherits from the Hugging Face Transformers `Seq2SeqTrainer`, so the version of Transformers may matter.

AtheerAlgherairy commented 1 year ago

I installed the following packages:

(screenshot of the installed package versions)

But I still can't do the evaluation for T5. I got the error: `AttributeError: 'ConvLabSeq2SeqTrainer' object has no attribute '_max_length'`

(screenshot of the traceback)

AtheerAlgherairy commented 1 year ago

Can I set `max_length` to 512 manually in the code to avoid this error?

zqwerty commented 1 year ago

That's really weird. In the source code of `Seq2SeqTrainer` [link], `self._max_length` is assigned in the `evaluate` and `predict` functions. In run_seq2seq.py, `trainer.predict` is called, so when `prediction_step` is called, `self._max_length` should already be assigned.

Could you paste the full error message?

> Can I set `max_length` to 512 manually in the code to avoid this error?

You can. But the problem you are hitting is quite unusual, so finding out the real reason would be better.
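For reference, in the Transformers versions ConvLab-3 pins (<= 4.24), `Seq2SeqTrainer.predict`/`evaluate` accept `max_length`/`num_beams` and stash them on `self._max_length`/`self._num_beams` before `prediction_step` runs. A minimal, self-contained sketch of that behaviour (a toy dataset, not the ConvLab run_seq2seq.py pipeline) would look roughly like this:

```python
# Sketch only (assumes transformers <= 4.24): shows how max_length/num_beams
# passed to Seq2SeqTrainer.predict() are stored on the trainer before
# prediction_step reads self._max_length / self._num_beams.
from datasets import Dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          Seq2SeqTrainer, Seq2SeqTrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def tokenize(batch):
    enc = tokenizer(batch["source"], truncation=True)
    enc["labels"] = tokenizer(batch["target"], truncation=True)["input_ids"]
    return enc

toy_test = Dataset.from_dict(
    {"source": ["translate English to German: hello"], "target": ["hallo"]}
).map(tokenize, batched=True)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="tmp_t5_sketch",
                                  predict_with_generate=True),
    tokenizer=tokenizer,
)

# In transformers <= 4.24 this call sets trainer._max_length and
# trainer._num_beams before delegating to Trainer.predict().
results = trainer.predict(toy_test, metric_key_prefix="predict",
                          max_length=512, num_beams=4)
print(results.predictions.shape)
```

If those attributes are missing when `prediction_step` runs, the `predict`/`evaluate` path that sets them was bypassed or behaves differently in the installed Transformers version.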

AtheerAlgherairy commented 1 year ago

The following are framework versions for the trained model in my output:

This is the full error message when I do the evaluation:

(screenshots of the full traceback)

AtheerAlgherairy commented 1 year ago

The error also occurs for the `_num_beams` attribute.

zqwerty commented 1 year ago

The reason is that Transformers 4.25.1 changed the functions of `Seq2SeqTrainer`, see https://github.com/huggingface/transformers/blob/main/src/transformers/trainer_seq2seq.py#L30

You can downgrade your transformers version to align with mine. In setup.py, we require 'transformers>=4.17.0,<=4.24.0'.
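For example, something like `pip install "transformers>=4.17.0,<=4.24.0"` (or simply pinning one version in that range, e.g. `pip install transformers==4.24.0`) in the same environment should satisfy that constraint.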

AtheerAlgherairy commented 1 year ago

So I need to downgrade the transformers version and repeat the training process?

zqwerty commented 1 year ago

I think if your previous training was successful, you don't need to repeat it.

zqwerty commented 1 year ago

Are you using Colab all the time? I think that would cause an unstable environment, since it needs to re-install the packages every time it restarts.

AtheerAlgherairy commented 1 year ago

No. Jupyter notebook.

AtheerAlgherairy commented 1 year ago

I downgraded transformers to 4.20.1 and ran the evaluation. I got the prediction results now, and also the following error:

(screenshots of the traceback from `trainer.create_model_card`)

zqwerty commented 1 year ago

That doesn't matter. The function `trainer.create_model_card` is used to automatically create a README.md as a model card for the Hugging Face hub. The error is raised because of an unrecognized `model_name_or_path` on the Hugging Face hub (in your case, the `output_dir` is not a valid one).
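If the model-card step keeps failing, one option (a sketch only, not ConvLab-3's actual code) is to guard that call inside run_seq2seq.py, since the predictions and metrics have already been written at that point:

```python
# Hypothetical guard around the model-card step in run_seq2seq.py.
# `trainer` and `kwargs` are whatever the script already defines; the model
# card / README.md is optional, so a hub lookup failure here is harmless.
try:
    trainer.create_model_card(**kwargs)
except Exception as exc:
    print(f"Skipping model card creation: {exc}")
```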

AtheerAlgherairy commented 1 year ago

Many thanks!

zqwerty commented 1 year ago

You are welcome!