Hey! What version of mario-gpt are you running? Can you try a pip install mario-gpt --upgrade?
The system's Python version is 3.10, and the program is the latest version downloaded from the site. The first step ran smoothly without any problems, as shown in the figure, but the error above appears at the second step.
Can I see the full stacktrace? Because from what I see above it looks like the error is coming from:
mario_lm = MarioLM(lm_path=BASE, tokenizer_path=BASE)
But below it looks like it's working? It doesn't really look like an issue with the trainer / training config.
I ran the code in a new clean workspace:
>>> import torch
>>> from mario_gpt import MarioDataset, MarioLM, TrainingConfig, MarioGPTTrainer
>>> BASE = "distilgpt2"
>>> mario_lm = MarioLM(lm_path=BASE, tokenizer_path=BASE)
Using distilgpt2 lm
/home/shyam/miniconda3/envs/py39/lib/python3.9/site-packages/transformers/models/auto/modeling_auto.py:1352: FutureWarning: The class `AutoModelWithLMHead` is deprecated and will be removed in a future version. Please use `AutoModelForCausalLM` for causal language models, `AutoModelForMaskedLM` for masked language models and `AutoModelForSeq2SeqLM` for encoder-decoder models.
warnings.warn(
Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at distilgpt2 and are newly initialized: ['transformer.h.0.crossattention.c_attn.weight', 'transformer.h.3.crossattention.c_attn.weight', 'transformer.h.4.crossattention.bias', 'transformer.h.5.crossattention.bias', 'transformer.h.2.crossattention.q_attn.weight', 'transformer.h.3.ln_cross_attn.weight', 'transformer.h.2.crossattention.c_proj.weight', 'transformer.h.2.crossattention.c_proj.bias', 'transformer.h.2.ln_cross_attn.weight', 'transformer.h.5.crossattention.c_proj.bias', 'transformer.h.3.crossattention.c_proj.bias', 'transformer.h.0.crossattention.c_proj.bias', 'transformer.h.5.crossattention.c_proj.weight', 'transformer.h.5.ln_cross_attn.weight', 'transformer.h.3.crossattention.masked_bias', 'transformer.h.1.crossattention.c_proj.weight', 'transformer.h.5.crossattention.c_attn.weight', 'transformer.h.1.crossattention.masked_bias', 'transformer.h.1.crossattention.c_proj.bias', 'transformer.h.3.crossattention.c_proj.weight', 'transformer.h.0.ln_cross_attn.weight', 'transformer.h.1.crossattention.bias', 'transformer.h.3.crossattention.bias', 'transformer.h.5.crossattention.masked_bias', 'transformer.h.5.crossattention.q_attn.weight', 'transformer.h.1.crossattention.q_attn.weight', 'transformer.h.1.crossattention.c_attn.weight', 'transformer.h.4.crossattention.q_attn.weight', 'transformer.h.0.crossattention.bias', 'transformer.h.3.crossattention.q_attn.weight', 'transformer.h.0.crossattention.masked_bias', 'transformer.h.4.crossattention.c_proj.bias', 'transformer.h.4.crossattention.c_attn.weight', 'transformer.h.2.crossattention.bias', 'transformer.h.0.crossattention.c_proj.weight', 'transformer.h.4.crossattention.c_proj.weight', 'transformer.h.2.crossattention.masked_bias', 'transformer.h.1.ln_cross_attn.weight', 'transformer.h.0.crossattention.q_attn.weight', 'transformer.h.4.ln_cross_attn.weight', 'transformer.h.4.crossattention.masked_bias', 'transformer.h.2.crossattention.c_attn.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Using distilgpt2 tokenizer
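The error then shows up at the second step, setting up training, which looks roughly like this (a sketch based on the project's README; the save_iteration value and the train() arguments are assumptions and may differ between releases):

```python
from mario_gpt import MarioDataset, MarioLM, TrainingConfig, MarioGPTTrainer

BASE = "distilgpt2"
mario_lm = MarioLM(lm_path=BASE, tokenizer_path=BASE)

# Build the level dataset with the model's tokenizer
dataset = MarioDataset(mario_lm.tokenizer)

# Training config and trainer -- this is where the error surfaces
config = TrainingConfig(save_iteration=10)
trainer = MarioGPTTrainer(mario_lm, dataset, config=config)

trainer.train(100, batch_size=1)
```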
Can you try doing a pip uninstall mario-gpt and then re-running the python setup.py install step again?
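That is, something like the following (the repo URL is assumed to be the project's GitHub; adjust if you installed from a local checkout):

```bash
pip uninstall mario-gpt
git clone https://github.com/shyamsn97/mario-gpt.git
cd mario-gpt
python setup.py install
```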
I will redeploy according to your suggestion. The main problem before was with mario_lm = MarioLM(lm=BASE, tokenizer=BASE); the second problem is with TrainingConfig and MarioGPTTrainer.
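For what it's worth, the working snippet earlier in this thread passes lm_path/tokenizer_path rather than lm/tokenizer, so matching those keyword names may resolve the first problem (a sketch reusing the names from that snippet; older releases may have accepted different names):

```python
from mario_gpt import MarioLM

BASE = "distilgpt2"
# lm_path / tokenizer_path are the keyword names used in the
# working snippet above, not lm / tokenizer
mario_lm = MarioLM(lm_path=BASE, tokenizer_path=BASE)
```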
Ah, looks like accelerate changed their API. I'll update it!
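For context, the kind of version guard involved looks roughly like this; the specific kwarg here is an assumption, not confirmed from this thread (logging_dir was renamed to project_dir around accelerate 0.17):

```python
import accelerate
from accelerate import Accelerator

# Assumption: the breaking change is a renamed constructor kwarg;
# logging_dir -> project_dir is one such rename in accelerate ~0.17+.
major, minor = (int(p) for p in accelerate.__version__.split(".")[:2])
if (major, minor) >= (0, 17):
    accelerator = Accelerator(project_dir="logs")  # newer API
else:
    accelerator = Accelerator(logging_dir="logs")  # older API (<= 0.16)
```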
Is the relevant modification complete? I look forward to your revision and hope to continue debugging with your changes. Thank you.
Should be fixed now! Let me know if you still have errors
With your revision, everything now runs successfully on my machine. Can you give a general description of the files you modified? And how can the content generated by training best be used for fine-tuning the model? For reference: before you provided the fix, after your reminder I was able to run the original code normally by using the earlier version, accelerate==0.16.0.
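For anyone else hitting this before upgrading mario-gpt, pinning the older release works as a stopgap, per the above:

```bash
pip install "accelerate==0.16.0"
```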
Can you give some specific suggestions?