shyamsn97 / mario-gpt

[NeurIPS 2023] Generating Mario Levels with GPT2. Code for the paper "MarioGPT: Open-Ended Text2Level Generation through Large Language Models" https://arxiv.org/abs/2302.05981
https://huggingface.co/shyamsn97/Mario-GPT2-700-context-length
MIT License

mario_lm = MarioLM(lm=BASE, tokenizer=BASE) raises an error on these parameters #20

Open 54457616 opened 1 year ago

54457616 commented 1 year ago
1. TrainingConfig and MarioGPTTrainer cannot be used.
2. mario_lm = MarioLM(lm=BASE, tokenizer=BASE) raises an error on these keyword arguments. Is there no dedicated training script?

Can you give some specific suggestions?

[Screenshot 2023-06-30 232140] [Screenshot 2023-06-30 232151]

shyamsn97 commented 1 year ago

Hey! What version of mario-gpt are you running? Can you try a pip install mario-gpt --upgrade?

54457616 commented 1 year ago

[Screenshot 2023-07-01 000046] [Screenshot 2023-07-01 000108] [Screenshot 2023-07-01 000116] [Screenshot 2023-07-01 000133]

The system Python version is 3.10, and the package is the latest version downloaded from the site. The first step ran smoothly without any problems, as shown in the screenshots, and the error above appears at the second step.

shyamsn97 commented 1 year ago

Can I see the full stacktrace? Because from what I see above it looks like the error is coming from:

mario_lm = MarioLM(lm_path=BASE, tokenizer_path=BASE)

But below it looks like it's working? Doesn't really look like an issue with the trainer / training config.
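
For reference, a minimal side-by-side of the two calls discussed in this thread (a sketch only; the keyword names are taken from the issue title and from the reproduction below, and the exact failure depends on the installed mario-gpt version):

```python
from mario_gpt import MarioLM

BASE = "distilgpt2"

# Call from the issue title, which fails on the reporter's install:
# mario_lm = MarioLM(lm=BASE, tokenizer=BASE)

# Call used in the reproduction below, which loads the model and tokenizer:
mario_lm = MarioLM(lm_path=BASE, tokenizer_path=BASE)
```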

I ran the code in a new clean workspace:

>>> import torch
>>> from mario_gpt import MarioDataset, MarioLM, TrainingConfig, MarioGPTTrainer
>>> BASE = "distilgpt2"
>>> mario_lm = MarioLM(lm_path=BASE, tokenizer_path=BASE)
Using distilgpt2 lm
/home/shyam/miniconda3/envs/py39/lib/python3.9/site-packages/transformers/models/auto/modeling_auto.py:1352: FutureWarning: The class `AutoModelWithLMHead` is deprecated and will be removed in a future version. Please use `AutoModelForCausalLM` for causal language models, `AutoModelForMaskedLM` for masked language models and `AutoModelForSeq2SeqLM` for encoder-decoder models.
  warnings.warn(
Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at distilgpt2 and are newly initialized: ['transformer.h.0.crossattention.c_attn.weight', 'transformer.h.3.crossattention.c_attn.weight', 'transformer.h.4.crossattention.bias', 'transformer.h.5.crossattention.bias', 'transformer.h.2.crossattention.q_attn.weight', 'transformer.h.3.ln_cross_attn.weight', 'transformer.h.2.crossattention.c_proj.weight', 'transformer.h.2.crossattention.c_proj.bias', 'transformer.h.2.ln_cross_attn.weight', 'transformer.h.5.crossattention.c_proj.bias', 'transformer.h.3.crossattention.c_proj.bias', 'transformer.h.0.crossattention.c_proj.bias', 'transformer.h.5.crossattention.c_proj.weight', 'transformer.h.5.ln_cross_attn.weight', 'transformer.h.3.crossattention.masked_bias', 'transformer.h.1.crossattention.c_proj.weight', 'transformer.h.5.crossattention.c_attn.weight', 'transformer.h.1.crossattention.masked_bias', 'transformer.h.1.crossattention.c_proj.bias', 'transformer.h.3.crossattention.c_proj.weight', 'transformer.h.0.ln_cross_attn.weight', 'transformer.h.1.crossattention.bias', 'transformer.h.3.crossattention.bias', 'transformer.h.5.crossattention.masked_bias', 'transformer.h.5.crossattention.q_attn.weight', 'transformer.h.1.crossattention.q_attn.weight', 'transformer.h.1.crossattention.c_attn.weight', 'transformer.h.4.crossattention.q_attn.weight', 'transformer.h.0.crossattention.bias', 'transformer.h.3.crossattention.q_attn.weight', 'transformer.h.0.crossattention.masked_bias', 'transformer.h.4.crossattention.c_proj.bias', 'transformer.h.4.crossattention.c_attn.weight', 'transformer.h.2.crossattention.bias', 'transformer.h.0.crossattention.c_proj.weight', 'transformer.h.4.crossattention.c_proj.weight', 'transformer.h.2.crossattention.masked_bias', 'transformer.h.1.ln_cross_attn.weight', 'transformer.h.0.crossattention.q_attn.weight', 'transformer.h.4.ln_cross_attn.weight', 'transformer.h.4.crossattention.masked_bias', 'transformer.h.2.crossattention.c_attn.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Using distilgpt2 tokenizer

Can you try doing a pip uninstall mario-gpt and then re-running the python setup.py install step?

54457616 commented 1 year ago

I will redeploy according to your suggestion. The main problem before was mario_lm = MarioLM(lm=BASE, tokenizer=BASE); the second issue is with TrainingConfig and MarioGPTTrainer.

54457616 commented 1 year ago

[Screenshot 2023-07-01 215806]

1. The package has been uninstalled.

[Screenshot 2023-07-01 223251]

2. The package installs successfully.

[Screenshot 2023-07-01 224932]

3. Sampling runs correctly (see the sampling sketch after this list).

[Screenshot 2023-07-01 225520]

4. The code before the training step runs normally.

[Screenshot 2023-07-01 230040]

5. The error appears once training starts.

[Screenshot 2023-07-01 230120] [Screenshot 2023-07-01 230143]
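
For context on item 3, the sampling step referred to above corresponds roughly to the README-style usage below (a sketch; the prompt text and generation arguments are assumptions based on the repo README, not part of this thread):

```python
from mario_gpt import MarioLM

# pretrained checkpoint from the Hugging Face hub (the repo's default)
mario_lm = MarioLM()

prompts = ["many pipes, many enemies, some blocks, high elevation"]

# generate a level column-by-column from the text prompt
generated_level = mario_lm.sample(
    prompts=prompts,
    num_steps=1400,
    temperature=2.0,
    use_tqdm=True,
)
```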

shyamsn97 commented 1 year ago

Ah, looks like accelerate changed their API. I'll update it!

54457616 commented 1 year ago

Has the relevant modification been completed? I look forward to your revision and hope to continue debugging with your changes. Thank you.

shyamsn97 commented 1 year ago

Should be fixed now! Let me know if you still have errors

54457616 commented 1 year ago

With your revision, my local run now completes successfully. Can you give a general description of the files you modified? And how can the output generated by training be used to fine-tune the model more effectively? Also, before you provided the fix and following your hint, the original code ran normally for me when I used the earlier accelerate==0.16.0 release.

[Screenshot 2023-08-03 231650]
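
For anyone following along, the TrainingConfig / MarioGPTTrainer usage discussed above looks roughly like the repo's training example (a sketch; the imports and the MarioLM call are taken from this thread, while the dataset, config, and trainer arguments are assumed from the repo's training example and may differ between versions):

```python
from mario_gpt import MarioDataset, MarioLM, TrainingConfig, MarioGPTTrainer

BASE = "distilgpt2"

# fresh distilgpt2 backbone to fine-tune on level data
mario_lm = MarioLM(lm_path=BASE, tokenizer_path=BASE)

# tokenized Super Mario Bros level dataset bundled with the repo
dataset = MarioDataset(mario_lm.tokenizer)

# trainer wraps Hugging Face accelerate; checkpoints are written periodically
config = TrainingConfig(save_iteration=10)
trainer = MarioGPTTrainer(mario_lm, dataset, config=config)

# short run for smoke-testing; increase the step count for real fine-tuning
trainer.train(100, batch_size=1)
```

As the comments above note, pinning accelerate==0.16.0 was a workaround for the pre-fix code; after the fix, the thread reports the trainer runs as-is.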