johnnymcmike / Gravital

A Discord AI chatbot that uses GPT-2 and aitextgen for fast, believable responses, and that you can train on your own Discord server's message history
MIT License
34 stars 5 forks

there is no pytorch_model.bin in trained_model #8

Open skedgyedgy opened 2 years ago

skedgyedgy commented 2 years ago

-- this might also just be me messing something up but as i'm following the tutorial i keep getting this error when i run main.py --train

```Traceback (most recent call last):
  File "C:\Users\tsbui\Downloads\Gravital-1.2\Gravital-1.2\main.py", line 56, in <module>
    main()
  File "C:\Users\tsbui\Downloads\Gravital-1.2\Gravital-1.2\main.py", line 34, in main
    ai = ChatAI(togpu=True)
  File "C:\Users\tsbui\Downloads\Gravital-1.2\Gravital-1.2\Bot\ai.py", line 11, in __init__
    raise Exception(
Exception: You need to train the model first. Do this in colab or locally and make sure the finished model is in a folder called "trained_model".```

-- even though training the model was exactly what i was trying to do. i then got another error after creating a "trained_model" folder

```Traceback (most recent call last):
  File "C:\Users\tsbui\Downloads\Gravital-1.2\Gravital-1.2\main.py", line 56, in <module>
    main()
  File "C:\Users\tsbui\Downloads\Gravital-1.2\Gravital-1.2\main.py", line 34, in main
    ai = ChatAI(togpu=True)
  File "C:\Users\tsbui\Downloads\Gravital-1.2\Gravital-1.2\Bot\ai.py", line 13, in __init__
    self.gpt2 = aitextgen(model_folder="trained_model", to_gpu=togpu)
  File "C:\Users\tsbui\AppData\Local\Programs\Python\Python310\lib\site-packages\aitextgen\aitextgen.py", line 170, in __init__
    assert os.path.exists(
AssertionError: There is no pytorch_model.bin in /trained_model.```

-- which at first i thought meant i needed to get a pytorch_model.bin from somewhere, but i thought the whole point of the train function was to create that model?

-- i tried making a blank pytorch_model.bin though i doubt that would have worked either way, and the fact it's now asking me for a config.json file tells me i'm missing something

johnnymcmike commented 2 years ago

Huh, that's weird. Not to be cliche, but I just tested and It Works On My Machine ™️, and I can confirm that it is supposed to generate the trained_model folder and everything inside for you.
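
One quick way to tell whether training actually wrote anything is to check the folder for the two files named in the errors above. This is only a rough sketch: the helper name missing_model_files is made up for illustration, and the file list covers just pytorch_model.bin and config.json because those are the ones this thread mentions. aitextgen may expect more than that.

```python
from pathlib import Path

# Only the files named in the error messages in this thread (an assumption:
# the library may look for others as well).
EXPECTED_FILES = ("pytorch_model.bin", "config.json")


def missing_model_files(folder="trained_model", expected=EXPECTED_FILES):
    """Return the expected files that are absent or empty in `folder`."""
    root = Path(folder)
    missing = []
    for name in expected:
        path = root / name
        # A zero-byte placeholder counts as missing: it holds no weights.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    gone = missing_model_files()
    if gone:
        print("training did not produce:", ", ".join(gone))
    else:
        print("trained_model has the files checked here")
```

If both names show up as missing right after a training run, the training step most likely never ran at all, which fits the errors quoted above.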

What you're seeing are the error messages you'd get if you ran main.py normally without training a model first. That leads me to believe it's not picking up your arguments for whatever reason. This is kinda dumb, but are you running main.py --train or python main.py --train verbatim in your console? Try python main.py --train, because that's what I always do. You could also try commenting out all the ifs and leaving only the code under elif args.train intact. Additionally, this could be a weird shell/CLI issue, and if you're on Windows, training unfortunately probably won't work anyway for unrelated reasons. Library's issue, not mine.
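
To see whether the --train flag is reaching Python at all, something like the following can be saved as a separate script and run both ways (with and without the python prefix). It is a hypothetical stand-in, not the real main.py: only the --train flag and the args.train name come from this thread, and the parser setup and the describe helper are assumptions made for illustration.

```python
import argparse
import sys


def parse_args(argv=None):
    # Hypothetical parser: only the --train flag is known from this thread.
    parser = argparse.ArgumentParser()
    parser.add_argument("--train", action="store_true")
    # parse_known_args ignores unrelated arguments instead of exiting.
    args, _unknown = parser.parse_known_args(argv)
    return args


def describe(argv):
    """Say which branch a given argument list would take."""
    args = parse_args(argv)
    return "train" if args.train else "run bot (needs trained_model)"


if __name__ == "__main__":
    # Show exactly what Python received. On some Windows setups, launching
    # a .py file through its file association can drop the arguments.
    print("sys.argv:", sys.argv)
    print("branch:", describe(sys.argv[1:]))
```

If sys.argv shows only the script name when the script is started without the python prefix, the shell is dropping the flag and the untrained-model error is the expected result.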

If all else fails, you can always train it in Colab, or if you don't want to do that, I'd be willing to train it for you sometime on my 3060.