minimaxir / aitextgen

A robust Python tool for text-based AI training and generation using GPT-2.
https://docs.aitextgen.io
MIT License
1.84k stars 220 forks

How do I use a Radeon GPU with AiTextGen? #169

Open btarg opened 2 years ago

btarg commented 2 years ago

Hi there, I just discovered this library after a great experience using GPT-2-Simple and downloading trained models from Colab to generate text locally. After running out of free Colab usage, I want to try training my AI locally on my RX 5700, since CPU training is extremely slow on my Ryzen 5 3600X. When I run the basic example code I get CUDA-related errors, so it looks like the library can't tell I'm using a Radeon GPU when to_gpu=True is set.

How do I change the following code to use my Radeon GPU?

train.py

```python
from aitextgen import aitextgen

ai = aitextgen(tf_gpt2="124M", to_gpu=True)
file_name = "../out_discord.txt"

ai.train(file_name,
         line_by_line=False,
         from_cache=False,
         num_steps=3000,
         generate_every=1000,
         save_every=1000,
         save_gdrive=False,
         learning_rate=1e-3,
         fp16=False,
         batch_size=1,
         )
```

generate.py

```python
from aitextgen import aitextgen

# Load GPT2 locally (download from Google Drive)
ai = aitextgen(model_folder="trained_model", to_gpu=True)

generated = ai.generate(n=5,
                        batch_size=1,
                        max_length=64,
                        temperature=1.0,
                        top_p=0.9,
                        return_as_list=True
                        )

print(generated[0])
```

Using Python 3.9.4

btarg commented 2 years ago

If AMD is not currently supported on Windows, that support is definitely needed! Also, is it normal for the AI to generate only complete gibberish when training locally on CPU?

vlrkn commented 2 years ago

I'd be very interested in an answer as well!

humphreygaming commented 2 years ago

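You could probably try a ROCm-enabled build of the underlying framework. ROCm is AMD's counterpart to CUDA, and the standard GPU builds of TensorFlow and PyTorch only work on NVIDIA GPUs, I believe. (aitextgen runs on PyTorch, so the PyTorch build is the one that matters.) To my knowledge, CUDA is NVIDIA's GPU computing platform, and only NVIDIA hardware can use it. AMD can use OpenCL, but it's not the same.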

I'm not sure this is how it works, but you could try installing ROCm and going from there. If CUDA-related errors still appear, that's down to the hardware. In that case, there's not much you can do other than getting an NVIDIA card or using one through Google's Colab notebooks.
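
Before passing to_gpu=True, it may help to check what the installed PyTorch build can actually see. The snippet below is only a rough sketch, assuming PyTorch is the framework in use; `describe_torch_backend` is a made-up helper name, not part of aitextgen. ROCm builds of PyTorch expose the GPU through the same `torch.cuda` namespace as CUDA builds, so one availability check covers both cases.

```python
def describe_torch_backend():
    """Report which GPU backend, if any, the installed PyTorch build can use.

    Hypothetical helper for illustration; not part of aitextgen.
    """
    info = {"torch_installed": False, "backend": "none", "gpu_available": False}
    try:
        import torch
    except ImportError:
        return info  # PyTorch is not installed in this environment

    info["torch_installed"] = True
    # ROCm builds set torch.version.hip; CUDA builds set torch.version.cuda.
    if getattr(torch.version, "hip", None):
        info["backend"] = "rocm"
    elif getattr(torch.version, "cuda", None):
        info["backend"] = "cuda"
    else:
        info["backend"] = "cpu"
    # ROCm builds reuse the torch.cuda namespace, so this call covers both.
    info["gpu_available"] = bool(torch.cuda.is_available())
    return info


if __name__ == "__main__":
    info = describe_torch_backend()
    print(info)
    # Only ask for the GPU when PyTorch can actually see one.
    to_gpu = info["gpu_available"]
    print("to_gpu =", to_gpu)
```

If `gpu_available` comes back False, passing `to_gpu=info["gpu_available"]` instead of a hard-coded True should at least let the script fall back to CPU rather than fail with a CUDA error.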

breadbrowser commented 2 years ago

Use Kaggle. Google Colab is trash, and Colab is basically Kaggle but paid. Kaggle has no paid plan and no ads. It also has better GPUs, CPUs, and TPUs.