Closed by pranauv1 1 month ago
I have the same problem; it sometimes gets stuck.
I saw this in my output:
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
[2024-09-10 21:43:44] [INFO] To disable this warning, you can either:
[2024-09-10 21:43:44] [INFO] - Avoid using `tokenizers` before the fork if possible
[2024-09-10 21:43:44] [INFO] - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
It got stuck for me as well. I fixed it by setting the environment variable when launching the app:
TOKENIZERS_PARALLELISM=false python app.py
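If you can't change the launch command (for example, the app is started by a launcher script), the same variable can most likely be set from Python instead. This is only a rough sketch: the helper name is made up, and it assumes the code runs at the very top of app.py, before anything imports or uses `tokenizers`/`transformers`. I have not checked where the best spot for it would be in Flux Gym itself.

```python
import os


def disable_tokenizers_parallelism() -> str:
    """Hypothetical helper: turn off tokenizers parallelism for this process.

    Child processes forked or spawned later inherit the environment,
    so they see the same setting.
    """
    os.environ["TOKENIZERS_PARALLELISM"] = "false"
    return os.environ["TOKENIZERS_PARALLELISM"]


# Assumption: this runs before the tokenizers library is first used.
disable_tokenizers_parallelism()
```

Setting it on the command line as above is simpler when you control how the script is started; the in-code version is just a fallback.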
I prepared the dataset through Flux Gym and then ran the Kohya script from the command line. It showed the same warnings mentioned above, but the training finished successfully.
I got stuck at the 1st epoch; it seems like a deadlock. Any ideas?
Logs: