minimaxir / aitextgen

A robust Python tool for text-based AI training and generation using GPT-2.
https://docs.aitextgen.io
MIT License

RuntimeError when finetuning. #193

Open wisplite opened 2 years ago

wisplite commented 2 years ago

I can get to the progress bar for finetuning, but no matter what, the process hangs with the below error, and the progress bar is stuck at 0.

RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

I am on Windows 10 21H2, using CUDA 11.3.

I am finetuning the GPT-Neo 350M parameter model. I have attempted finetuning with both fp16 and fp32; the same thing happens either way.
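
For reference, the failing case is essentially a plain finetuning script with no __main__ guard. A minimal sketch of that shape (the dataset name, model choice, and training parameters here are assumptions, not the reporter's exact code):

```python
# finetune.py - rough shape of a script that hits this error on Windows
# (dataset name, model size, and training parameters are assumptions)
from aitextgen import aitextgen

# Load the GPT-2 355M weights and move the model to the GPU
ai = aitextgen(tf_gpt2="355M", to_gpu=True)

# On Windows, the DataLoader workers started here use the "spawn" method,
# which re-imports this module at top level; without a __main__ guard the
# script re-runs in each worker and multiprocessing raises the
# "bootstrapping phase" RuntimeError.
ai.train("dataset.txt", batch_size=1, num_steps=8000, fp16=True)
```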

wisplite commented 2 years ago

Interestingly, running this on my Linux machine doesn't produce this error, but that machine doesn't support fp16, so it immediately runs out of memory.

wisplite commented 2 years ago

Attempted again with the GPT-2 355M parameter model; it fails with the same error. It appears to load the model, encode the tokens, and begin training, but before doing anything it loads the model a second time, encodes the tokens again, and then fails as shown below.

RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
  0%|          | 0/8000 [00:00<?, ?it/s]
Nosh-Ware commented 2 years ago

I looked around online after hitting the same issue on Windows with the given example, right after setting it up. It seems you have to run some of this inside if __name__ == '__main__':. That fixed it for me, although for my specific setup I put all of the code inside that block. Something about threads not behaving on Windows?
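
The root cause is process startup rather than threads: Windows has no fork, so the DataLoader worker processes PyTorch starts for training use the spawn method, which re-imports the main module; that is why the log above shows the model loading and token encoding happening twice. Wrapping the training code in the guard keeps the spawned workers from re-running it. A minimal sketch, reusing the same assumed file names and parameters as above:

```python
# finetune.py - the same script, wrapped in the guard that spawn-based
# multiprocessing on Windows requires (names and parameters still assumed)
from aitextgen import aitextgen

def main():
    ai = aitextgen(tf_gpt2="355M", to_gpu=True)
    ai.train("dataset.txt", batch_size=1, num_steps=8000, fp16=True)

if __name__ == "__main__":
    # Only the original process runs main(); spawned DataLoader workers
    # re-import this module but skip this block.
    main()
```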

phubner commented 1 year ago

I had the same issue and used the workaround in #154. Hope this helps.

breadbrowser commented 1 year ago

> I can get to the progress bar for finetuning, but no matter what, the process hangs with the below error, and the progress bar is stuck at 0. [...]

Set num_workers to 0.
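
A sketch of that workaround, assuming this version of aitextgen's train() accepts num_workers and forwards it to the underlying PyTorch DataLoader: with zero workers, data loading stays in the main process, so no child processes are spawned at all.

```python
# Workaround: keep data loading in the main process so no worker
# processes are spawned. Assumes train() accepts num_workers and passes
# it through to torch.utils.data.DataLoader.
from aitextgen import aitextgen

if __name__ == "__main__":
    ai = aitextgen(tf_gpt2="355M", to_gpu=True)
    ai.train("dataset.txt", batch_size=1, num_steps=8000, num_workers=0)
```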