Open wpm opened 4 years ago
Can you run the same script with CUDA_LAUNCH_BLOCKING=1 set at the start? I think that will give more information.
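If prefixing the shell command is inconvenient, the variable can also be set from Python. This is a minimal sketch, assuming it is placed at the very top of the training script, before torch is imported, so that it takes effect before CUDA is initialized:

```python
import os

# Make CUDA kernel launches synchronous so an error is reported at the call
# that caused it. Set this before CUDA is initialized; the simplest way is to
# do it before importing torch.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# import torch  # import torch and start training only after this point
```

Synchronous launches slow training down, so this setting is best kept for debugging runs.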
Anyone getting this error: see if you can run on a single GPU. In my case, I can run on a single GPU, but I get this same error when I use multiple GPUs, where the model is wrapped in DataParallel.
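One way to try the single-GPU check on a multi-GPU machine is to hide the other devices from the process. This is a sketch under the assumption that the training script only wraps the model in DataParallel when it sees more than one device; the device index "0" is just an example:

```python
import os

# Expose only the first GPU to this process. This must run before CUDA is
# initialized, so put it before importing torch.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# import torch
# torch.cuda.device_count() should now report a single device on a GPU machine.
```

The same effect can be had by setting the variable in the shell before launching the script.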
I encounter the same issue on a machine with more than one GPU.
Downgrade PyTorch to 1.4.0, or modify the source code so that forward does not call self.parameters(): save next(self.parameters()).dtype in __init__ and use the saved dtype in forward.
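The second workaround can be sketched as follows. This is an illustrative toy module, not the LayoutLM source: the class name TinyEncoder and the attribute param_dtype are hypothetical, and the forward pass is simplified to show only where the cached dtype is used.

```python
import torch
from torch import nn


class TinyEncoder(nn.Module):
    """Toy module (hypothetical) showing the cached-dtype workaround."""

    def __init__(self, hidden_size=8):
        super().__init__()
        self.linear = nn.Linear(hidden_size, hidden_size)
        # Look up the parameter dtype once, at construction time, instead of
        # calling next(self.parameters()) inside forward.
        self.param_dtype = next(self.parameters()).dtype

    def forward(self, x, attention_mask):
        # Use the saved dtype; no call to self.parameters() happens here.
        mask = attention_mask.to(dtype=self.param_dtype)
        return self.linear(x) * mask.unsqueeze(-1)


model = TinyEncoder()
out = model(torch.ones(2, 3, 8), torch.ones(2, 3))
```

One caveat with this sketch: the cached value goes stale if the model is later converted, for example with model.half(). Reading the dtype from an input tensor or from a specific weight such as self.linear.weight is an alternative that avoids that.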
Describe the bug I am trying to train the LayoutLM sequence labeling model as described in the LayoutLM README. Training fails with a StopIteration exception.
The problem arises when using:
To Reproduce I set up my environment like so.
I preprocessed the example FUNSD data as described in the README then ran the following command.
I see the following error very soon after starting training.
Expected behavior I expect to train a model and have it created in the FUNSD.layoutlm.model directory. I am able to do this using the same setup on a different machine without a GPU.