Closed — FJRFrancio closed this issue 2 years ago
Hi, I just ran the same test with the latest main branch code, and everything looks fine. I think the problem could be resolved simply by pulling the latest code.
When I execute python setup.py install, this error occurs.
Hi, could you please share more of the installation output, so that we can locate the problem?
I tried the newest code; it works well with the default config files. But something goes wrong when I change SEQ_LEN (in gpt2_2d.py) from 1024 to 2048. The error seems to happen at the all-reduce step.
I am running with Docker; the Dockerfile is the same as this Dockerfile.
The code does not work properly when SEQ_LEN > 1024.
The model has a default parameter max_position_embeddings=1024 and I forgot to change it. Now everything works fine.
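To illustrate the mismatch described above, here is a minimal sketch of a config that keeps the position-embedding table in sync with the sequence length. The names SEQ_LEN and max_position_embeddings come from the thread; the dict layout is only an assumption for illustration, not the actual example config.

```python
# Assumed sketch of the gpt2_2d.py-style config; only SEQ_LEN and
# max_position_embeddings are taken from the thread, the rest is illustrative.
SEQ_LEN = 2048

# If max_position_embeddings stays at its default of 1024 while SEQ_LEN is
# 2048, positions 1024..2047 index past the embedding table, and the failure
# can surface later as an opaque error (e.g. at an all-reduce step).
model_cfg = dict(
    seq_len=SEQ_LEN,
    max_position_embeddings=SEQ_LEN,  # keep in sync with SEQ_LEN
)

# Cheap sanity check before launching a distributed run.
assert model_cfg["max_position_embeddings"] >= model_cfg["seq_len"]
```

Checking this invariant up front fails fast on one process instead of crashing mid-training across all ranks.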
Dear developers,
I am trying to run the gpt2_3d example, but it fails. It looks like the model didn't load the correct batch size. I hope to get some advice.
Thanks.
Error
Command
Environment
Error details