Open DZ9 opened 2 weeks ago
Thanks for your interest in our work. I have not tested whether the existing code works across multiple machines, and the current distributed training has only been tested with 8 GPUs.
For setting up the WebShop server on a different machine, it should suffice to start the server there and use SSH port forwarding to tunnel the relevant ports (e.g. 3000) between the two machines. For speeding up WebShop, you may find #6 helpful.
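As a rough sketch of the port-forwarding setup (the port 3000 comes from the comment above; the hostname `user@webshop-host` is a placeholder for whatever machine actually runs the WebShop server):

```shell
# Forward local port 3000 on the training machine to port 3000
# on the machine running the WebShop server, so the training code
# can reach the server at http://localhost:3000.
# -N: do not run a remote command, just keep the tunnel open.
# -L local_port:target_host:target_port
ssh -N -L 3000:localhost:3000 user@webshop-host
```

With the tunnel running, requests to `localhost:3000` on the training machine are transparently forwarded to the remote WebShop server.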
This speedup method really helps, thanks a lot! May I ask how many GPU hours it took to train on WebShop with GPT-2 or Mistral-7B in your previous setting?
I don't remember exactly, but it took around 3 days on 4xA5000 GPUs to train on WebShop with GPT-2. This was without the speedup method, so I imagine it would be faster with the speedup.
Thanks for the great work! I found that this repo supports distributed training via accelerate, but for the WebShop task it seems tricky to train in a distributed way because of the extra server installation.
Here are my questions: