flyflyly opened this issue 1 month ago
Hi. Please provide the training log, your results, and your reproduction configuration so that we can help you locate the issue.
Thanks for your reply. Initially, I trained on a single 3090 GPU for 1.5 days and only reached 600k steps, so the training time was probably too short. I have since found a server with two 3090 GPUs and am retraining, but the training speed still seems a bit slow. Could you share the hardware configuration you used to train the model? I have also attached my first training log. train-1.tar.gz
I trained the model exactly according to your configuration parameters. This is my GPU usage during training. What can I do to speed up the training process?
I trained for almost 7 days on a single 3090 GPU but only reached 1.3M steps, far short of the 10M steps mentioned in the paper. Could you share your hardware configuration and the approximate training time? I kept the basic parameter settings from the code you provided. Did you use any other acceleration methods? Looking forward to your reply. Thank you!
Hello! Could you share your experimental hardware configuration? What kind of GPU do you use, and how many? How many hours did you train? I would also greatly appreciate it if you could provide the training logs. Thank you!
Hi. Our method relies more on CPU computation than on the GPU, since each process performs its own environment rendering, so it is best run on a server with more than 128 CPU threads. To accelerate training, you can increase the number of processes in the config (e.g., our baseline NIE sets it to 80). Hope this helps.
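For context: at the rate you report (~1.3M steps in 7 days, about 186k steps/day), reaching 10M steps would take roughly 54 days on one machine, so the parallel process count is the main lever. As a rough sketch of how to pick it (illustrative only; `NUM_PROCESSES` here is a placeholder for whatever key our config actually uses):

```python
import os

# Rough sketch, not the repo's exact config API: each process renders its own
# environment on the CPU, so the process count should not exceed the number
# of available CPU threads.
available_threads = os.cpu_count() or 1

# Our NIE baseline uses 80 parallel processes on a >128-thread server; on a
# 16-32 thread desktop, cap the count at the available threads instead.
num_processes = min(80, available_threads)
print(f"CPU threads: {available_threads} -> suggested NUM_PROCESSES: {num_processes}")
```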
It seems the model is converging, just slowly. You can try warming up the training on 1-room scenes only (as our paper suggests). You can also try training our baseline PPO+intent first to check that everything works, since training CaMP can be more complicated.
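If you want to script that warm-up, the idea is just to restrict the early episodes to single-room scenes. A generic illustration (not this repo's actual dataset API; the `num_rooms` field and episode schema are assumptions you would adapt):

```python
# Generic illustration only; the real episode schema may differ.
# Warm-up idea: train on 1-room scenes first, then widen to the full set.
episodes = [
    {"scene": "apt_0", "num_rooms": 1},
    {"scene": "apt_1", "num_rooms": 3},
    {"scene": "apt_2", "num_rooms": 1},
]

def one_room_only(episodes):
    """Keep only episodes whose scene has a single room."""
    return [ep for ep in episodes if ep["num_rooms"] == 1]

warmup_set = one_room_only(episodes)  # use these for the warm-up phase
print(warmup_set)
```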
Thanks for your reply! Could you briefly describe the hardware you used, such as the CPU and GPU models and the amount of RAM? I will try training PPO+intent to check if everything works.
I tried to train with your code, but the model I trained couldn't get close to the results in the paper. Could you provide the pretrained model files?