Open kevin-xuan opened 2 years ago
Hi, as far as I remember, I could train MT10 in 2 to 3 days. MT50 takes a bit longer, around a week to get the full result. For the multi-task part, we use multiprocessing, with each task running in its own process, and the batch size is used for updating, not for sampling (if I remember correctly).
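To make the "batch size is for updating, not sampling" point concrete, here is a minimal sketch of how one update batch of 1280 rows could be assembled from per-task replay data. This is not the repository's code: the per-task buffers, the equal split across tasks, `STATE_DIM = 12`, and the helper name `sample_update_batch` are all assumptions made for illustration.

```python
import numpy as np

NUM_TASKS = 10      # MT10
STATE_DIM = 12      # hypothetical state size, for illustration only
BATCH_SIZE = 1280   # update batch size mentioned in the thread

rng = np.random.default_rng(0)

# Hypothetical per-task replay buffers, filled by the per-task worker processes.
buffers = [rng.normal(size=(5000, STATE_DIM)) for _ in range(NUM_TASKS)]


def sample_update_batch(buffers, batch_size, rng):
    """Draw an equal share of states from each task and append a one-hot task ID."""
    num_tasks = len(buffers)
    per_task = batch_size // num_tasks
    rows = []
    for task_id, buf in enumerate(buffers):
        idx = rng.integers(0, len(buf), size=per_task)
        one_hot = np.zeros((per_task, num_tasks))
        one_hot[:, task_id] = 1.0
        rows.append(np.concatenate([buf[idx], one_hot], axis=1))
    return np.concatenate(rows, axis=0)


batch = sample_update_batch(buffers, BATCH_SIZE, rng)
print(batch.shape)  # (1280, 22)
```

Under these assumptions, each worker acts on one state at a time while collecting data, but the gradient step sees all 1280 (state, task ID) rows at once.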
Thanks for your reply! Could you tell me about the machine you used in the experiments? We use an Intel Xeon 4210R CPU and an RTX 3090 GPU, and most of the time is spent on the CPU rather than the GPU. Could the reason be that my CPU is too slow?
I also encountered this issue. May I ask how you resolved it?
Hi, I'm interested in your work and appreciate that you shared the source code. I have some questions. First, when I run the MT10-Conditioned task, each epoch takes about 200 s on average, which means roughly 18 days for all 7500 epochs. I also ran the MT50-Fixed task, where each epoch takes about 2500 s on average. You use multiprocessing, and even though the policy network and Q-function network are placed on the GPU, they only use about 1.5 GB of GPU memory. Is this training speed normal? Second, you use multiprocessing to collect data and perform multi-task learning. What does the multi-task training process look like? Each time a state vector and a one-hot task ID vector are fed into the policy network, the batch size is effectively 1, yet you define the batch size as 1280. What are the specific training details? I look forward to your reply, thanks!
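For reference, the wall-clock estimates follow directly from the per-epoch timings quoted above. The snippet below is only that arithmetic; applying the same 7500 epochs to MT50 is an assumption, since the post states the epoch count only for MT10.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def training_days(seconds_per_epoch: float, epochs: int) -> float:
    """Convert a per-epoch time and an epoch count into total days."""
    return seconds_per_epoch * epochs / SECONDS_PER_DAY


mt10_days = training_days(200, 7500)   # MT10-Conditioned timing from the post
mt50_days = training_days(2500, 7500)  # MT50-Fixed, assuming the same 7500 epochs

print(f"MT10: {mt10_days:.1f} days")   # MT10: 17.4 days
print(f"MT50: {mt50_days:.1f} days")   # MT50: 217.0 days
```

Both figures are far above the 2 to 3 days (MT10) and about one week (MT50) reported by the author.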