Open BhashaBluff opened 10 months ago

Hi, thanks for open-sourcing this amazing work. Is there any parameter to parallelize the model so it can run on smaller GPUs? I was not able to find one in the config. As suggested in the readme, "we should turn on model parallel to train on smaller GPUs". Is there a config parameter for it? I'm not able to find one.
hi there,
thanks for the question.
1/ What's your GPU setting? For model parallelization, you would need multiple GPUs in a single node.
2/ Were you able to run inference? If so, does the result look good? Inference requires fewer computational resources, and it basically already implements model parallelism.
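For reference, a minimal sketch (not the LTU code itself, and the checkpoint name is just a placeholder) of how Hugging Face-style model parallelism typically shards a causal LM across the GPUs in one node for inference:

```python
# Minimal sketch, not the actual LTU loader: shard a causal LM across all
# visible GPUs in one node using device_map="auto".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "decapoda-research/llama-7b-hf"  # placeholder checkpoint, not the LTU weights
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",  # splits layers across the available GPUs automatically
)
print(model.hf_device_map)  # shows which device each block was placed on
```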
I will add the model parallel training script soon.
-Yuan
Hey, thanks for the prompt response. 1.) I was finetuning the model on a V100 station with 4 GPUs of 32 GB each. I was trying to finetune with device_map = "auto" on line 126 of finetune.py in ltu_as, but it gave a "NotImplementedError: Cannot copy out of meta tensor; no data!" I then commented out the device_map line and started fine-tuning, but it gave an OOM error.
Done, please see LTU and LTU-AS. Your resources should be enough to train the model; remember to tune the micro_batch_size to the max number that your GPUs can run.
Regarding the performance: for LTU, it should be exactly the same as what we described in the paper; for LTU-AS, there might be a mismatch between the training and inference GPUs. Also, the model only takes input at a 16 kHz sampling rate and a 10-second audio length. You can check whether your local inference result is similar to our online demo.
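A minimal preprocessing sketch that enforces the 16 kHz / 10-second input format (the helper name and the use of torchaudio are assumptions, not the repo's own loader):

```python
# Sketch only, not the repo's loader: resample to 16 kHz and pad/trim to 10 s.
import torch
import torchaudio

def load_10s_16khz(path):
    wav, sr = torchaudio.load(path)         # wav: (channels, samples)
    wav = wav.mean(dim=0, keepdim=True)     # downmix to mono
    if sr != 16000:
        wav = torchaudio.functional.resample(wav, sr, 16000)
    target = 16000 * 10                     # 10 seconds at 16 kHz
    if wav.shape[1] < target:
        wav = torch.nn.functional.pad(wav, (0, target - wav.shape[1]))
    return wav[:, :target]
```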
-Yuan
Hi, thanks a lot.
You are welcome, please let me know if there's any issue. Remember to set micro_batch_size larger; it can be something like 16/32 or even larger.
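To illustrate why a larger micro_batch_size helps, here is a small sketch of the usual relationship in alpaca-lora-style trainers (the variable names and the global batch size of 128 are assumptions, not values from the LTU configs): the micro batch is what must fit on each GPU, and gradient accumulation keeps the effective batch size fixed while micro_batch_size grows.

```python
# Sketch only; names and the global batch size are assumed, not from the LTU configs.
batch_size = 128            # effective (global) batch size the run should see
micro_batch_size = 32       # raise this to the largest value that avoids OOM
gradient_accumulation_steps = batch_size // micro_batch_size
print(gradient_accumulation_steps)  # 4: larger micro batches mean fewer accumulation steps
```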