same loss when running two experiments simultaneously under examples/libritts/cosyvoice

Hi, this is an amazing project. I am using the flow-matching part of this project for training, but the input features are the outputs of a language model encoder that I trained myself. I modified the data loading process and part of the model structure. The entire code has been validated on my own data and can synthesize speech.

However, I have run into a strange training issue. I am running two experiments simultaneously under examples/libritts/cosyvoice, and the only difference between them is the number of epochs (200 vs 1000). Each experiment has its own run.sh and YAML files, and I have modified the model and TensorBoard storage paths. The training data is in the same data directory, and I submitted the two experiments to different machines for 8-GPU training.

Surprisingly, their losses are exactly the same, down to the last decimal place. Yet when I diffed the models saved at the same epoch, there were differences. Why is this happening?
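To make "exactly the same" precise, the two loss curves could be compared both bit-for-bit and after rounding, since logged values are often printed with only a few decimals. The following is a minimal sketch, not code from this repository: the function name `compare_loss_curves` and the assumption that each run's per-step losses have already been exported to a plain list of floats are hypothetical.

```python
def compare_loss_curves(losses_a, losses_b, decimals=4):
    """Compare two per-step loss sequences (hypothetical helper).

    Reports the first step where the values differ exactly, the first step
    where they differ after rounding to `decimals` places, and the largest
    absolute difference over the steps both runs share.
    """
    n = min(len(losses_a), len(losses_b))  # only compare the shared steps
    first_exact_diff = None
    first_rounded_diff = None
    max_abs_diff = 0.0

    for step in range(n):
        a, b = losses_a[step], losses_b[step]
        max_abs_diff = max(max_abs_diff, abs(a - b))
        if first_exact_diff is None and a != b:
            first_exact_diff = step
        if first_rounded_diff is None and round(a, decimals) != round(b, decimals):
            first_rounded_diff = step

    return {
        "compared_steps": n,
        "first_exact_diff": first_exact_diff,      # None means bit-identical
        "first_rounded_diff": first_rounded_diff,  # None means equal as logged
        "max_abs_diff": max_abs_diff,
    }


if __name__ == "__main__":
    # Made-up numbers purely for illustration, not real training losses.
    run_200 = [1.0, 0.50001, 0.25]
    run_1000 = [1.0, 0.50002, 0.25]
    print(compare_loss_curves(run_200, run_1000, decimals=4))
```

If `first_exact_diff` is `None`, the two runs really did produce identical loss values on the shared steps. If only `first_rounded_diff` is `None`, the curves merely look identical at the logged precision.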

Background:

Both experiments read the same data from the same disk location. The data includes npy files, which are loaded with numpy.load in the processor.

Thanks.

FunAudioLLM / CosyVoice

same loss when running two experiments simultaneously under examples/libritts/cosyvoice #662