Closed — lhang33 closed this issue 3 years ago
@lhang33 yeah, it's using vanilla Keras, and Keras's `.fit()` is not really optimized here: it needs roughly twice the amount of data in memory. You are totally right. Using a generator will solve the problem. When I did it, I had a machine with 32GB of memory and I was also using 32GB of swap, so it was okay for me, but I was reaching the limits of my system. There are a lot of ways to improve it, especially if we want to use more speakers (or more data per speaker).
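A minimal sketch of the generator idea discussed in this thread, assuming the training array is a plain (uncompressed) `.npy` file. `kx_train.npy` is the file named in the question; `ky_train.npy` is a hypothetical label file used only for illustration, and the batch size is arbitrary.

```python
import numpy as np
from keras.utils import Sequence  # tensorflow.keras.utils.Sequence on newer setups


class MmapSequence(Sequence):
    """Feed the model batch by batch instead of loading the whole array."""

    def __init__(self, x_path, y_path, batch_size=128):
        # mmap_mode='r' keeps the arrays on disk; pages are read lazily.
        self.x = np.load(x_path, mmap_mode='r')
        self.y = np.load(y_path, mmap_mode='r')
        self.batch_size = batch_size

    def __len__(self):
        # number of batches per epoch
        return int(np.ceil(len(self.x) / float(self.batch_size)))

    def __getitem__(self, idx):
        sl = slice(idx * self.batch_size, (idx + 1) * self.batch_size)
        # np.array(...) copies only this one batch into RAM.
        return np.array(self.x[sl]), np.array(self.y[sl])


# Hypothetical usage:
# train_seq = MmapSequence('kx_train.npy', 'ky_train.npy', batch_size=128)
# model.fit_generator(train_seq, epochs=50)  # or model.fit(train_seq) on tf.keras
```

With this approach only one batch is materialized in RAM at a time, so peak memory no longer scales with the size of `kx_train.npy`.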
I have tested the softmax pre-training phase. I see that the process keeps all of the input data in memory, which takes about 30GB of RAM. I wonder if there is a way to optimize it? For example, splitting the mega file, "kx_train.npy", into pieces and then reading them with a generator? If this problem is solved, we could do pre-training on a much larger dataset (maybe 5000 speakers, etc.) and the performance might still improve.
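A rough sketch of the split-then-stream idea proposed above, under the same assumptions as the previous snippet (plain `.npy` arrays; `ky_train.npy` and the shard naming scheme are hypothetical, and `steps_per_epoch` must be set to match the shard/batch layout):

```python
import numpy as np


def split_npy(in_path, out_prefix, n_chunks):
    """Split one large .npy file into n_chunks smaller shards on disk."""
    arr = np.load(in_path, mmap_mode='r')  # memory-mapped, nothing loaded yet
    for i, idx in enumerate(np.array_split(np.arange(len(arr)), n_chunks)):
        np.save('{}_{:03d}.npy'.format(out_prefix, i), arr[idx[0]:idx[-1] + 1])


def shard_generator(x_prefix, y_prefix, n_chunks, batch_size=128):
    """Yield (x, y) batches while holding only one shard in memory at a time."""
    while True:  # Keras expects the generator to loop forever
        for i in range(n_chunks):
            x = np.load('{}_{:03d}.npy'.format(x_prefix, i))
            y = np.load('{}_{:03d}.npy'.format(y_prefix, i))
            for start in range(0, len(x), batch_size):
                yield x[start:start + batch_size], y[start:start + batch_size]


# Hypothetical usage:
# split_npy('kx_train.npy', 'kx_shard', 10)
# split_npy('ky_train.npy', 'ky_shard', 10)
# model.fit_generator(shard_generator('kx_shard', 'ky_shard', 10),
#                     steps_per_epoch=..., epochs=...)
```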