Closed asusdisciple closed 10 months ago
Hmm, I am not sure why that might be the case. If you trigger a keyboard interrupt while it is frozen, what does the stack trace show about where it is getting stuck? I don't see anything that would cause such strange freezing behaviour. Does it only happen with a batch size of 12, or with any batch size other than 16?
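If interrupting by hand is awkward (e.g. the run is remote), the standard library's `faulthandler` can dump every thread's stack automatically once the process has been hung for a while. A minimal sketch, not specific to this repo, that you could drop near the top of `train.py`:

```python
import faulthandler
import sys
import time

# Arm a watchdog: if the process is still running (i.e. hung) 60 s from now,
# dump every thread's stack to stderr without killing the process.
faulthandler.dump_traceback_later(60, exit=False, file=sys.stderr)

# ... the training loop would run here; a hang anywhere in it
# gets its stack printed after the timeout ...
time.sleep(0.1)  # placeholder for the actual work

# Disarm once the run finishes normally, so no spurious dump is printed.
faulthandler.cancel_dump_traceback_later()
```

The 60-second timeout is arbitrary; pick something longer than a normal iteration so you only get a dump when it is genuinely stuck.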
Somehow it works now; it seemed to be something temporary with the repo.
I encountered a strange bug, or rather strange behaviour, which I cannot really pinpoint to an exact cause. I used the standard training, as you described, and it worked fine. However, when I changed the `batch_size` parameter to 12 in `config_v1_wavlm.json`, `train.py` only ran up to line 136 (`for i, batch in pb:`). It's not a memory issue, as I still have more than 12 GB free on my GPU, but for some reason the script seems to skip the for loop when you increase the batch size in the JSON file.
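One possible explanation worth checking (an assumption on my side, not something confirmed from this repo): if `train.py` builds its batches with `drop_last=True` semantics, as PyTorch's `DataLoader` does by default in many training scripts, then a batch size larger than the number of remaining samples yields zero full batches, and the body of `for i, batch in pb:` never runs at all. A pure-Python sketch of that batching rule:

```python
# Mimic DataLoader-style batching: with drop_last=True, only full
# batches are yielded, so a too-large batch size produces no batches.
def batches(dataset, batch_size, drop_last=True):
    n_full = len(dataset) // batch_size
    for i in range(n_full):
        yield dataset[i * batch_size:(i + 1) * batch_size]
    if not drop_last and len(dataset) % batch_size:
        yield dataset[n_full * batch_size:]

data = list(range(10))  # e.g. a small (sub)set of 10 training files

print(len(list(batches(data, 16))))  # -> 0: loop body never executes
print(len(list(batches(data, 12))))  # -> 0: same, batch never fills
print(len(list(batches(data, 8))))   # -> 1: one full batch of 8
```

If the dataset (or a filtered subset of it) is smaller than the batch size, the symptom would look exactly like the script "skipping" the loop; comparing `len(dataset)` against the configured `batch_size` should rule this in or out.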