lucidrains / audiolm-pytorch

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch
MIT License

Dataloader save #222

Closed: LWprogramming closed this 1 year ago

LWprogramming commented 1 year ago

Temporary PR to address https://github.com/lucidrains/audiolm-pytorch/discussions/217#discussioncomment-6611113, although the fast-forward is really slow and will probably need some tweaking (so probably don't merge just yet)
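For context, a minimal sketch of what a naive dataloader fast-forward looks like (the `fast_forward` helper and the toy dataset are hypothetical, not the actual PR code): resuming means re-iterating the loader and discarding batches until the saved step count is reached, which is O(steps) and explains the slowness.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# toy stand-in for the audio dataset
dataset = TensorDataset(torch.randn(1000, 16000))
loader = DataLoader(dataset, batch_size=4, shuffle=True)

def fast_forward(loader, steps_seen):
    # naively skip the first `steps_seen` batches so training resumes
    # where it left off; every skipped batch is still loaded and
    # collated, which is why this gets slow for long runs
    it = iter(loader)
    for _ in range(steps_seen):
        next(it)
    return it

it = fast_forward(loader, steps_seen=100)
batch, = next(it)  # first batch after the skipped ones
```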

lucidrains commented 1 year ago

looks good! would simply add one more boolean (perhaps something like resume_from_last_seen_sample) so one can turn this off too, in case they're already doing shuffled, multi-epoch training
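A sketch of the toggle being suggested (the flag name resume_from_last_seen_sample comes from the comment above; the surrounding trainer skeleton is assumed, not the actual AudioLM trainer code):

```python
import torch

class Trainer:
    # hypothetical trainer skeleton, only to show where the flag fits
    def __init__(self, model, optim, dl):
        self.model, self.optim, self.dl = model, optim, dl
        self.steps = 0
        self.dl_iter = iter(dl)

    def load(self, path, resume_from_last_seen_sample=True):
        pkg = torch.load(path)
        self.model.load_state_dict(pkg['model'])
        self.optim.load_state_dict(pkg['optim'])
        self.steps = pkg['steps']

        if resume_from_last_seen_sample:
            # fast-forward past the batches consumed before the checkpoint
            self.dl_iter = iter(self.dl)
            for _ in range(self.steps % len(self.dl)):
                next(self.dl_iter)
        # else: keep a fresh iterator, which is fine for shuffled,
        # multi-epoch training where exact sample order doesn't matter
```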

LWprogramming commented 1 year ago

> looks good! would simply add one more boolean (perhaps something like resume_from_last_seen_sample) so one can turn this off too, in case they're already doing shuffled, multi-epoch training

Oops, just saw this comment after taking a bit of a different approach. My thought was that, because a sound sample can be cut in many places to create an appropriately-sized input, the effective pool of valid inputs is really large, and it's not certain we ever get through a full epoch consisting of all possible valid inputs given the raw audio data.
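To illustrate why the pool of valid inputs is so large (the helper and the numbers in the comment are illustrative assumptions, not values from the repo):

```python
import torch

def random_crop(wave, crop_len, generator=None):
    # any start offset gives a valid input: a 10 s clip at 24 kHz
    # (240000 samples) with crop_len = 7680 admits ~232k distinct crops,
    # so the space of valid inputs dwarfs the number of raw audio files
    start = torch.randint(0, wave.shape[-1] - crop_len + 1, (1,), generator=generator).item()
    return wave[..., start:start + crop_len]
```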

In that case I think it'd be ok to set a random seed that depends on the number of steps we're saving/loading at, so runs are reproducible while not restarting from the beginning of the data upon load. Thoughts on this approach?
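A minimal sketch of the seeding idea (torch.Generator and RandomSampler are real PyTorch API; the make_loader wiring is assumed): deriving the shuffle seed from the checkpointed step count makes a resumed run deterministic without replaying from step zero.

```python
import torch
from torch.utils.data import DataLoader, RandomSampler, TensorDataset

dataset = TensorDataset(torch.randn(1000, 16000))

def make_loader(steps_at_load, base_seed=42):
    # derive the shuffle seed from the step count stored in the
    # checkpoint: the same (base_seed, steps_at_load) pair always
    # reproduces the same sample order, without replaying from step 0
    g = torch.Generator()
    g.manual_seed(base_seed + steps_at_load)
    sampler = RandomSampler(dataset, generator=g)
    return DataLoader(dataset, batch_size=4, sampler=sampler)

loader = make_loader(steps_at_load=1000)  # e.g. checkpoint saved at step 1000
```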

lucidrains commented 1 year ago

for it to be reproducible, you would have to save the steps at which you reload training as a list somewhere, and set the seed again at each of those moments
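To make that concrete, a hedged sketch of what "save the steps as a list" could look like inside the checkpoint (all field and function names here are made up for illustration):

```python
import torch

def save_checkpoint(path, model, optim, steps, resume_steps):
    # resume_steps accumulates every step at which training was
    # reloaded, e.g. [0, 12000, 31000], so a later rerun can reseed
    # at exactly those moments
    torch.save({
        'model': model.state_dict(),
        'optim': optim.state_dict(),
        'steps': steps,
        'resume_steps': resume_steps,
    }, path)

def replay_seeds(resume_steps, base_seed=42):
    # during a full reproduction run, reseed the sampler whenever the
    # current step crosses one of the recorded reload points
    return [(step, base_seed + step) for step in resume_steps]
```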

lucidrains commented 1 year ago

maybe just forget about reproducibility for now

lucidrains commented 1 year ago

my hot take is, if you need to worry about seeds and reproducibility, your network is so brittle as to be useless to the general public (looking at you, RL)

LWprogramming commented 1 year ago

Yeah, my thought process was mostly to use reproducibility as a debugging tool earlier on, but I guess it's not as crucial now that the network actually trains (albeit with overfitting; I'll follow up in the original discussion)