Closed rgxb2807 closed 8 months ago
@rgxb2807 added it quickly for you this morning - happen to be working on video trainer code
are you using the latest version with residual LFQ btw? i'm curious how well that is working!
You're incredible, thank you!
I just kicked off local training of the coarse transformer using Meta's EnCodec. So far it's working well.
Forgive me if I'm missing something super obvious. I'm using the full LibriSpeech corpus - it is already split into training and validation sets, and I've combined all of the data into train and test data directories. Is there a way to specify how the data split occurs, or to preprocess the dataset so that a training and validation set are specified? It appears that only a random split is possible with `SemanticTransformer`, `CoarseTransformer` and `FineTransformer`.

When training `SoundStream`, you can specify training and validation sets by passing training and test dataloaders via the `get_dataloader` function (after first instantiating training and validation `SoundDataset` instances). `SemanticTransformer`, `CoarseTransformer` and `FineTransformer` don't allow separate dataloaders to be passed.

For example, in `CoarseTransformer`:
https://github.com/lucidrains/audiolm-pytorch/blob/1b4d80f93a2cc0c9dd4797959afa613aac9d029b/audiolm_pytorch/trainer.py#L933C1-L953C1

And here's the example from `SoundStream`:
https://github.com/lucidrains/audiolm-pytorch/blob/1b4d80f93a2cc0c9dd4797959afa613aac9d029b/audiolm_pytorch/trainer.py#L237C1-L244C42

Happy to raise a PR that copies similar logic from `SoundStreamTrainer` into the other three trainer classes.
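For context, the difference between the two behaviors can be sketched in plain PyTorch: an internal random split (what the transformer trainers do) versus accepting two pre-built datasets (what `SoundStreamTrainer` allows). The dataset class and variable names below are illustrative placeholders, not the actual audiolm-pytorch API:

```python
import torch
from torch.utils.data import DataLoader, Dataset, random_split

# Hypothetical stand-in for an audio dataset such as SoundDataset.
class ToyAudioDataset(Dataset):
    def __init__(self, num_samples):
        self.num_samples = num_samples

    def __len__(self):
        return self.num_samples

    def __getitem__(self, idx):
        return torch.randn(16)  # fake waveform chunk

# Behavior 1: a single dataset randomly split inside the trainer.
# The caller has no control over which files land in validation.
full_ds = ToyAudioDataset(100)
train_ds, valid_ds = random_split(full_ds, [90, 10])

# Behavior 2: two pre-specified datasets, e.g. built from the separate
# LibriSpeech train and dev directories, each wrapped in its own dataloader.
train_ds_fixed = ToyAudioDataset(90)
valid_ds_fixed = ToyAudioDataset(10)
train_dl = DataLoader(train_ds_fixed, batch_size=4, shuffle=True)
valid_dl = DataLoader(valid_ds_fixed, batch_size=4)
```

The proposed PR would essentially let the three transformer trainers accept the second form, as `SoundStreamTrainer` already does.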