lhotse-speech / lhotse

Tools for handling speech data in machine learning projects.
https://lhotse.readthedocs.io/en/latest/
Apache License 2.0

AudioTransforms are dropped when saving MixedCuts? #1367

Closed m-wiesner closed 2 months ago

m-wiesner commented 3 months ago

Perhaps this isn't the intended usage of MixedCuts, but if you mix cuts that have audio transforms applied and then save the resulting mixed cuts, the transform types (names) are dropped whenever the transforms are stored as transform objects rather than dictionaries.

MonoCuts don't seem to have this problem. I assume the difference is that, for a MixedCut, the transforms are saved in the Cuts stored in its tracks and not in the transform field of the MixedTrack. That makes sense, because each track could have different transforms applied to it, but I think it is causing a problem with serialization.

Specifically, when the object is passed to the dataclasses asdict function, the type of the transform doesn't seem to be stored if the transform is given as an AudioTransform object. Only its fields are stored.
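The behaviour described above can be reproduced with the standard library alone. The following is a minimal sketch, not lhotse code: `Speed` and `Track` are hypothetical stand-ins for an AudioTransform subclass and a cut that carries a transforms list, and `tag_transform` is an illustrative workaround, not necessarily what the actual fix does.

```python
from dataclasses import asdict, dataclass, field, is_dataclass
from typing import Any, Dict, List, Union


@dataclass
class Speed:
    """Hypothetical stand-in for an AudioTransform subclass."""
    factor: float


@dataclass
class Track:
    """Hypothetical stand-in for a cut holding a list of transforms."""
    id: str
    transforms: List[Union[Dict[str, Any], Speed]] = field(default_factory=list)


# The same transform, once as an object and once as a plain dictionary.
as_object = Track(id="cut-1", transforms=[Speed(factor=1.1)])
as_dict = Track(id="cut-1", transforms=[{"name": "Speed", "kwargs": {"factor": 1.1}}])

# asdict() recurses into nested dataclasses and keeps only their fields,
# so the class name of the transform object is not part of the result.
serialized_object = asdict(as_object)
serialized_dict = asdict(as_dict)

print(serialized_object["transforms"])  # only the fields, no type name
print(serialized_dict["transforms"])    # the dictionary survives unchanged


def tag_transform(t: Union[Dict[str, Any], Any]) -> Dict[str, Any]:
    """Illustrative workaround: record the type name before serializing."""
    if is_dataclass(t) and not isinstance(t, type):
        return {"name": type(t).__name__, "kwargs": asdict(t)}
    return t  # already a dictionary, pass it through untouched


tagged = [tag_transform(t) for t in as_object.transforms]
print(tagged)
```

Converting transform objects to tagged dictionaries before calling asdict is one way to keep enough information to rebuild them on load. The dictionary layout used here (`name` plus `kwargs`) is an assumption made for the example.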

pzelasko commented 3 months ago

I forgot that MixedCut also has a transforms field now. We’ll need to add test coverage for this too. Do you have some time to fix it? Otherwise I’ll do it, but probably only after next week.

pzelasko commented 2 months ago

Check #1370 for the fix