Open Marko-Stamenovic-Bose opened 6 years ago
I would be really happy if we could replace the command-line call-outs with proper libraries. Unfortunately, there aren't any replacements that match for quality or functionality, and I'd prefer to not have variable backends for everything.
That said, pyrubberband might get a direct cython implementation soon, which would cut down on most of the issues here. Sox is a different story though.
OK that's fair. Cython pyrubberband sounds pretty exciting! For drc, what is a good way to objectively evaluate the quality of the transformation?
I think this would depend on your eventual application. In most muda applications, the measure of "quality" that we care about is the hold-out accuracy of a model trained on the augmentation outputs, and that's pretty heavily abstracted from the drc process.
It just occurred to me that audiotk might be a good drop-in replacement for Sox. It's got a heavier dependency chain, and I haven't actually used it, but it seems plausible. Anyone feel like taking a crack at reimplementing the DRC class to see if it's worth pursuing?
Sure I'll take a look. I did have some headaches getting audiotk up and running, which is not a promising development, but I'll try to take another crack when I have a chance.
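For anyone experimenting with the DRC idea above, here is a minimal sketch of what an in-memory compressor could look like using only NumPy. It is a memoryless (static) gain curve with no attack/release smoothing, so it is not equivalent to what sox does and is not muda's implementation. The function name `compress_static` and the parameters `threshold_db` and `ratio` are hypothetical, chosen only for illustration.

```python
import numpy as np


def compress_static(y, threshold_db=-20.0, ratio=4.0, eps=1e-10):
    """Hypothetical memoryless compressor: no attack/release smoothing."""
    y = np.asarray(y, dtype=float)
    # Per-sample level in dB relative to full scale; eps avoids log10(0).
    level_db = 20.0 * np.log10(np.maximum(np.abs(y), eps))
    # How far each sample sits above the threshold (0 when below it).
    over_db = np.maximum(level_db - threshold_db, 0.0)
    # Shrink the overshoot to over_db / ratio by applying a negative gain.
    gain_db = -over_db * (1.0 - 1.0 / ratio)
    return y * 10.0 ** (gain_db / 20.0)


# One second of a 440 Hz sine at 0.9 peak amplitude (assumed test signal).
sr = 22050
t = np.arange(sr) / sr
y = 0.9 * np.sin(2 * np.pi * 440.0 * t)
y_drc = compress_static(y)
```

Samples below the threshold pass through unchanged and louder samples are attenuated, all without touching disk. A per-sample gain like this distorts the waveform, so a usable version would need envelope following before it could be compared against the sox output.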
MUDA relies heavily on external command line libraries such as rubberband and sox (lightly wrapped in pyrubberband and pysox) for core deformations such as time-stretch, pitch-shift and drc. These system library wrappers work by writing the transformed signal to disk and then reading it back from disk into memory (presumably to feed an ML algorithm). The external system call, and particularly the additional read-write step, introduces a large overhead in highly distributed/multithreaded out-of-core data pipelines. Would it not make sense to either a) allow an option to do an analogous deformation using an in-memory Python library (for example librosa) or b) replace the external system call altogether with an in-memory transformation?