lhotse-speech / lhotse

Tools for handling speech data in machine learning projects.
https://lhotse.readthedocs.io/en/latest/
Apache License 2.0

How to split manifests into several parts #1334

Open OswaldoBornemann opened 4 months ago

OswaldoBornemann commented 4 months ago

How can I split manifests into several parts? I noticed that a cut file can be split using split. Do the other manifest types have a similar operation?

pzelasko commented 4 months ago

There are two methods, split and split_lazy, defined on each manifest type: https://github.com/lhotse-speech/lhotse/blob/4f014b13202c724d484e0471343053a261487b8a/lhotse/cut/set.py#L821-L882

Both are also accessible from the CLI: https://github.com/lhotse-speech/lhotse/blob/4f014b13202c724d484e0471343053a261487b8a/lhotse/bin/modes/manipulation.py#L130-L215
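
To illustrate what splitting a manifest into several parts amounts to, here is a minimal sketch using only the Python standard library. It is not Lhotse's implementation: the helper name split_evenly, the toy records, and the choice to give the leftover items to the first parts are all assumptions made for illustration. For real manifests, the split and split_lazy methods linked above are the ones to use.

```python
import json
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_evenly(items: Sequence[T], num_splits: int) -> List[List[T]]:
    """Hypothetical helper: cut `items` into `num_splits` contiguous parts.

    Part sizes differ by at most one; any remainder goes to the first parts.
    """
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    if num_splits > len(items):
        raise ValueError("cannot create more parts than there are items")
    base, extra = divmod(len(items), num_splits)
    parts: List[List[T]] = []
    start = 0
    for i in range(num_splits):
        # The first `extra` parts each take one additional item.
        size = base + (1 if i < extra else 0)
        parts.append(list(items[start:start + size]))
        start += size
    return parts


# Toy manifest (assumed shape): one JSON object per line, as in a JSONL file.
manifest_lines = [
    json.dumps({"id": f"rec-{i}", "duration": 1.0 + i}) for i in range(10)
]
records = [json.loads(line) for line in manifest_lines]

parts = split_evenly(records, num_splits=3)
part_sizes = [len(p) for p in parts]
```

Each part keeps the original order of the records, so concatenating the parts gives back the full manifest.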