LOBE is a recording client made specifically for TTS data collections. It supports multiple collections, single and multi-speaker, and can prompt sentences based on phonetic coverage.
We should calculate and display a precise duration per dataset. Currently a naive estimate is used. The duration of each recording can be calculated or estimated as follows:
If the recording has a verification, its duration equals the sum of its trim period durations.
If not, it equals the duration of the recording minus 2 seconds (1 second each for start and stop).
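The two rules above can be sketched in Python as follows. This is a minimal illustration, not LOBE's actual code: the `Recording` class, its `duration` and `trims` fields, and the `effective_duration` function are hypothetical names, and clamping the unverified estimate at zero for very short recordings is an added assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# 1 second for the start plus 1 second for the stop, as described above.
PADDING_SECONDS = 2.0


@dataclass
class Recording:
    """Hypothetical stand-in for a recording; not LOBE's real model."""
    duration: float  # raw length of the audio in seconds
    # (start, end) trim periods in seconds, or None if there is no verification
    trims: Optional[List[Tuple[float, float]]] = None


def effective_duration(rec: Recording) -> float:
    """Return the calculated or estimated duration of one recording."""
    if rec.trims is not None:
        # Verified: sum the lengths of all trim periods.
        return sum(end - start for start, end in rec.trims)
    # Unverified: subtract the padding. The clamp at zero is an assumption
    # so that recordings shorter than 2 seconds do not count as negative.
    return max(0.0, rec.duration - PADDING_SECONDS)
```

For example, an unverified 5-second recording would count as 3 seconds, while a verified recording with trim periods (1.0, 3.0) and (4.0, 6.5) would count as 4.5 seconds.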
To make this efficient to access, the total duration should be kept up to date and stored on the Dataset object.
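One possible way to keep such a running total is sketched below. It assumes a plain in-memory `Dataset` class with hypothetical `set_recording` and `remove_recording` methods; the real Dataset object in LOBE may be a database model with different hooks, so treat this only as an illustration of the bookkeeping.

```python
class Dataset:
    """Hypothetical in-memory stand-in; not LOBE's real Dataset model."""

    def __init__(self):
        self.total_duration = 0.0  # running total in seconds
        self._durations = {}       # recording id -> last counted duration

    def set_recording(self, recording_id, duration):
        """Add a recording, or update it (e.g. after a verification)."""
        previous = self._durations.get(recording_id, 0.0)
        self._durations[recording_id] = duration
        # Apply only the difference so the total never needs a full rescan.
        self.total_duration += duration - previous

    def remove_recording(self, recording_id):
        """Drop a recording and subtract what it contributed."""
        self.total_duration -= self._durations.pop(recording_id, 0.0)
```

With this approach, reading the total is a constant-time attribute lookup. A full recalculation from all recordings could still be run occasionally to correct accumulated floating-point drift.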