CouncilDataProject / speakerbox

Speakerbox: Fine-tune Audio Transformers for speaker identification.
https://councildataproject.org/speakerbox
MIT License

Support different audio / directory structures #18

Open evamaxfield opened 1 year ago

evamaxfield commented 1 year ago

Feature Description

Support a more direct directory structure organized by speaker instead of by "conversation", e.g.:

data/
├── bob/
│   ├── 0.wav
│   ├── 1.wav
│   ├── 2.wav
│   ├── 3.wav
│   └── 4.wav
├── sally/
│   ├── 5.wav
│   ├── 6.wav
│   ├── 7.wav
│   ├── 8.wav
│   └── 9.wav
└── eva/
    ├── 10.wav
    ├── 11.wav
    ├── 12.wav
    ├── 13.wav
    └── 14.wav

Here, all the audio for each speaker is provided in a directory named after that speaker. This would involve creating new functions for preparing the dataset, with no guarantee that the "conversation id" holdout condition is met.
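As a rough illustration of what such a preparation function might look like, here is a minimal sketch using only the Python standard library. It is not part of speakerbox: the names `collect_speaker_audio` and `split_per_speaker`, the `label` / `audio` record keys, and the default 20% test fraction are all assumptions made for this example. The split is done per speaker, so it makes no attempt to satisfy the "conversation id" holdout condition.

```python
import random
from collections import defaultdict
from pathlib import Path
from typing import Dict, List, Tuple

Record = Dict[str, str]


def collect_speaker_audio(data_dir: str, suffix: str = ".wav") -> List[Record]:
    """Hypothetical helper: one record per audio file under data_dir/<speaker>/."""
    records: List[Record] = []
    for speaker_dir in sorted(Path(data_dir).iterdir()):
        # Ignore stray files at the top level; only directories name speakers.
        if not speaker_dir.is_dir():
            continue
        for audio_path in sorted(speaker_dir.glob(f"*{suffix}")):
            records.append({"label": speaker_dir.name, "audio": str(audio_path)})
    return records


def split_per_speaker(
    records: List[Record], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[Record], List[Record]]:
    """Hypothetical helper: shuffle each speaker's clips and hold some out.

    No conversation-level holdout is enforced, because this directory
    layout carries no conversation information.
    """
    rng = random.Random(seed)
    by_speaker: Dict[str, List[Record]] = defaultdict(list)
    for record in records:
        by_speaker[record["label"]].append(record)

    train: List[Record] = []
    test: List[Record] = []
    for label in sorted(by_speaker):
        clips = list(by_speaker[label])
        rng.shuffle(clips)
        # Hold out at least one clip when a speaker has more than one.
        n_test = max(1, int(len(clips) * test_fraction)) if len(clips) > 1 else 0
        test.extend(clips[:n_test])
        train.extend(clips[n_test:])
    return train, test
```

With the `data/` layout above, `collect_speaker_audio("data")` would yield 15 records (five per speaker), and `split_per_speaker` would hold out one clip per speaker for testing.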

Use Case

See #17 -- direct use case already done.

Solution

Alternatives