felixbur / nkululeko

Machine learning speaker characteristics
MIT License

Adding audio_path to DATA section #62

Closed bagustris closed 10 months ago

bagustris commented 10 months ago

Currently, the filename in the database (CSV) must contain a full path instead of a basename only. In most cases, the provider of the dataset only provides a file with a list of the basenames (for platform independence). So, I would like to request adding audio_path to the DATA section in the INI file.

This can be optional: if the option is not given, Nkululeko looks up the file path in the given CSV file (the current behavior).

Example usage (see train.audio_path and dev.audio_path):

[DATA]
databases = ['train', 'test', 'dev']
train = ./data/ravdess/ravdess_train.csv
train.type = csv
train.absolute_path = False
train.split_strategy = train
train.audio_path = ./data/ravdess/ravdess_speech
dev = ./data/ravdess/ravdess_dev.csv
dev.type = csv
dev.absolute_path = False
dev.split_strategy = train
dev.audio_path = ./data/ravdess/ravdess_speech

One important note: Nkululeko should be able to find audio files inside subdirectories of the given audio_path, since database creators sometimes split their audio files into subdirectories instead of keeping them in a single directory.
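To make the subdirectory requirement concrete, here is a minimal sketch of how basenames could be resolved below an audio_path. It is not Nkululeko code: the function names build_audio_index and resolve are hypothetical, and the sketch assumes the audio files are .wav and that basenames are unique (the first match wins otherwise).

```python
from pathlib import Path


def build_audio_index(audio_path):
    """Map each .wav basename found below audio_path to its full path."""
    index = {}
    # rglob walks audio_path itself and all of its subdirectories.
    for path in sorted(Path(audio_path).rglob("*.wav")):
        # Assumption: basenames are unique; if not, keep the first hit.
        index.setdefault(path.name, path)
    return index


def resolve(basename, index):
    """Return the full path for a basename, or raise if it is missing."""
    try:
        return index[basename]
    except KeyError:
        raise FileNotFoundError(f"{basename} not found below audio_path") from None
```

Building the index once keeps the lookup cheap even when the CSV lists thousands of files.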

Actually, I want to evaluate my experiment here without much effort with Nkululeko :)

felixbur commented 10 months ago

but you can have relative and absolute paths in nkululeko? relative meaning, it starts from the database root location. Isn't that sufficient?

bagustris commented 10 months ago

Relative and absolute paths define the database path (e.g., the CSV file), not the content of the file column inside the CSV file, right?

Suppose I have train.csv containing the following,

file, emotion
train_001.wav, fear
train_002.wav, sad
train_003.wav, happy
...
train_100.wav, neutral

With the current configuration, is it possible to run nkululeko.nkululeko without any pre-processing?

Also, there is currently a target option for the labels, e.g., target=emotion. How could Nkululeko recognize the audio path header in the CSV file? I am just assuming all datasets use file as the header. If this is the case, we also need a way to specify the audio header, similar to target.
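As an illustration of what a configurable audio header could look like, here is a small sketch using only the standard library. The function read_rows and its parameters file_col and audio_path are hypothetical names for this example, not existing Nkululeko options; only target appears in the current configuration.

```python
import csv
import io
from pathlib import Path


def read_rows(csv_text, audio_path, file_col="file", target="emotion"):
    """Yield (full_audio_path, label) pairs from CSV text."""
    # skipinitialspace drops the blanks after each comma, as in "file, emotion".
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    for row in reader:
        # Prefix the basename from the CSV with the audio directory.
        yield Path(audio_path) / row[file_col], row[target]
```

With the defaults, a CSV like the one above would work unchanged, while a dataset that uses another column name would only need a different file_col.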

bagustris commented 10 months ago

I checked: this can be accomplished by setting the root directory under the [EXP] section. This needs to be made clearer in the INI file and the documentation.

felixbur commented 10 months ago

I checked: this can be accomplished by setting the root directory under the [EXP] section. This needs to be made clearer in the INI file and the documentation.

that's actually the root for the experiment results, not necessarily the databases!

felixbur commented 10 months ago

And, yeah, documentation is still mainly my blog, that's a weak point. So, you have to set the root directory (of the data) and you can specify the path to the audios from there. But if that would be useful, adding another key for an (optional) audio_path, meaning the path between the database root and the audio files root, would of course be no problem.
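A rough sketch of how such an optional key could be read, using Python's configparser. This is only an illustration and not the actual Nkululeko implementation: the helper audio_root is hypothetical, and it assumes that the database root is the directory containing the CSV file.

```python
import configparser
from pathlib import Path

# Example values only; the dev database has no audio_path on purpose.
INI_TEXT = """
[DATA]
train = ./data/ravdess/ravdess_train.csv
train.audio_path = ravdess_speech
dev = ./data/ravdess/ravdess_dev.csv
"""


def audio_root(config, db_name):
    """Return the directory in which the audio files of db_name are expected."""
    data = config["DATA"]
    # Assumption: the database root is the folder that holds the CSV file.
    db_root = Path(data[db_name]).parent
    # Optional key: without it, fall back to the database root itself.
    extra = data.get(f"{db_name}.audio_path", fallback="")
    return db_root / extra
```

Because of the fallback, INI files without the new key would keep working as before.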

bagustris commented 10 months ago

I was wrong when I stated that defining the root dir would solve this issue. I tried it on the TESS dataset (still in my branch), and it doesn't work: the full path is still needed in the file column of the CSV file. So this feature request is still valid.

This will also help if the dataset directory is not inside Nkululeko's data directory (of course, one can create symlinks as a simple workaround).

felixbur commented 10 months ago

ok, I'll add this later today. The data directory does NOT have to be inside the nkululeko root! Actually, a Nkululeko root is NOT necessary at all!

felixbur commented 10 months ago

done in 0.63.3 https://github.com/felixbur/nkululeko/blob/main/ini_file.md#data