valericac closed 1 month ago
We currently do not apply any preprocessing to the audio signal itself, so you should be all set to compare or combine our features with those extracted using other libraries.
As for using the framewise output directly, you can certainly do so. Each frame is computed over a 10 ms window, so there are 100 frames per second, regardless of the audio FPS. Note that only a subset of the vocal acoustics measures we compute are calculated at the frame level.
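If it helps when merging, here is a rough sketch of how features from another library could be lined up with the 10 ms frames. It assumes the frames are contiguous and non-overlapping and that frame i starts at i × 10 ms, which this thread does not confirm. The function names (`frames_in`, `align_to_frames`) are made up for illustration and are not part of our library.

```python
from bisect import bisect_left

FRAME_MS = 10  # one frame per 10 ms window, i.e. 100 frames per second


def frames_in(duration_ms, frame_ms=FRAME_MS):
    """Number of whole frames that fit in a clip of the given length."""
    return duration_ms // frame_ms


def align_to_frames(ext_times_ms, ext_values, n_frames, frame_ms=FRAME_MS):
    """Nearest-neighbour alignment of an external feature track to the frame grid.

    ext_times_ms must be sorted ascending. ASSUMPTION: frame i starts at
    i * frame_ms; adjust if the real frames are centred or overlapping.
    """
    aligned = []
    last = len(ext_times_ms) - 1
    for i in range(n_frames):
        t = i * frame_ms
        j = bisect_left(ext_times_ms, t)  # first external sample at or after t
        if j == 0:
            k = 0
        elif j > last:
            k = last
        else:
            # pick whichever neighbour is closer in time (ties go to the earlier one)
            k = j if ext_times_ms[j] - t < t - ext_times_ms[j - 1] else j - 1
        aligned.append(ext_values[k])
    return aligned


# External features sampled every 25 ms, mapped onto five 10 ms frames
ext_times = [0, 25, 50]
ext_vals = [1.0, 2.0, 3.0]
merged = align_to_frames(ext_times, ext_vals, n_frames=5)
```

Nearest-neighbour is only one option; averaging or interpolating within each frame may suit some features better.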
I am looking to extract several vocal features beyond the ones currently provided, based on the current literature on PTSD. To ensure consistency and comparability in feature extraction, I would like to better understand the pre-processing steps you apply. Specifically, I am interested in how I can apply the same processing when extracting features outside the current vocal acoustics set. While I have already cleaned the audio, I want to make sure there are no additional considerations I should be aware of when merging the features. Otherwise, I think I can use the framewise output directly and apply my analysis to it.