sensein / senselab

senselab is a Python package that simplifies building pipelines for biometric (e.g. speech, voice, video, etc) analysis.
http://sensein.group/senselab/
Apache License 2.0

Features extraction #175

Closed fabiocat93 closed 2 weeks ago

fabiocat93 commented 1 month ago

This PR is an attempt to structure audio feature extraction. It would contribute low-level acoustic and audio quality descriptors to the human phenotype vector.

Follow-up steps include:

codecov-commenter commented 1 month ago

Codecov Report

Attention: Patch coverage is 50.85714% with 344 lines in your changes missing coverage. Please review.

Project coverage is 60.24%. Comparing base (9b7209f) to head (1bfee15). Report is 73 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...dio/tasks/features_extraction/praat_parselmouth.py | 53.51% | 225 Missing :warning: |
| ...rc/senselab/audio/tasks/features_extraction/api.py | 0.00% | 40 Missing :warning: |
| ...elab/audio/tasks/features_extraction/torchaudio.py | 36.53% | 33 Missing :warning: |
| ...udio/tasks/features_extraction/torchaudio_squim.py | 7.14% | 26 Missing :warning: |
| src/tests/audio/tasks/features_extraction_test.py | 85.54% | 12 Missing :warning: |
| ...health_measurements/extract_health_measurements.py | 0.00% | 5 Missing :warning: |
| ...selab/audio/tasks/features_extraction/opensmile.py | 57.14% | 3 Missing :warning: |
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #175      +/-   ##
==========================================
- Coverage   68.32%   60.24%   -8.08%
==========================================
  Files          95      113      +18
  Lines        3283     4017     +734
==========================================
+ Hits         2243     2420     +177
- Misses       1040     1597     +557
```


ibevers commented 1 month ago

@fabiocat93 is this going to incorporate Nick's features? Am I remembering correctly that he gave us the okay to incorporate them?

fabiocat93 commented 1 month ago

> @fabiocat93 is this going to incorporate Nick's features? Am I remembering correctly that he gave us the okay to incorporate them?

They're already there.

ibevers commented 1 month ago

Cool😎

fabiocat93 commented 2 weeks ago

@satra Quick update: I ...

fabiocat93 commented 2 weeks ago

@nickcummins41 I’ve added your name throughout the docstrings and general documentation to acknowledge your contributions to the audio feature extraction work. If you have any suggestions for more effective ways to give credit or improvements to the documentation, please let me know!

satra commented 2 weeks ago

thanks @fabiocat93 this is great for now. let's get this out and get some feedback on usage. i'm fighting a few fires so won't have time to do an in depth review. perhaps we can do that together sometime in a month or so across the entire codebase.

fabiocat93 commented 2 weeks ago

> based on the last commit, i'm curious how running in series is faster than running in parallel at scale. that doesn't mean pydra has to be used, but offhand the change doesn't make sense without some explanation of the bottlenecks (memory, gpu, overhead, etc.).

pydra is still used under the hood, and I can confirm that this newer implementation runs faster than the previous one.

> also, it looks like this focuses on returning a single value per feature, independent of the length of the audio. i.e., some features will not make sense beyond a certain duration, and some below a certain duration. we could say that this should be left up to the user, in which case an example of splitting the audio into chunks would be good.

thank you for pointing this out. I agree with that. The way senselab is designed right now, the "tasks" level offers functionality with all kinds of customization, and these choices are left to the user. the "workflows" level is one level higher in terms of abstraction and includes (or will include) all the best practices, workarounds, and heuristics that you are referring to.
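As a rough illustration of the chunking idea raised above, here is a minimal sketch that splits a mono signal into fixed-length chunks and computes one value per chunk. It uses only the Python standard library and does not use senselab's API: `split_into_chunks`, `rms`, and the `drop_last_short` parameter are hypothetical names chosen for this example, and RMS energy stands in for whatever feature is actually being extracted.

```python
import math
from typing import List, Sequence


def split_into_chunks(
    samples: Sequence[float],
    sampling_rate: int,
    chunk_seconds: float,
    drop_last_short: bool = True,
) -> List[List[float]]:
    """Split a mono signal into consecutive, non-overlapping chunks (hypothetical helper)."""
    chunk_len = int(round(chunk_seconds * sampling_rate))
    if chunk_len <= 0:
        raise ValueError("chunk_seconds is too short for this sampling rate")
    chunks = [list(samples[i:i + chunk_len]) for i in range(0, len(samples), chunk_len)]
    # Optionally discard a trailing chunk that is shorter than the requested duration,
    # since some features are not meaningful on very short excerpts.
    if drop_last_short and chunks and len(chunks[-1]) < chunk_len:
        chunks.pop()
    return chunks


def rms(chunk: Sequence[float]) -> float:
    """Root-mean-square level of one chunk (stand-in for a real feature extractor)."""
    return math.sqrt(sum(x * x for x in chunk) / len(chunk))


sampling_rate = 16000
# 2.5 s synthetic 220 Hz tone standing in for a real recording
signal = [
    0.5 * math.sin(2 * math.pi * 220 * n / sampling_rate)
    for n in range(int(2.5 * sampling_rate))
]

# Two full 1-second chunks; the trailing 0.5 s is dropped by default
chunks = split_into_chunks(signal, sampling_rate, chunk_seconds=1.0)
per_chunk_rms = [rms(c) for c in chunks]
```

With this shape, each chunk could be passed to a feature extractor separately, which yields one value per chunk instead of one value per recording. Whether to drop or keep the short trailing chunk is the kind of decision that would be left to the user at the "tasks" level.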

> (relatedly, for speech-like audio, do we have some utility or example for splitting audio based on other targets (e.g. sentences, vad, etc.), rather than time?)

we do, in the relevant sections (vad, time-stamped transcripts, ...)
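To make the idea of splitting on something other than fixed time concrete, here is a minimal sketch that segments a signal wherever its frame-level energy exceeds a threshold. This is a simple energy gate, not a real voice activity detector and not senselab's VAD utility: `energy_segments`, `frame_seconds`, and `threshold` are hypothetical names and values chosen for this example.

```python
import math
from typing import List, Sequence, Tuple


def energy_segments(
    samples: Sequence[float],
    sampling_rate: int,
    frame_seconds: float = 0.02,
    threshold: float = 0.05,
) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds of runs of frames whose RMS >= threshold."""
    frame_len = int(round(frame_seconds * sampling_rate))
    if frame_len <= 0:
        raise ValueError("frame_seconds is too short for this sampling rate")
    n_frames = len(samples) // frame_len  # a trailing partial frame is ignored
    segments: List[Tuple[float, float]] = []
    start = None  # index of the first frame of the current active run, if any
    for f in range(n_frames):
        frame = samples[f * frame_len:(f + 1) * frame_len]
        level = math.sqrt(sum(x * x for x in frame) / frame_len)
        active = level >= threshold
        if active and start is None:
            start = f
        elif not active and start is not None:
            segments.append((start * frame_len / sampling_rate, f * frame_len / sampling_rate))
            start = None
    if start is not None:  # the signal ended while still active
        segments.append((start * frame_len / sampling_rate, n_frames * frame_len / sampling_rate))
    return segments


sr = 8000
silence = [0.0] * (sr // 2)  # 0.5 s of silence
tone = [0.5 * math.sin(2 * math.pi * 200 * n / sr) for n in range(sr)]  # 1 s tone
signal = silence + tone + silence

segments = energy_segments(signal, sr)
# Cut the signal at the detected boundaries; each clip could then be processed on its own
clips = [signal[int(s * sr):int(e * sr)] for s, e in segments]
```

In practice the (start, end) pairs would come from a proper VAD model or from time-stamped transcripts rather than from an energy threshold; the slicing step at the end stays the same.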

satra commented 2 weeks ago

@fabiocat93 - great job here. i finally had some time to go through this. i realize there are some inconsistencies in places. instead of trying to be perfect, let's fix the easy ones, file the others as issues/discussions, and focus on optimizations in a different PR.

i'm hoping to update b2aiprep this weekend. let me know if you think a version could be released.