hbredin opened this issue 2 years ago
This is a cool proposal! On Twitter I mentioned we can model this task with `audio-to-audio`, and it would already work by outputting multiple audios. But having a nice custom widget more specific to the task would be very cool!

cc @mishig25 @julien-c WDYT?
This is very cool!

Definitely a good target for `audio-to-audio` as a starter (no widget needed). `audio-segmentation` seems like a good fit for what you're trying to do (it does not exist yet, but should cover multiple use cases). `audio-token-classification`? 😱 `audio-to-structured`?

Not sure of the best new task type to keep some generality, but yeah, it could be cool to have.
> `audio-token-classification`? 😱

You're actually pretty spot on IMO, since `token-classification` is actually `text-segmentation`, I think. It's also aligned with `image-segmentation`.

The output should basically be a list of "objects" found in the text/audio/image, plus some descriptor of "where" those objects are in the original input. Audio and text are 1D with basically never non-contiguous objects, so `start` + `stop` are enough, IMO. In images, because the input is 2D, a full mask is basically required even for contiguous objects (a bounding box is also a simplification). See the sketch below.
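A minimal sketch of what such a unified segmentation output could look like; the field names (`label`, `score`, `start`, `stop`, `mask`) are illustrative assumptions, not an existing Hub schema.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Segment:
    """One detected "object" plus a descriptor of where it sits in the input."""
    label: str                      # e.g. "SPEAKER_00", "PER", "cat"
    score: float                    # model confidence
    # 1D inputs (audio, text): a contiguous span is enough.
    start: Optional[float] = None   # seconds for audio, character offset for text
    stop: Optional[float] = None
    # 2D inputs (image): a full per-pixel mask (a bounding box is a simplification).
    mask: Optional[List[List[bool]]] = None

# Speaker segmentation over a 1D audio input:
audio_output = [
    Segment(label="SPEAKER_00", score=0.98, start=0.2, stop=1.5),
    Segment(label="SPEAKER_01", score=0.95, start=1.7, stop=3.1),
]
```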
Btw, `audio-segmentation` (`speech-segmentation`) existed and we deprecated it in favor of `audio-to-audio`, no @Narsil?
`speech-segmentation` was never deprecated, but it also never had widget support afaik. Its output is not audio, so I don't see how `audio-to-audio` could be used:
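(To illustrate the mismatch, a hypothetical comparison of the two payload shapes; these dicts are assumptions for illustration, not the Hub's documented widget API.)

```python
# audio-to-audio: the widget expects audio content back,
# e.g. one waveform/file per separated source.
audio_to_audio_output = [
    {"label": "source_0", "audio": b"...wav bytes..."},
    {"label": "source_1", "audio": b"...wav bytes..."},
]

# speech-segmentation: the output is labels over time, no audio at all.
speech_segmentation_output = [
    {"class": "SPEAKER_00", "start": 0.2, "end": 1.5},
    {"class": "SPEAKER_01", "start": 1.7, "end": 3.1},
]
```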
Nice! Would you recommend we update this PR to `speech-segmentation` then?
I think we can keep the PR as is and merge it when ready, so things are functional (even though less than perfect). And when support for `audio-segmentation` is ready (or even before), we can simply create a new PR.
Opening an issue as per @osanseviero's suggestion on Twitter. Issue imported from https://github.com/pyannote/pyannote-audio/issues/835
pyannote.audio 2.0 will bring a unified pipeline API:
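(A minimal sketch of that API; the model identifier and audio path below are illustrative.)

```python
from pyannote.audio import Pipeline

# load a pretrained pipeline from the Hugging Face Hub
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization")

# apply it to an audio file
output = pipeline("audio.wav")

# the result is a pyannote.core.Annotation: labeled time segments
for segment, _, label in output.itertracks(yield_label=True):
    print(f"{label}: {segment.start:.1f}s -> {segment.end:.1f}s")
```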
where `output` is a `pyannote.core.Annotation` instance.

I just created a space that allows testing a bunch of pipelines shared on the Hugging Face Hub, but it would be nice if those were testable directly in their own model card.
My understanding is that two things need to happen