LibraryOfCongress / citizen-dj

The Unlicense

Search for a particular spoken word or phrase #23

Open camronlee opened 4 years ago

camronlee commented 4 years ago

Hi Brian,

I'm loving the Citizen DJ project! Thanks for the great work. I'd like to suggest an enhancement to consider for a future iteration.

User story: As a hip hop producer, I want to search for a particular word or phrase across the collection of audio files, so that I can find samples containing that word or phrase to use in a beat.

Example: something like this: https://getyarn.io/

Not sure whether this is feasible, but if there were a way to perhaps use an open source speech recognition engine with support for keyword spotting, it would be nice to have this ability as a producer.

rhofour commented 2 years ago

I'm fairly certain this is feasible, but it would take some effort.

There's already a fair bit of pre-processing applied to the audio files, so I think a step to extract words could fit in somewhere.
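As a rough sketch of what searching the output of such a step might look like, assuming a speech recognizer has already produced word-level timestamps for a clip (the recognizer itself is not shown, and the `Word` layout, `find_phrase` name, and sample transcript are hypothetical, not part of Citizen DJ):

```python
import string
from typing import List, NamedTuple, Tuple


class Word(NamedTuple):
    text: str     # token as emitted by the (hypothetical) recognizer
    start: float  # start time in seconds
    end: float    # end time in seconds


def normalize(token: str) -> str:
    """Lowercase and strip surrounding punctuation so "music," matches "music"."""
    return token.lower().strip(string.punctuation)


def find_phrase(words: List[Word], phrase: str) -> List[Tuple[float, float]]:
    """Return (start, end) times for every occurrence of phrase in a transcript."""
    target = [t for t in (normalize(p) for p in phrase.split()) if t]
    if not target:
        return []
    tokens = [normalize(w.text) for w in words]
    hits = []
    # Slide a window the length of the phrase over the transcript tokens.
    for i in range(len(tokens) - len(target) + 1):
        if tokens[i:i + len(target)] == target:
            hits.append((words[i].start, words[i + len(target) - 1].end))
    return hits


# Made-up transcript for illustration only; not taken from any collection item.
transcript = [
    Word("I", 0.0, 0.2), Word("love", 0.2, 0.6), Word("the", 0.6, 0.7),
    Word("music,", 0.7, 1.3), Word("I", 1.5, 1.6), Word("love", 1.6, 2.0),
    Word("it.", 2.0, 2.3),
]
print(find_phrase(transcript, "I love"))  # [(0.0, 0.6), (1.5, 2.0)]
```

The returned time ranges could then be used to cut a sample out of the source clip. Exact matching like this would only be as good as the transcript, so noisy recordings would likely need fuzzier matching.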

@beefoo Is there any hope of actually adding to the site at this point, or is it pretty much frozen now that you're done with your residency?

beefoo commented 2 years ago

@camronlee thank you for this suggestion! Yes, this makes a lot of sense for the oral history / spoken word collections, especially if you're looking for specific words or phrases to remix.

As @rhofour noted, this is a fairly common problem, but it does require a fair amount of effort to do well, particularly because the spoken word audio (esp. the older collections) has mixed audio quality and different dialects, though I'd say the Joe Smith collection has pretty consistently clear audio.

@rhofour this site is more or less "frozen" at this point, since it was funded for the duration of my residency, which ended in 2020. That said, it is useful to continue to get feedback and collect use cases, since they would help make the case for future support for the project!

camronlee commented 2 years ago

Sure thing, I can appreciate the work it would take to support a keyword search. Thanks for following up regardless! I'm very happy with this project in its current state and have used a lot of samples from Citizen DJ in my songs :)
