ethberlinzwei / Find-A-Team

Team formation repo - just file issues with ideas!
MIT License

Speech Recognition Infrastructure Services on Ethereum (DAO-backed Project) #28

Open chrishobcroft opened 4 years ago

chrishobcroft commented 4 years ago

Introduction

Computers can turn speech into text. It's sometimes called "Speech Recognition".

It takes a lot of processing power and memory to run some funky algorithms that transcode an audio file into binary, then Unicode.

This means that only big centralised monopolies and governments can play with it at scale. This in turn makes it a tool reserved for the wealthy, and not for the poor.

Drastically lowering the cost of running speech recognition software will empower independent software developers to use speech recognition in their applications. This will benefit real people with new tools which they have never seen before.

Idea description

This project is the seed of an idea to build a decentralised and distributed network of "Recognisers", which consume audio content, and create text, and are paid by "Broadcasters" for performing the work.

In some ways it can be likened to Livepeer's "Transcoders" which consume hi-res video content, and create lo-res video content. This makes the video more accessible to people who are watching on slow internet connections or 3-year-old smartphones, who still want to watch it live. It can also potentially be used to create automated subtitles on livestream video content.

"Transcoders" are paid by "Broadcasters" for performing the work to resize the content. This provides the "Transcoders" with an income.

Specific Deliverable of the project: Draft White Paper v 0.1

Specific Deliverable of the project: Minimum Viable Implementation... and I mean Minimum

Research Topics

Skillset

Communication

Please leave a question in a comment below.

chrishobcroft commented 4 years ago

OK, some initial thoughts on the research topics

Does any software exist which already does this? What license is it released under?

How are they using mechanism design / tokens / staking to secure the network? How does the protocol reward early and consistent participation? How can this model be improved upon, what has been learned from experience so far?

What value is there in distributing a stake in the network to lots of people? How do you mechanically distribute tokens in ways that don't breach any moral or legal principles? How can tokens be used to incentivise the individuals who have the highest likelihood of collaborating to make the project eventually successful?

How to minimise the barrier to entry, so that anyone with an interest in contributing can get some "skin in the game", without giving it away for free? How to include people from all continents in the development process, not just N. America / Europe?

What is the one key metric that we can find, and trust, and use to inspire us that this idea is worth pursuing? Are there any reasons why we should NOT pursue this project?

wslyvh commented 4 years ago

Also interested in this one.

Could add a learning App/component to it for people who want to learn and speak new languages. Or by using mechanical Turks.

gabrielfior commented 4 years ago

Hi, here is a reference to a project related to decentralized Speech Recognition:

https://medium.com/@IkishanShah/anryze-the-decentralized-speech-recognition-platform-4680b289c544

Not many technical details are provided, though.

Here is also another platform to build open source voice assistants: https://snips.ai

chrishobcroft commented 4 years ago

@wslyvh you're completely right.

@gabrielfior I also saw Anryze. Seems like a brand, perhaps worth collaborating.

chrishobcroft commented 4 years ago

Please visit the project's DAO, which will be used to govern this project.

To participate, you can:

Request an ESR voting token using the Token application.

Deposit and Withdraw ETH / Tokens using the Finance application.

Ask the DAO questions, using the Voting application.

MarcusJones commented 4 years ago

Hi, sounds interesting! I'm a data scientist with blockchain dev experience, and I'm working on general compute services orchestrated by Ethereum smart contracts. How can we connect this weekend?

chrishobcroft commented 4 years ago

Hi @MarcusJones. I'm at Factory now. I'm very tall and wearing a bright blue shirt and red shoes. Let's find each other.

mtagda commented 4 years ago

I will be there in around an hour, so I will ask you again then where you are :)

MarcusJones commented 4 years ago

Cool, I just got in, wearing an Ocean Protocol t-shirt. Do you guys have a rough location?

On Fri, Aug 23, 2019, 17:38 Magdalena Trzeciak notifications@github.com wrote:

I will be there in around an hour so will ask you again then where are you :)


mtagda commented 4 years ago

I'm at Chainlink, wearing a black & white skirt. Maybe it will be easier if you look around and find me ;)

MarcusJones commented 4 years ago

Sorry must have missed you! I'm at the entrance courtyard benches with a laptop :)

chrishobcroft commented 4 years ago

Hey @MarcusJones and @mtagda - are you around?

Let me know if you're interested in shaping up a mini plan

chrishobcroft commented 4 years ago

OK, here is some background reading, to understand some things related to the research topics:

For mechanism design / incentive mechanisms, start here:

https://github.com/livepeer/wiki/blob/master/WHITEPAPER.md

For open-source speech recognition software:

https://fosspost.org/lists/open-source-speech-recognition-speech-to-text

For ideas on the distribution:

https://forum.livepeer.org/t/introducing-the-merklemine/204

https://medium.com/commonwealth-labs/whats-in-a-lockdrop-194218a180ca

https://github.com/ethberlinzwei/Find-A-Team/issues/19

chrishobcroft commented 4 years ago

I will happily give a deeper perspective on Livepeer's incentive and distribution mechanisms if you would like.

Perhaps let's try to get something working in terms of a "dirty" prototype also. It should probably include a) something that turns audio into text, and b) some minimum viable smart contract interaction, e.g. "Recogniser" requests payment from "Broadcaster"
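To make the "dirty" prototype idea concrete, here is a rough sketch of the flow in plain Python. It is only an in-memory simulation under assumptions, not Ethereum code: the `Escrow` class stands in for a smart contract, `stub_recognise` is a placeholder for whatever speech-to-text engine gets chosen, and every name in it (`Escrow`, `Job`, `post_job`, `submit_transcript`, `request_payment`) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


def stub_recognise(audio: bytes) -> str:
    # Hypothetical placeholder: a real prototype would call a
    # speech-to-text engine here instead of describing the input.
    return f"transcript of {len(audio)} audio bytes"


@dataclass
class Job:
    broadcaster: str
    audio: bytes
    price: int                        # amount locked by the Broadcaster
    recogniser: Optional[str] = None  # set once a transcript is submitted
    transcript: Optional[str] = None
    paid: bool = False


@dataclass
class Escrow:
    """In-memory stand-in for a smart contract (assumption, not on-chain)."""
    balances: Dict[str, int] = field(default_factory=dict)
    jobs: Dict[int, Job] = field(default_factory=dict)

    def deposit(self, account: str, amount: int) -> None:
        self.balances[account] = self.balances.get(account, 0) + amount

    def post_job(self, broadcaster: str, audio: bytes, price: int) -> int:
        # The Broadcaster locks the payment up front.
        if self.balances.get(broadcaster, 0) < price:
            raise ValueError("insufficient balance")
        self.balances[broadcaster] -= price
        job_id = len(self.jobs)
        self.jobs[job_id] = Job(broadcaster, audio, price)
        return job_id

    def submit_transcript(self, job_id: int, recogniser: str,
                          recognise: Callable[[bytes], str]) -> str:
        # a) something that turns audio into text
        job = self.jobs[job_id]
        if job.transcript is not None:
            raise ValueError("job already transcribed")
        job.recogniser = recogniser
        job.transcript = recognise(job.audio)
        return job.transcript

    def request_payment(self, job_id: int, recogniser: str) -> int:
        # b) "Recogniser" requests payment from "Broadcaster"
        job = self.jobs[job_id]
        if job.transcript is None or job.recogniser != recogniser:
            raise PermissionError("no transcript submitted by this Recogniser")
        if job.paid:
            raise ValueError("job already paid")
        job.paid = True
        self.deposit(recogniser, job.price)
        return job.price


if __name__ == "__main__":
    escrow = Escrow()
    escrow.deposit("broadcaster", 100)
    job_id = escrow.post_job("broadcaster", b"\x00" * 16, price=30)
    print(escrow.submit_transcript(job_id, "recogniser", stub_recognise))
    escrow.request_payment(job_id, "recogniser")
    print(escrow.balances)
```

This sketch does not check transcript quality at all, so the open questions above about staking and securing the network still apply before anything like it could go on-chain.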