Overview
To enhance the feature set of our ai-network, we aim to implement a text-to-audio pipeline using the Stable Audio model by Stability AI. This new pipeline will open up exciting possibilities for media generation within our network, enabling users to convert text inputs into high-quality audio outputs. By adding this functionality, we are expanding the capabilities of the Livepeer AI network, allowing for a more comprehensive suite of tools that support creative and practical applications in media creation.
We are calling on the community to help implement this crucial pipeline on the AI-worker side of the ai-subnet project. Integrating this pipeline will complement our existing pipelines, creating a more robust infrastructure for complete media generation. This will empower developers, content creators, and other stakeholders to seamlessly generate audio from textual data, thereby enriching the creative potential and utility of the ai-subnet.
Bounty Requirements
Implementation: Develop a working `/text-to-audio` route and pipeline in the AI-worker repository. This pipeline should be accessible on Docker port `8008`.
Functionality: The pipeline must accept a text prompt as input and return an audio file as output. Generation parameters such as `Duration`, `Diffusion Steps`, and `CFG Scale` should be tunable.
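For orientation, here is a minimal sketch of what such a route and pipeline could look like, assuming the Hugging Face diffusers `StableAudioPipeline` and the `stabilityai/stable-audio-open-1.0` checkpoint. The class names, request schema, and response format below are illustrative assumptions; the real implementation should mirror the conventions already used by the existing pipelines in the AI-worker repository.

```python
# Minimal sketch of a /text-to-audio route, assuming the Hugging Face
# diffusers StableAudioPipeline. Names and request fields are illustrative,
# not the final AI-worker API.
import io

import soundfile as sf
import torch
from diffusers import StableAudioPipeline
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

app = FastAPI()

device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = StableAudioPipeline.from_pretrained(
    "stabilityai/stable-audio-open-1.0",
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)


class TextToAudioParams(BaseModel):
    prompt: str
    duration: float = 10.0          # seconds of audio to generate (Duration)
    num_inference_steps: int = 100  # diffusion steps (Diffusion Steps)
    guidance_scale: float = 7.0     # classifier-free guidance (CFG Scale)


@app.post("/text-to-audio")
def text_to_audio(params: TextToAudioParams):
    result = pipe(
        prompt=params.prompt,
        audio_end_in_s=params.duration,
        num_inference_steps=params.num_inference_steps,
        guidance_scale=params.guidance_scale,
    )
    # result.audios has shape (batch, channels, samples); soundfile expects
    # (samples, channels), hence the transpose.
    audio = result.audios[0].T.float().cpu().numpy()
    buffer = io.BytesIO()
    sf.write(buffer, audio, pipe.vae.sampling_rate, format="WAV")
    buffer.seek(0)
    return StreamingResponse(buffer, media_type="audio/wav")
```

The mapping of the container to Docker port 8008 would be handled by the runner's container configuration, in the same way as the existing pipelines.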
Scope Exclusions
This bounty does NOT cover the complete end-to-end implementation of this pipeline on the go-livepeer side, including payment logic and job routing. These aspects will be addressed by the AI SPE team or in a future bounty.
Implementation Tips
To understand how to create a new pipeline, you can refer to recent pull requests where new pipelines were added: Pull Request #96 and Pull Request #103.
Additionally, make sure to:
Utilize Earlier Work: There are existing implementations of Stable Audio in Hugging Face Spaces; reviewing them can provide valuable insights and a foundation for your work.
Utilize Developer Documentation: Check out our developer documentation for the worker and runner. These resources provide valuable tips for speeding up your development process by mocking pipelines and enabling direct debugging.
Generate OpenAPI Spec: Run the `runner/gen_openapi.py` file to generate the updated OpenAPI spec.
Generate Go-Livepeer Bindings: In the main repository folder, run the `make` command to generate the necessary bindings, ensuring your implementation works seamlessly with the go-livepeer repository.
How to Apply
Express Your Interest: Comment on this issue to indicate your interest and explain why you're the ideal candidate for the task.
Wait for Review: Our team will review expressions of interest and select the best candidate.
Get Assigned: If selected, we'll assign the GitHub issue to you.
Start Working: Dive into your task! If you need assistance or guidance, comment on the issue or join the discussions in the #developer-lounge channel on our Discord server.
Submit Your Work: Create a pull request in the relevant repository and request a review.
Notify Us: Comment on this GitHub issue when your pull request is ready for review.
Receive Your Bounty: We'll arrange the bounty payment once your pull request is approved.
Gain Recognition: Your valuable contributions will be showcased in our project's changelog.
Thank you for your interest in contributing to our project!
[!WARNING]
Please wait for the issue to be assigned to you before starting work. To prevent duplication of effort, submissions for unassigned issues will not be accepted.