Closed h0lybyte closed 10 months ago
If we go the widget route, we could do something similar to these,
where we could add them anywhere by loading the JavaScript file and using div tags.
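A minimal sketch of the script-plus-div approach, assuming a hypothetical `data-kbve-widget` attribute on the placeholder divs and a placeholder embed URL (neither comes from this thread). The script scans for placeholder divs and fills each one with an iframe:

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface PlaceholderLike {
  getAttribute(name: string): string | null;
  innerHTML: string;
}
interface DocumentLike {
  querySelectorAll(selector: string): ArrayLike<PlaceholderLike>;
}

// Hypothetical attribute name and widget registry (assumptions for illustration).
const WIDGET_ATTR = "data-kbve-widget";
const WIDGET_URLS: Record<string, string> = {
  "seamless-m4t": "https://example.com/seamless-m4t-demo",
};

// Escape a value before placing it inside an HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build the iframe markup for one widget.
function buildIframe(src: string, height = 480): string {
  return `<iframe src="${escapeAttr(src)}" width="100%" height="${height}" frameborder="0"></iframe>`;
}

// Fill every <div data-kbve-widget="..."> with its iframe; returns how many were mounted.
function mountWidgets(doc: DocumentLike): number {
  const nodes = Array.from(doc.querySelectorAll(`div[${WIDGET_ATTR}]`));
  let mounted = 0;
  for (const node of nodes) {
    const name = node.getAttribute(WIDGET_ATTR);
    const src = name ? WIDGET_URLS[name] : undefined;
    if (!src) continue; // unknown widget name: leave the div untouched
    node.innerHTML = buildIframe(src);
    mounted += 1;
  }
  return mounted;
}
```

On a page, this would pair with something like `<div data-kbve-widget="seamless-m4t"></div>` and a call to `mountWidgets(document)` once the script has loaded.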
Interesting video concept -> https://www.youtube.com/watch?v=iTZ2N-HJbwA
Building that in 24 hours is a bit rough; however, we could fold it into the "future applications" concept.
A buddy of mine made this for post-video processing and it went viral but Seamless M4T didn't come out yet. Here's a link to his API in action:
https://twitter.com/therealprady/status/1680645510103977987?s=20
Resources for Seamless M4T:
Seamless M4T : https://github.com/facebookresearch/seamless_communication
Docker File: https://github.com/djyde/seamlessM4T-docker
Google Colab: https://github.com/camenduru/seamless-m4t-colab
ONNX Fast SeamlessM4T: https://github.com/fabio-sim/Fast-SeamlessM4T-ONNX
ONNX (Open Neural Network Exchange) : https://github.com/onnx/onnx
ONNX2RKNN Dockerfile : https://github.com/how2flow/docker-onnx2rknn
Hugging Face/Colab: https://github.com/iakashpaul/seamlessly
Streamlit Demo: https://github.com/carolinedlu/seamlessm4t-streamlit
ATLAS/HQ Plan Frontend UI Starter:
High Level Design for SeamlessM4T Integration into ATLAS/HQPlan
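TEXT TRANSLATION FUNCTION:
async function textTranslation(data) {
  // Implement translation logic using SeamlessM4T
  const translatedText = ''; // placeholder until the SeamlessM4T call is wired in
  return translatedText;
}
AUDIO TRANSLATION FUNCTION:
async function audioTranslation(data) {
  // Implement translation logic using SeamlessM4T
  const translatedAudio = null; // placeholder until the SeamlessM4T call is wired in
  return translatedAudio;
}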
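Since SeamlessM4T itself is a Python library, one way to fill in these functions is to delegate to a separate service that wraps the model. The sketch below assumes a hypothetical `TranslationBackend` interface and request shape (both are illustrative names, not part of any real SDK) and shows the text case; the audio case could follow the same pattern.

```typescript
// Hypothetical request shape; the field names are assumptions for illustration.
interface TextRequest {
  text: string;
  sourceLang: string;
  targetLang: string;
}

// Anything that can actually run the model, e.g. an HTTP client for a
// service wrapping SeamlessM4T. This interface is an assumption.
interface TranslationBackend {
  translateText(req: TextRequest): Promise<string>;
}

// Validate the input, skip work that needs no model call, then delegate.
async function textTranslation(data: TextRequest, backend: TranslationBackend): Promise<string> {
  if (data.text.trim() === "") return ""; // nothing to translate
  if (data.sourceLang === data.targetLang) return data.text; // same language: no-op
  return backend.translateText(data);
}

// Stand-in backend for local testing; it only tags the text, it does not translate.
const fakeBackend: TranslationBackend = {
  async translateText(req) {
    return `[${req.sourceLang}->${req.targetLang}] ${req.text}`;
  },
};
```

Keeping the backend behind an interface means the UI code can be tested with `fakeBackend` before any model is deployed. Note that SeamlessM4T's own examples use three-letter language codes such as `eng` and `spa`, so a real backend may need to map shorter codes.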
Implement n8n Workflow
Workflow Nodes
* Webhook Node: To trigger the workflow.
* Function Nodes: For parts of SeamlessM4T logic.
* Set Node: To set data.
* HTTP Request Node: To call Appwrite functions.
e.g.
{
  "nodes": [
    {
      "type": "webhook",
      // Configuration
    },
    {
      "type": "function",
      // Configuration
    },
    {
      "type": "set",
      // Configuration
    },
    {
      "type": "httpRequest",
      // Configuration
    }
  ]
}
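The four-node outline can also be built and sanity-checked in code before it is turned into a real workflow. This is a sketch using the simplified shape from this thread: the `Workflow` and `WorkflowNode` types, the parameter names, and the helper functions are all illustrative assumptions. An actual n8n export uses longer node type names and carries extra fields such as node names, positions, and connections.

```typescript
type NodeType = "webhook" | "function" | "set" | "httpRequest";

interface WorkflowNode {
  type: NodeType;
  parameters: Record<string, unknown>;
}

interface Workflow {
  nodes: WorkflowNode[];
}

// Build the four-node outline; every parameter name and value here is illustrative.
function buildTranslationWorkflow(appwriteFunctionUrl: string): Workflow {
  return {
    nodes: [
      { type: "webhook", parameters: { path: "start-translation" } },
      { type: "function", parameters: { note: "SeamlessM4T-related logic goes here" } },
      { type: "set", parameters: { status: "processing" } },
      { type: "httpRequest", parameters: { url: appwriteFunctionUrl, method: "POST" } },
    ],
  };
}

// Report which of the expected node types are absent from a workflow.
function missingNodeTypes(workflow: Workflow): NodeType[] {
  const required: NodeType[] = ["webhook", "function", "set", "httpRequest"];
  const present = new Set(workflow.nodes.map((node) => node.type));
  return required.filter((type) => !present.has(type));
}
```

A check like `missingNodeTypes` is cheap to run in a unit test, so an incomplete outline is caught before anyone opens the n8n editor.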
Integrate with HQPlanUI
* Use Appwrite SDK and n8n Webhook URL to integrate with 'hqplanUI-master'.
In ‘hqplanUI-master’
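// Call the Appwrite function for text translation
// (`functions` is an Appwrite `Functions` instance; 'textTranslation' is the function ID)
const execution = await functions.createExecution('textTranslation', JSON.stringify(data));
const translatedText = execution.responseBody; // field name on recent SDK versions
// Trigger the n8n workflow for audio translation
const audioResponse = await fetch(n8nWebhookURL, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(data)
});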
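The webhook call can be wrapped in a small helper that checks the HTTP status and parses the body, so the UI code gets either data or an error. This is a sketch: `triggerWorkflow`, `FetchLike`, and the assumption that the workflow replies with JSON are illustrative, not something this thread or n8n guarantees.

```typescript
// Minimal structural types so the helper works with the global fetch or a test double.
interface HttpResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<HttpResponseLike>;

// POST a JSON payload to the webhook and return the parsed JSON reply.
async function triggerWorkflow(
  fetchImpl: FetchLike,
  webhookUrl: string,
  payload: unknown,
): Promise<unknown> {
  const response = await fetchImpl(webhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  if (!response.ok) {
    throw new Error(`n8n webhook responded with status ${response.status}`);
  }
  return response.json(); // assumes the workflow is configured to reply with JSON
}
```

In the browser the first argument would simply be `fetch`; passing it in explicitly keeps the helper easy to unit test.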
IMPLEMENTATION
Setup Appwrite and n8n
* Appwrite: Follow the [official documentation](https://appwrite.io/docs/getting-started) to set up Appwrite on your server.
* n8n: You can quickly get n8n running using its [Docker image](https://docs.n8n.io/getting-started/installation.html).
Design High-Level Architecture
* Use a drawing tool or whiteboard session to sketch the flow between Appwrite, n8n, and your main application.
Implement Appwrite Functions
* Go to your Appwrite console, navigate to the Functions section, and create new functions for text and audio translation.
* Deploy these functions using the Appwrite SDK or CLI.
// Appwrite Function to translate text
const translateText = async (text, sourceLang, targetLang) => {
  // Use SeamlessM4T API or library here
  return translatedText;
};
Implement n8n Workflow
* Open the n8n interface and create a new workflow.
* Add a Webhook node to trigger the workflow.
* Add Function nodes to handle specific logic.
// n8n Workflow JSON
{
  "nodes": [
    // Webhook and Function nodes here
  ]
}
Integrate Agent Protocol
* Within Appwrite functions and n8n Function nodes, incorporate the Agent Protocol for secure data transfer.
Integrate with HQPlanUI
* Use the Appwrite JavaScript SDK and n8n webhook URLs to make function calls.
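// Inside your React component in hqplanUI-master
// (`functions` is an Appwrite `Functions` instance created from your `Client`)
const execution = await functions.createExecution("textTranslate", JSON.stringify(textData));
const translatedText = execution.responseBody; // field name on recent SDK versions
const audioResponse = await fetch(n8nWebhookURL, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(audioData)
});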
Testing and Deployment
* Run unit tests to validate the functions and workflow.
* Deploy the changes and monitor for any issues.
HIGH LEVEL CODE DESIGN:
// 1. Appwrite Function for Text Translation (Deploy this using Appwrite Console)
const translateText = async (text: string, sourceLang: string, targetLang: string) => {
  // Use SeamlessM4T API or library here
  const translatedText = ''; // placeholder until the SeamlessM4T call is wired in
  return translatedText;
};
// 2. Appwrite Function for Audio Translation (Deploy this using Appwrite Console)
const translateAudio = async (audioData: Buffer, sourceLang: string, targetLang: string) => {
  // Use SeamlessM4T API or library here
  const translatedAudio = Buffer.alloc(0); // placeholder until the SeamlessM4T call is wired in
  return translatedAudio;
};
// 3. n8n Workflow (JSON representation; Import this into n8n)
const n8nWorkflow = {
  "nodes": [
    {
      "type": "webhook",
      "parameters": {
        "url": "start-translation",
        "responseCode": 200
      }
    },
    {
      "type": "function",
      "parameters": {
        "functionCode": `// Add SeamlessM4T logic here`
      }
    }
  ]
};
// 4. Integration with 'hqplanUI-master' (TypeScript code inside your React component)
import { Client, Functions } from 'appwrite'; // Import Appwrite Web SDK
// Initialize Appwrite (both values below are placeholders)
const client = new Client()
  .setEndpoint('YOUR_APPWRITE_ENDPOINT')
  .setProject('YOUR_APPWRITE_PROJECT_ID');
const functions = new Functions(client);
const textData = {
  text: "Hello, world!",
  sourceLang: "en",
  targetLang: "es"
};
const audioData = {
  audio: audioBuffer,
  sourceLang: "en",
  targetLang: "es"
};
// Call Appwrite function for text translation ("textTranslate" is the function ID)
const execution = await functions.createExecution("textTranslate", JSON.stringify(textData));
const translatedText = execution.responseBody; // field name on recent SDK versions
// Trigger n8n workflow for audio translation
const audioResponse = await fetch("YOUR_N8N_WEBHOOK_URL", {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(audioData)
});
The Appwrite Function can also serve as just a trigger (start), then return the documentID for the start. This way n8n can update the status of the workflow dynamically.
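The trigger pattern can be sketched as follows: the function records a "started" document, hands its ID to the workflow, and returns the ID so the caller can poll while the workflow updates the status. The `JobStore` class is an in-memory stand-in for a database collection, and every name here is an assumption for illustration.

```typescript
type JobStatus = "started" | "processing" | "done" | "failed";

interface JobDocument {
  id: string;
  status: JobStatus;
}

// In-memory stand-in for a database collection (an assumption for illustration).
class JobStore {
  private docs = new Map<string, JobDocument>();
  private counter = 0;

  // Create a new document in the "started" state.
  create(): JobDocument {
    this.counter += 1;
    const doc: JobDocument = { id: `job-${this.counter}`, status: "started" };
    this.docs.set(doc.id, doc);
    return doc;
  }

  // What the workflow would call as it makes progress.
  setStatus(id: string, status: JobStatus): void {
    const doc = this.docs.get(id);
    if (!doc) throw new Error(`Unknown document: ${id}`);
    doc.status = status;
  }

  // What the UI would poll.
  getStatus(id: string): JobStatus | undefined {
    return this.docs.get(id)?.status;
  }
}

// Role of the trigger function: record the start, hand off, return the ID.
function startTranslation(
  store: JobStore,
  triggerWorkflow: (documentId: string) => void,
): string {
  const doc = store.create();
  triggerWorkflow(doc.id); // e.g. POST the ID to the n8n webhook
  return doc.id;
}
```

The design choice is that the long-running translation never blocks the function call: the caller gets an ID immediately and the status document carries the progress.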
Created the repo here -> https://github.com/KBVE/widget-seamless-m4t/
For now this is a temp repo, I am going to mess around with it a bit in the next couple hours.
Notes from Discord: Widget idea: a simple button added to the audio/video channel on channel creation that launches Seamless M4T as a serverless function. It can be as simple as an iframe embed of the Streamlit app that pops up when the button/widget is clicked. We start with that and can add complexity and improve the UI afterwards. Once the Streamlit UI works as a pop-up iframe inside the project application, we can focus on getting the audio/video data translated, since that will probably require more backend work. At least for the 24-hour hackathon presentation, we want a pop-up UI that we can record for a solid demo.
I will close this issue ticket out! Hopefully we get a solid place for this competition!
It sucks that we were not able to completely finish in time, but hey, at least it was worth a shot! I learned a bit more about Parcel 2 and the recent improvements it has made. The next updates would be pushing finished widgets into the KBVE.com repo via a patch-pull style, a bit similar to how Sweep does it.
Core Concept/Theory A clear and concise description of what the concept is. Ex. It would be cool if [...]
Idea
Utilize SeamlessM4T for building out language functionality for ATLAS/HQPlan
-OR-
Seamless M4T Widget/Discord Bot/ Chrome Extension
[Concept] : Seamless M4T - 24 Hour Hackathon
Hosted by 5-Dee Studios in partnership with KBVE - Official Team Page
Alternative Ideas Is there any other way this concept could be used?
Translate business communication across international clients and vendors. Wav2Lip could be used to simulate realistic, visually augmented translations; using voice-cloning AI, we could potentially turn it into an API (?)
Alternative Examples/Sources Are there any other references that you can provide? A buddy of mine made this for post-video processing and it went viral but Seamless M4T didn't come out yet. Here's a link to his API in action:
https://twitter.com/therealprady/status/1680645510103977987?s=20
Additional information Add any other context or examples of this concept here.
Main focus is to turn a project/presentation in. Once we lock in an idea, @8gratitude8 will set up the slides ahead of time