finos / waltz

Enterprise Information Service
https://waltz.finos.org
Apache License 2.0

Davante's application #7137

Closed J-coder118 closed 2 months ago

J-coder118 commented 2 months ago

As a senior AI/ML full-stack engineer, I have spent 10 years developing AI-driven applications and designing system architectures that keep streaming and consistency intact across the distributed components of an ML system.

I have solved challenges in building ML systems with advanced technology and have implemented MLOps/DevOps best practices.

In particular, I implemented an advanced chunking strategy and FEA architecture in a Graph RAG and router system that can choose between semantic search, Graph RAG, text-to-speech, and Elasticsearch.
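To make the router idea concrete, here is a minimal sketch of how a query could be dispatched to one of the four backends named above. Everything in it is hypothetical: the keyword rules, the function names (`choose_route`, `route`), and the stub handlers are assumptions for illustration only, not the routing logic of the actual system, which could just as well use a classifier or an LLM to decide.

```python
from typing import Callable, Dict

# Stub backends (hypothetical). Real handlers would call the actual services.
def semantic_search(query: str) -> str:
    return f"semantic_search:{query}"

def graph_rag(query: str) -> str:
    return f"graph_rag:{query}"

def text_to_speech(query: str) -> str:
    return f"text_to_speech:{query}"

def elasticsearch(query: str) -> str:
    return f"elasticsearch:{query}"

ROUTES: Dict[str, Callable[[str], str]] = {
    "semantic_search": semantic_search,
    "graph_rag": graph_rag,
    "text_to_speech": text_to_speech,
    "elasticsearch": elasticsearch,
}

def choose_route(query: str) -> str:
    """Pick a backend name using simple keyword rules (assumed, not from the source)."""
    q = query.lower()
    # Requests to speak text go to the speech backend.
    if q.startswith("say ") or "read aloud" in q:
        return "text_to_speech"
    # Questions about relationships between entities suit a graph-based retriever.
    if any(marker in q for marker in ("related to", "relationship", "connected")):
        return "graph_rag"
    # A quoted phrase suggests an exact keyword match.
    if '"' in query:
        return "elasticsearch"
    # Default: embedding-based semantic search.
    return "semantic_search"

def route(query: str) -> str:
    """Dispatch the query to the chosen backend and return its result."""
    return ROUTES[choose_route(query)](query)
```

Keeping the decision (`choose_route`) separate from the dispatch table (`ROUTES`) means the rule-based chooser could later be swapped for a learned one without touching the handlers.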

I also have hands-on experience developing AI agents with LangChain and LlamaIndex, and with downstream tasks such as building multimodal LLMs, computer vision, and fine-tuning LLMs using TensorFlow, Python, scikit-learn, and Python backend frameworks.

I have developed AI agents and bots for use cases including customized conversational AI solutions, domain-specific chatbots, text classification, language translation, question answering, personalized recommendation systems, and healthcare and marketing automation applications.

With a deep understanding of the Transformer architecture and speech recognition, I developed a fast and extensible multimodal LLM that can understand text as well as human speech, without the need for a separate Automatic Speech Recognition (ASR) stage. In this project, after studying prior work such as AudioLM, SeamlessM4T, Gazelle, and SpeechGPT, I extended Meta's Llama 3 model with a multimodal projector that converts audio directly into the high-dimensional space used by Llama 3. This direct coupling allows the model to respond much more quickly than systems that combine separate ASR and LLM components.

This multimodal LLM can also natively understand the paralinguistic cues of timing and emotion that are omnipresent in human speech.

I also worked on an advanced RAG system, improving its accuracy rate to 97% by integrating various techniques at each step, using Euclidean distance and cosine similarity. I used best-fit techniques for each agent use case, including Deep Lake, knowledge graphs, LangGraph with RAG, reranking, query expansion, self-query, and multi-vector retrievers...
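As an illustration of the two similarity measures mentioned above, the following sketch ranks a few toy document vectors against a query using both cosine similarity and Euclidean distance. The vectors, the document names, and the helper functions are made up for the example; a real RAG pipeline would use embeddings from a model and a vector store rather than hand-written lists.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_by_cosine(query, docs):
    """Document names ordered from most to least similar (higher cosine first)."""
    return sorted(docs, key=lambda name: cosine_similarity(query, docs[name]), reverse=True)

def rank_by_euclidean(query, docs):
    """Document names ordered from nearest to farthest (smaller distance first)."""
    return sorted(docs, key=lambda name: euclidean_distance(query, docs[name]))

# Toy 2-D "embeddings" (hypothetical data).
docs = {
    "doc_a": [1.0, 0.0],
    "doc_b": [10.0, 1.0],
    "doc_c": [0.0, 1.0],
}
query = [2.0, 0.0]

print(rank_by_cosine(query, docs))     # ['doc_a', 'doc_b', 'doc_c']
print(rank_by_euclidean(query, docs))  # ['doc_a', 'doc_c', 'doc_b']
```

The two rankings disagree on `doc_b` and `doc_c`: cosine similarity ignores vector length and rewards `doc_b` for pointing in nearly the same direction as the query, while Euclidean distance penalizes it for being far away in magnitude. This is one reason the choice of metric matters at the retrieval step.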

I closely monitor A/B tests and evaluate experiments on CUDA-enabled NVIDIA A100 GPUs using Comet ML’s experiment tracker, save the best model to Comet’s model registry, and deploy it as a REST API on Qwak (deployments run on Qwak and AWS).

Additionally, I have extensive experience in managing and maintaining server infrastructure, cloud architecture, and automation for high-performance AI applications, specializing in building microservices and cloud-based solutions on AWS and Azure.

Let's discuss in more detail.