PlanetRead / PR-Repository

[DMP 2024]: Auto Subtitler for Indian Languages #2

Open arvind-planetread opened 4 months ago

arvind-planetread commented 4 months ago

Ticket Contents

As part of the BIRD initiative, we aim to create a tool that can speed up the adoption of Same Language Subtitling (SLS) among content producers across the entire country. This will ensure that 200M weak readers and 30M readers with accessibility needs get regular reading exposure through content with SLS.

The tool will create SRT files from a video file and its script text file. For now, we aim to support the following languages: Tamil, Telugu, and Kannada.

Goals & Mid-Point Milestone

Goal 1: Achieve 60% timing accuracy of SRT files in each of Tamil, Telugu, and Kannada.

Goal 2: Achieve 70% timing accuracy of SRT files in each of Tamil, Telugu, and Kannada.

Goal 3: Achieve 80% timing accuracy of SRT files in each of Tamil, Telugu, and Kannada.

Goal 4: Achieve 90% timing accuracy of SRT files in each of Tamil, Telugu, and Kannada.

The midpoint milestone will be the completion of Goal 1 and Goal 2.

Setup/Installation

No response

Expected Outcome

The input will be a video file and its script as a text file in UTF-8 encoding. The output will be an SRT file with a timecode for each line of the script.
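As a rough illustration of the output side only, the sketch below turns a list of timed script lines into SRT text. It assumes the start and end times have already been produced by some alignment step, which the ticket does not specify; the `Cue` class, the function names, and the sample timings are hypothetical and not part of any existing code in this repository.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One script line with its timing in seconds (hypothetical helper type)."""
    start: float
    end: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: list[Cue]) -> str:
    """Render cues as SRT text: index, time range, subtitle line, blank separator."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}\n"
            f"{cue.text}\n"
        )
    return "\n".join(blocks)


# Made-up timings for two Tamil script lines, for illustration only.
sample = [Cue(0.0, 2.5, "வணக்கம்"), Cue(2.5, 5.0, "நன்றி")]
srt_text = to_srt(sample)
```

To save the result, the text can be written with `open(path, "w", encoding="utf-8")` so that the Tamil, Telugu, or Kannada script lines survive unchanged.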

Acceptance Criteria

We will use the VLC media player to check the timing accuracy of the generated SRT file. This will also be used to verify the completion of the goals. We will use multiple video files to check that the tool is versatile.
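The ticket does not define how "timing accuracy" is computed, and the check described above is a manual one in VLC. Purely as an assumption, one way contributors could track progress between manual checks is to count the share of subtitle lines whose start time falls within a tolerance of a hand-timed reference. The function name and the 0.5-second tolerance below are hypothetical choices, not the mentor's criterion.

```python
def timing_accuracy(predicted_starts, reference_starts, tolerance=0.5):
    """Percentage of lines whose predicted start is within `tolerance` seconds
    of the reference start. This definition is an assumption, not the ticket's."""
    if len(predicted_starts) != len(reference_starts):
        raise ValueError("predicted and reference must have the same number of lines")
    if not reference_starts:
        return 0.0
    hits = sum(
        1
        for predicted, reference in zip(predicted_starts, reference_starts)
        if abs(predicted - reference) <= tolerance
    )
    return 100.0 * hits / len(reference_starts)


# Made-up start times in seconds for four script lines.
score = timing_accuracy([0.0, 2.6, 5.0, 9.0], [0.2, 2.5, 6.0, 9.4])
```

With these made-up numbers, three of the four lines fall within the tolerance, so the score is 75.0.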

Implementation Details

Python or any other technical stack.

Mockups/Wireframes

No response

Product Name

Auto Subtitler for Indian Languages

Organisation Name

Planet Read

Domain

Education

Tech Skills Needed

Machine Learning, Python

Mentor(s)

@arvind-planetread

Category

Accessibility, Machine Learning

Abinash-bit commented 3 months ago

Hi there @arvind-planetread, the project problem statement is pretty clear and doable within the timeframe of C4GT 2024. I am interested in working on this project.

My background: I am Abinash Mahapatra from Odisha University of Technology and Research, currently pursuing a BTech in Electrical Engineering (final year).

I have worked on NLP, Transformers, computer vision, and various CNN architectures such as LeNet, AlexNet, VGG16, VGG19, ResNet18, and ResNet50, and I am currently interning as a Machine Learning Developer @ZCLAP INC.

I have four and a half months of work experience in the AI industry, and I am currently adapting to the pace of this growing field and its remarkable research areas.

I will start working on Goal 1 and Goal 2, which will increase my chances of being selected for this project.

Could you share your Discord ID or email ID so that we can get in touch?

Sayanjones commented 3 months ago

Hey @arvind-planetread I'm excited to contribute to your SLS subtitler project! My skills in Machine Learning and Python align well with your needs.

I'm eager to discuss how my skills and LLMs can benefit the project. LLMs could be a great asset for script processing and content summarization, improving subtitle timing and conciseness. Future speech recognition integration could then lead to a fully automated system.

I'd love to discuss how I can contribute further. Could we schedule a meeting?

kansallipi commented 3 months ago

Hey @arvind-planetread, the Auto Subtitler for Indian Languages is a great initiative and I would love to contribute to it. Having previously worked with Trustin on their multilingual (Indian vernacular languages) NLP bots, I have a basic understanding of how we can go ahead with this implementation and would love to discuss these ideas with you!

ananya39mehta commented 3 months ago

Hello @arvind-planetread , I'm excited about the Auto Subtitler project for Indian languages and am eager to contribute my skills as an AI/ML Tech sophomore. With a background in machine learning and Python, I'm confident in my ability to assist in achieving the accuracy goals outlined, particularly in timing accuracy for SRT files in Tamil, Telugu, and Kannada.

I'm ready to jump in and work on Goal 1 and Goal 2 to advance the project within the specified timeframe. Please let me know how I can start contributing to this important initiative.

Looking forward to collaborating and making a meaningful impact!

brahmanshi commented 3 months ago

Hello sir @arvind-planetread, I am taking part in the DMP 2024 program and I want to work on your project under your guidance. Can you guide me on how to proceed further? I am Brahmanshi seam, a final-year BTech CSE AI student with skills in Python, AI and ML, data science, etc.

shreyasdeodhare commented 3 months ago

Hello @arvind-planetread,

I'm absolutely buzzing about the Auto Subtitler project for Indian languages and bursting with enthusiasm to contribute my skills! As an AI/ML enthusiast with hands-on experience in training and developing ML models, I'm itching to dive into this project headfirst. With a solid background in machine learning and Python, I'm confident I can help smash those accuracy goals right out of the park.

Just point me in the right direction, and I'm ready to hit the ground running! Let's collaborate and make a real difference together.

Looking forward to making waves with you!

arvind-planetread commented 3 months ago

Hi @Abinash-bit, @Sayanjones, @shreyasdeodhare Thanks for expressing your interest and enthusiasm in this project. :) Please email me (arvind[at]planetread.org) your resume and a brief description of how you would be the best person to execute this project, given your skill set, experience, etc. That will help me review candidates for this project. Thank you all.

arvind-planetread commented 3 months ago

Hi @kansallipi, @ananya39mehta and @brahmanshi We are happy to have your interest in this project. As I suggested in the comment above, please reach out via my email (arvind[at]planetread.org) to take this further. Looking forward to hearing from you all. Thank you all. :)

Abinash-bit commented 3 months ago

Hey there @arvind-planetread sir, I have mailed my resume along with a brief description and an example of how I would apply my skill set to this project. Please go through them.

Please note that the approach I sent is an example that takes videos (from online apps) as the video file and converts the spoken language into subtitles in a preferred language. I have also used a pretrained model.

A further improvement would be to take a video file and a text file as input, and then implement and train our own model.

Silent-ADARSH commented 3 months ago

Hello there @arvind-planetread, I am an AI enthusiast currently polishing my skills in this domain. Like many other seniors here, I would also like to contribute to this project, as I have some experience in NLP, text-to-speech, speech-to-text conversion, and the like.

KAMERAVAMSHI commented 3 months ago

Hey @arvind-planetread! I'm thrilled to join your SLS subtitler project and offer my expertise in Machine Learning and Python, which seem to fit perfectly with what you're looking for.

I'm eager to explore how my skills and LLMs can enhance the project. LLMs could significantly aid in script processing, content summarization, refining subtitle timing, and ensuring conciseness. Moreover, integrating future speech recognition technology could pave the way for a fully automated system.

I'm keen to discuss additional ways I can contribute. Can we set up a meeting to delve deeper into this?

arvind-planetread commented 3 months ago

Hi @KAMERAVAMSHI and @Silent-ADARSH Please write to me at arvind[at]planetread.org with a proposal that includes your resume, why you would be the best fit for this project, and a tentative project plan from your perspective. Thanks. :)

Abinash-bit commented 3 months ago

@arvind-planetread I just need to clarify something: do I need to submit a short proposal, or the final one that I will submit on Unstop?

Please go through the mail; I hope you will reply soon.

AbhimanyuSamagra commented 3 months ago

Do not ask process-related questions about how to apply or whom to contact in this ticket. The only questions allowed are about technical aspects of the project itself. If you want help with the process, refer to the instructions listed on Unstop; any further queries can be taken up on our Discord channel titled DMP queries.

bbk019238u commented 3 months ago

Hi @arvind-planetread, I am Bharath Kalyan, a pre-final year student at BITS Pilani.

I'm thrilled about the Auto Subtitler project for Indian languages and eager to lend my skills! With a strong background in CNNs, NLPs, ML and Python, I'm ready to dive in and help achieve your accuracy goals. Let's team up and make a real impact together!

pankaj8700 commented 3 months ago

Hi @arvind-planetread, I am a college student pursuing a BSc with a specialization in data analytics, and I also want to contribute to this project. I don't have any prior knowledge of open source, so please help me.

skddl007 commented 3 months ago

Hi @arvind-planetread! I'm excited to be a part of your SLS subtitler project and offer my expertise in Machine Learning and Python. It seems like my skills align perfectly with what you're looking for.

I'm eager to see how I can use LLMs to improve the project, such as in script processing, content summarization, refining subtitle timing, and ensuring conciseness. Additionally, integrating speech recognition technology in the future could make the system fully automated.

I also have experience in newspaper text analysis for Indian languages, using NLP and Machine Learning models to extract insights.

I'm looking forward to discussing more ways I can contribute. Could we schedule a meeting to explore this further?

kartikeshwar156 commented 3 months ago

Hello @arvind-planetread, I am Kartikeshwar, a final year student at National Institute of Technology Tiruchirappalli.

I’m incredibly excited about the Auto Subtitler initiative for Indian languages and can’t wait to contribute my expertise! Possessing skills in NLPs, ML, and Python, I’m prepared to jump in and assist in reaching your precision objectives.

I have mailed my resume to your email ID.

Prateek-sinha-08 commented 3 months ago

I would love to contribute to this machine learning project to create subtitles for Indian languages (Tamil, Telugu, Kannada). It will not only help me put my ML knowledge to better use but can also expose me to great opportunities.

arvind-planetread commented 3 months ago

Hi @pankaj8700 , @skddl007 , @Prateek-sinha-08

Here is an excellent article for proposal creation. Please refer to it and create your version of the proposal. Then email it to arvind[at]planetread.org

Thanks to you all.

ronitkumar98 commented 3 months ago

Hi @arvind-planetread, I am Ronit Kumar from Techno International New Town, majoring in Artificial Intelligence and Machine Learning and currently in the third year of my studies. I have worked on LLMs and computer vision and would like to contribute to building this project. I believe my previous work experience in NLP and computer vision will aid me in the project, as well as helping me explore new opportunities.

Ishu2412 commented 3 months ago

Hi @arvind-planetread , I'm Shrayash Shukla, a third-year BTech student learning AI. My focus lies in NLP, and I've garnered experience through winning an international hackathon. Additionally, I'm currently involved in developing an AI startup tailored for children. I'm eager to offer my expertise and contribute to this project.

dudipalajothsna commented 2 months ago

Hello @arvind-planetread, I am Jothsna D, a pre-final year student at G Narayanamma Institute of Technology and Science. I am very excited to contribute to this project. My skills involve Machine Learning (NLP). I have participated in a few hackathons and I believe that hackathon experience will help me here. Thank you.

Mubashirshariq commented 2 months ago

@arvind-planetread Can we use any open-source LLMs for this job? Please reply, as I would like to complete the work up to Goal 2 as soon as possible.

kartikeshwar156 commented 2 months ago

Hey @arvind-planetread, I am creating a tool for this ticket, but I need enough data to test it. Could you suggest some websites where I can find the required data, such as videos in Telugu with their corresponding scripts, or could you provide the necessary data resources for this tool?

druv9213 commented 2 months ago

Hey @arvind-planetread, Dhruv this side, a 1st-year BTech student. My skills are Java, Python, and machine learning, and I am keen to contribute to this project. I will work on this project with dedication.

arvind-planetread commented 2 months ago

@druv9213 Please submit your proposal to UnStop. Then I can review during May and get back on this. Thank you. 👍

Sufia-ahmad commented 2 months ago

Sir, I want to do this project, as a fresher.

arvind-planetread commented 2 months ago

@Sufia-ahmad Please email your resume to arvind[at]planetread.org Thank you.

AbhimanyuSamagra commented 2 months ago

Do not ask process-related questions about how to apply or whom to contact in this ticket. The only questions allowed are about technical aspects of the project itself. If you want help with the process, refer to the instructions listed on Unstop; any further queries can be taken up on our Discord channel titled DMP queries. Here's a Video Tutorial on how to submit a proposal for a project.

Ranit-Bandyopadhyay commented 2 months ago

Hi, I am interested to work on the mentioned project

sahana-9314 commented 2 months ago

Hi @arvind-planetread, I wish to work on this project. Kindly let me know where to raise a PR, in my forked repository or in the main one.

PriyalPB commented 2 months ago

Hi @arvind-planetread! I'm a third year student from Cummins Pune.

I'm thrilled to join your Auto Subtitler project and offer my skill set, which includes a strong background in machine learning, deep learning (CNN), NLP, and Python, and seems to fit perfectly with what you're looking for. I'm excited to explore how my expertise can elevate the project. Furthermore, the integration of upcoming speech recognition advancements could lead to a seamlessly automated system. I'm eager to discuss further avenues where I can make meaningful contributions. Could we schedule a meeting to delve into this in more detail?

KHUSHIPACHAURI commented 2 months ago

hello @arvind-planetread

Auto Subtitler for Indian Languages

My name is Khushi Pachauri. I am thrilled to contribute to this exciting project aimed at developing an AI-powered Auto Subtitler for Indian languages. As a passionate individual with a strong background in natural language processing, speech technologies, and multimedia applications, I believe I possess the skills and expertise to make valuable contributions.

My proficiency in Python, machine learning libraries, and speech processing frameworks positions me well to tackle the challenges associated with speech recognition, language translation, and caption timing adjustment. I am particularly intrigued by the prospect of working on cutting-edge techniques like transfer learning, data augmentation, and dynamic programming algorithms to optimize the system's performance. Throughout my academic and professional journey, I have gained experience in working with diverse datasets, preprocessing techniques, and model training methodologies, which will undoubtedly prove invaluable in the data preparation and iterative improvement phases of this project.

Sir let's embark on this exciting journey together, unlocking the power of multimedia content and fostering a more inclusive and culturally rich digital landscape.

arvind-planetread commented 2 months ago

@PriyalPB @KHUSHIPACHAURI Send your resume to arvind[at]planetread.org Thanks for your interest in the project. 👍

PriyalPB commented 2 months ago

sir can you please provide your exact mail ID

cherrymekala commented 2 months ago

Hi @arvind-planetread, I was amazed to see this project, as this is what I have been looking for over the past month. I'm a third-year student, very passionate about ML and Python. I have done some projects in them, like face detection, visual perception, etc. I think my projects and enthusiasm are useful for this kind of project. I hope I will be able to work with all of you amazing people. Looking forward to discussing your expectations of me regarding this project.

abhishekjain96 commented 2 months ago

Hey there @arvind-planetread,

I'm thrilled about the Auto Subtitler project for Indian languages and can't wait to get involved. As an AI/ML Tech sophomore, I bring a solid background in machine learning and Python to the table. I'm particularly confident in my ability to help achieve our accuracy targets, especially in terms of timing accuracy for SRT files in Tamil, Telugu, and Kannada.

Count me in for Goal 1 and Goal 2—I'm all set to dive in and push the project forward within our timeframe. Just let me know how I can start pitching in to this important cause.

Excited to collaborate and make a real difference!

sanjanhaa-yenugu commented 2 months ago

Hello @arvind-planetread

Development of Auto Subtitler for Indian Languages

I'm Sanjanhaa, and I'm genuinely excited about this project. With a solid foundation in Python and machine learning, I'm eager to contribute to our goal of developing an AI-powered Auto Subtitler for Indian languages.

Furthermore, I've previously worked on a project where I built a chatbot using NLP, machine learning, and Python. This experience has honed my skills in natural language processing and machine learning, making me confident in my ability to make meaningful contributions to our project.

Sir,Let's team up on this exhilarating journey, combining our skills to unlock the vast potential of multimedia content and cultivate a digital environment that is inclusive and culturally diverse.

ayushi361 commented 2 months ago

Hello @arvind-planetread

Ayushi Mishra here from NIT Rourkela. I have experience working with machine learning and AI in Python, and after understanding the project I raised a PR for it.

I would request you to go through it and share your valuable feedback and suggestions so that I can work on it further. I'm very eager and excited to work on this project and look forward to your earliest response.

Thank you. Regards

Nivedita-MN18 commented 2 months ago

I came across the BIRD initiative and the project focused on creating an Auto Subtitler for Indian Languages, and I am incredibly excited about the impact it could have on improving accessibility to content for millions of individuals.

I am writing to express my keen interest in joining this project and contributing to its success. The goals outlined align closely with my skills and interests, particularly in machine learning and Python development.

I have experience working on similar projects in the past and am confident in my ability to contribute meaningfully to achieving the stated goals. Additionally, I am enthusiastic about the opportunity to work with a dedicated team and under your mentorship.

Could you please provide guidance on the next steps for joining the project? I am eager to learn more about the specifics and to begin contributing in any way I can.

Thank you for considering my application. I look forward to the possibility of working together towards a more inclusive and accessible future.

CodeSage4 commented 2 months ago

Hi @arvind-planetread! Bhakta Varun here, an NLP and speech processing enthusiast ready to tackle the Indian Auto Subtitler challenge! My expertise in Python, ML libraries, and frameworks makes me a strong fit. Eager to apply transfer learning, data augmentation, and optimization techniques for top-notch subtitles. Excited to leverage my experience in data prep and model training for project success. Let's build a more inclusive digital world together!

19bhartisingh commented 2 months ago

Hi @arvind-planetread!

My name is Bharti Singh, and I'm eager to contribute to the Indian Auto Subtitler challenge. My expertise in Python, machine learning libraries, and frameworks makes me a strong candidate for this project. I also have 45 days of experience from an ML internship.

I'm particularly interested in applying transfer learning, data augmentation, and optimization techniques to ensure the highest quality subtitles. My experience in data preparation and model training will be crucial for project success.

I'm excited to collaborate with you and contribute to building a more inclusive digital world by making automotive content accessible to everyone.

anushkasaxena07 commented 2 months ago

Hello @arvind-planetread

I'm deeply interested in this project as it aligns perfectly with my passion for natural language processing and machine learning. Same Language Subtitling (SLS) is a powerful tool for enhancing reading exposure, and I'm excited about contributing to its adoption. My experience in machine learning and Python, along with projects like foul language detection models, has prepared me well for this endeavor. I'm motivated to be part of a project that has the potential to make a significant impact on accessibility and literacy. Apart from this, I am also working on a research paper on a foul language detection model.

WORK DONE SO FAR

I have made significant progress on the project. I created three issues to address key areas:

Issue : Explanation of Dataset and Output Folder Paths & Consistent File Name Handling
Issue : Dynamic Frame Rate Calculation
Issue : Improved Error Handling

These issues aim to enhance the tool's functionality and usability.

After understanding the project I have raised a PR for the project.

I would request you to go through it and share your valuable feedback and suggestions so that I can work on it further. I'm very eager and excited to work on this project and look forward to your earliest response.

Thank you. Regards

Jatayu-u commented 2 months ago

Hello @arvind-planetread,

I hope this message finds you well. My name is Shaurya Vats, and I am currently in my final year pursuing my undergraduate studies at the Indian Institute of Technology, Kharagpur (IIT Kharagpur). I am reaching out to express my interest in the project you've shared.

With a background in generative AI, I have authored three research papers and developed various applications in the field. After reviewing the details of the project, I am confident that I possess the necessary skills and ideas to make significant contributions to its success. I have also submitted a proposal on Unstop regarding my approach to the project.

I am eager to have the opportunity to contribute to this open-source project at the earliest convenience. Your consideration of my application is greatly appreciated.

Thank you for dedicating your time to review my message.

Best regards, Shaurya Vats

sreetejadaggu commented 2 months ago

Hello @arvind-planetread, I'm here to express my sincere interest in participating in this project. After carefully reviewing the project details and objectives, I am enthusiastic about the opportunity to contribute my skills and expertise towards its successful completion. I have 2 years of experience working on machine learning projects. I am confident that my involvement could add significant value to the project, and I am committed to delivering results of the highest quality. If there are any additional steps I need to take, or if you require further information from my end, please let me know. I am available for discussions or meetings at your convenience. Warm regards, Sreeteja Daggu

Sufia-ahmad commented 2 months ago

I am Sufia, and I graduated with a B.Tech in CSE. I am a data scientist and also a full-stack developer, but I am a fresher: I have only completed 6 months of training in the field and one month of internship, so I want to do this internship.

Priyanshuthapliyal2005 commented 2 months ago

Hello @arvind-planetread sir, I have sent you my proposal and resume. Email: priyanshuthapliyal2005@gmail.com. Sir, please review my proposal and suggest ways to improve it.

vroy651 commented 2 months ago

Hello @arvind-planetread, my name is Vishal Roy. I'm currently in the final year of my BTech, pursuing a computer science degree at Delhi Technological University. I have adequate knowledge of machine learning and deep learning techniques to solve real-world problems, and I have done a lot of projects in the fields of computer vision and NLP. I'm very passionate about working in AI and have a good grasp of modern NLP techniques. I have also worked on multimodal NLP projects such as misinformation and disinformation detection. I believe I have good potential to work on this open-source project; if possible, please consider me for it.

Regards, Vishal Roy

arvind-planetread commented 2 months ago

@vroy651 Thanks, I will review your proposal and, if you are selected, I will notify you.