kurianbenoy / Indic-Subtitler

Open source subtitling platform 💻 for transcribing and translating videos/audios in Indic languages.
https://indicsubtitler.in/
GNU General Public License v2.0
73 stars 12 forks

Multiple Language output #11

Open · snehas89 opened this issue 5 months ago

snehas89 commented 5 months ago

It seems that when uploading an audio or video in Kannada, only the initial portion gets transcribed accurately, while the subsequent part is transcribed in Tamil, as depicted in the provided screenshot. This likely arises due to a language detection error or a system glitch.

Screenshot from 2024-04-12 18-38-02

ronald0098 commented 5 months ago

Can you please tell me, step by step, how to run this project on my machine?

kurianbenoy commented 5 months ago

@snehas89 can you give us more details about the error? I have also noticed this issue. To be honest, it's not new to us.

It would help us if you could provide details such as:

  1. Input YouTube video:
  2. Input video language:
  3. Target video language:
  4. Did you use the advanced options to select any of our 4 models other than faster-whisper?

Also, @snehas89, would you like to help us with issue #9?

snehas89 commented 5 months ago

@kurianbenoy

  1. It was a local audio file which I uploaded
  2. Input video language: Kannada
  3. Target video language: Kannada
  4. Yes, I used 3 of the models provided, i.e., SeamlessM4T, Faster-Whisper, and WhisperX.

    Of the 3 models, Faster-Whisper gave a better result than the other two. My primary aim was to transcribe the audio file and look into translation later, but I was not able to proceed with it.

snehas89 commented 5 months ago

@ronald0098 I'm not sure there is any documentation on how to run the model locally. I used the Indic Subtitler web app at https://indicsubtitler.in/. @kurianbenoy can confirm whether this is right.

kurianbenoy commented 5 months ago

Can you share the local audio file here if possible? @snehas89

We haven't added documentation on how to run the model locally yet, but we can do that when we are free. I created issue #13 for this.

snehas89 commented 5 months ago

@kurianbenoy doc.zip

Please find the attached zip file, as GitHub doesn't support uploading audio formats.

kurianbenoy commented 5 months ago

Thanks @snehas89 for sharing the files as a zip. To be honest, we can't do much about this for the time being.

In the future, though, we might work on improving accuracy with LLMs, so that these multiple-language outputs don't happen.
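
Until then, one way to at least spot the problem described in this issue is to check which script each transcribed segment is written in. The following is only a minimal sketch and is not part of Indic-Subtitler: the function names `dominant_script` and `find_mismatched_segments` are hypothetical, and it assumes the transcript is already available as a list of plain-text segments. It only covers the two scripts mentioned here, using their Unicode block ranges.

```python
from collections import Counter

# Unicode block ranges for the two scripts mentioned in this issue.
SCRIPT_RANGES = {
    "Kannada": (0x0C80, 0x0CFF),
    "Tamil": (0x0B80, 0x0BFF),
}


def dominant_script(text):
    """Return the script with the most characters in `text`, or None if no
    character falls in a known range (e.g. Latin-only or empty text)."""
    counts = Counter()
    for ch in text:
        code_point = ord(ch)
        for name, (low, high) in SCRIPT_RANGES.items():
            if low <= code_point <= high:
                counts[name] += 1
                break
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def find_mismatched_segments(segments, expected_script):
    """Return the indices of segments whose dominant script is known and
    differs from `expected_script`."""
    mismatched = []
    for index, segment in enumerate(segments):
        script = dominant_script(segment)
        if script is not None and script != expected_script:
            mismatched.append(index)
    return mismatched


# Hypothetical transcript: the second segment has switched to Tamil script.
segments = ["ನಮಸ್ಕಾರ", "வணக்கம்", "hello"]
print(find_mismatched_segments(segments, "Kannada"))  # [1]
```

A check like this only flags suspicious segments; it does not fix the transcription, and segments with no Kannada or Tamil characters are deliberately left unflagged.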