charslab / TranscriberBot

TranscriberBot for Telegram
https://t.me/transcriber_bot
GNU General Public License v3.0

[feature request] Download transcription as text file (raw or in a subtitle format) #15

Open turicas opened 3 years ago

turicas commented 3 years ago

I had this idea during this use case:

  1. I needed to transcribe a 1-hour-long interview in Brazilian Portuguese (the file was 80MB+ and was in M4A format)
  2. I split the original file into 5 OGG parts using ffmpeg (~8MB each part)
  3. I sent the files to the bot, received lots of 4k-chars messages in reply and copied to a text editor (this was boring)
  4. I found some errors when reading the text in the editor, but it was hard to find the error chunk in the audio file (so I could listen and fix it manually)

Being able to download the transcription as a text file would solve the problem in item 3. Using a subtitle file format (like srt) would help a lot with item 4, since each chunk of text would carry its position in the audio. Attaching the file could be triggered automatically for audio longer than 1 minute.
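To make the srt idea concrete, here is a minimal sketch of how timestamped chunks could be turned into SubRip text. It assumes the transcription step can provide `(start_seconds, end_seconds, text)` tuples, which is not something the bot is known to expose today; the names `srt_timestamp` and `to_srt` are hypothetical.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments) -> str:
    """Build SRT text from (start, end, text) tuples (an assumed input shape)."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        # Each cue: counter line, time range line, text, then a blank line.
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    print(to_srt([(0.0, 2.5, "Olá"), (2.5, 5.0, "mundo")]))
```

With the timestamps in the file, a wrong chunk found while proofreading can be located directly in the audio.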

I'm willing to implement this feature if the maintainers accept the proposal.

carloalbertobarbano commented 3 years ago

I think having the option to generate a txt/subtitle file for longer audio files is great. However, I would let users choose which mode they prefer (e.g. with a command /mode <message/text/subtitle>).
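As a rough sketch of how the /mode argument could be validated, independent of any Telegram library: the function name `parse_mode_command`, the default of `message`, and the choice to keep the current mode on invalid input are all assumptions made for illustration, not the bot's actual behaviour.

```python
VALID_MODES = ("message", "text", "subtitle")


def parse_mode_command(command_text: str, current_mode: str = "message") -> str:
    """Return the mode requested by a '/mode <mode>' command.

    Falls back to current_mode when the command is malformed or the
    requested mode is not one of VALID_MODES.
    """
    parts = command_text.strip().split()
    if len(parts) != 2:
        return current_mode
    # In group chats a command may arrive as '/mode@botname'.
    command = parts[0].split("@", 1)[0]
    if command != "/mode":
        return current_mode
    requested = parts[1].lower()
    return requested if requested in VALID_MODES else current_mode
```

For example, `parse_mode_command("/mode subtitle")` returns `"subtitle"`, while an unknown value such as `/mode pdf` leaves the current mode unchanged.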

From my side it is okay. If @stefanodelbosco does not have any issue (I'm guessing not), you can definitely work on it! 👍

stefanodelbosco commented 3 years ago

Hi, I like the idea of the '/mode' command proposed by @carloalbertobarbano (message/text/subtitle) 👍 The result files should be generated in the "/data" directory and deleted as soon as they are sent.
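The "generate in /data, delete once sent" behaviour could look roughly like the sketch below. The `send` callback stands in for whatever the bot uses to upload a document, and `send_and_delete` is a hypothetical helper; only the "/data" directory comes from the comment above.

```python
import os
import tempfile
from typing import Callable


def send_and_delete(
    content: str,
    send: Callable[[str], None],
    data_dir: str = "/data",
    suffix: str = ".txt",
) -> str:
    """Write content to a unique file in data_dir, pass its path to send,
    then remove the file whether or not sending succeeded."""
    os.makedirs(data_dir, exist_ok=True)
    fd, path = tempfile.mkstemp(suffix=suffix, dir=data_dir)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(content)
        send(path)  # assumed to block until the upload has finished
    finally:
        os.remove(path)
    return path
```

Using `try`/`finally` means a failed upload does not leave transcriptions behind on disk, and `mkstemp` avoids name clashes when several users request files at once.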

Remember that bots can currently send files of any type up to 50 MB in size (using https://api.telegram.org).

In the future, TranscriberBot may use a custom Bot API server (https://github.com/tdlib/telegram-bot-api). With a custom Bot API server you can upload files up to 2000 MB.

It's OK for me, you can work on it! 👍