tecwindow / SoundTranscriber

SoundTranscriber can be used to generate automatic transcription / automatic subtitles for audio/video files through a friendly graphical user interface.
https://tecwindow.net

Sound Transcriber

SoundTranscriber can be used to generate automatic transcription / automatic subtitles for audio/video files through a friendly graphical user interface. Developed by Mahmoud Atef, Ahmed Bakr, and Qais Alrefai from the TecWindow team.

Download:

Download Sound Transcriber version 1.4.4.

Sound Transcriber User Guide.

This user guide aims to provide you with a comprehensive understanding of Sound Transcriber and help you make the most of its features.

We highly recommend reading this guide to ensure optimal usage of the program.

Introduction to Sound Transcriber:

Sound Transcriber is an accessible audio-to-text conversion program designed to transcribe audio and video files. It also supports extracting subtitle files, among other features.

Developed by Mahmoud Atef, Ahmed Bakr, and Qais Alrefai from the TecWindow team.

Features:

Sound Transcriber offers the following features:

Planned features:

We have several planned features in the pipeline, including:

Supported services:

The software currently supports only online conversion, using Google's speech recognition, OpenAI's Whisper, and Meta's Wit.ai.

Important notes:

Please take note of the following important information:

Supported file extensions:

Sound Transcriber supports the following file extensions for conversion:

.mp3, .wav, .aac, .flac, .oga, .opus, .mp4, .avi, .mkv, .mov, .m4a, .ogg, .ram, .rm, .wma, .wmv, .3gp, .flv.
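As a rough illustration of the list above, the following Python sketch checks whether a file name ends in one of the listed extensions. The function name `is_supported` is hypothetical, and the case-insensitive comparison is an assumption; this is not how Sound Transcriber itself is known to perform the check.

```python
from pathlib import Path

# Extensions copied from the list in this guide.
SUPPORTED_EXTENSIONS = {
    ".mp3", ".wav", ".aac", ".flac", ".oga", ".opus", ".mp4", ".avi", ".mkv",
    ".mov", ".m4a", ".ogg", ".ram", ".rm", ".wma", ".wmv", ".3gp", ".flv",
}


def is_supported(path: str) -> bool:
    """Return True if the file name ends with one of the listed extensions.

    The comparison ignores letter case (an assumption, not documented behavior).
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ("lecture.mp3", "interview.MKV", "notes.txt"):
        print(name, is_supported(name))
```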

Obtaining API Keys:

Wit.ai:

If we were to include an API key within the program itself, it would likely be blocked after widespread usage by multiple users. Moreover, wit.ai provides distinct API keys for each language. This means that you need to create an application in the desired language and obtain its corresponding API key. Unfortunately, it is not feasible for us to gather API keys for all languages since they vary based on individual preferences. Therefore, we will provide you with instructions on how to obtain your own private API key. Although the following steps may appear extensive, they are straightforward and only need to be completed once.

To transcribe in another language, repeat these steps and create a new application with a different name to obtain an API key for that language. If you want to use multiple languages with Wit.ai, you need one API key per language.

Whisper:

Sound Transcriber supports transcription through the use of OpenAI's Whisper API keys, which are not available for free.

Pricing is based on the duration of the audio transcribed, and specific plans are not listed here.

To get detailed information about the limits and subscription options, please visit this page. Keep in mind that signing in and adding a payment method are at your own risk.

To obtain an API key for Sound Transcriber, go to the API key page and click on "Create new secret key." Copy the generated key and proceed to add it in the program settings as demonstrated later.

Sound Transcriber Interface:

Upon opening the program, you will find an edit box displaying the transcribed result. Use the tab key to navigate through the other options.

The "Language" box allows you to specify the language of the file you want to transcribe. Select the appropriate language using the arrow keys.

Click the "Start" button to initiate the conversion process.

Next, you will find the "Save As" button, which allows you to specify the output saving preferences.

Below that, there is a read-only edit box indicating the path or the link of the file to be transcribed.

Use the "Browse" button to locate and select the file you want to transcribe.

Additionally, you can utilize keyboard shortcuts, which will be explained later.

Please note that the order of items on the screen may differ when navigating with the Tab key.

Menus

The program includes several menus accessible by pressing the Alt key.

File:

Services:

This menu contains the names of the services available for conversion. You can select any of them.

Help:

Sound Transcriber Settings:

Similar to the NVDA screen reader settings, the Sound Transcriber settings are categorized into several sections, each containing various options. You can navigate between sections using the up and down arrows. Use the Tab and Shift+Tab keys to scroll through the options within the selected section.

General:

This section includes various program-wide options:

Save options:

The options in this section affect the saving functionality in the program's File menu.

If the AutoSave feature is disabled, the "Save" option in the "File" menu will perform the same function, saving files according to the specified extensions and path.

Google:

This section requires you to enter a secure API key in the provided text box named "Secret Key." You can click the edit button to modify the key. Additionally, you can adjust the segment duration by specifying the duration of each part of the file when using this service.

Note that the file needs to be divided into several segments for conversion. The maximum duration per segment for this service is one minute.

OpenAI:

Similar to the previous service, this section allows you to enter an API key. However, the maximum duration of each segment when using OpenAI is 30 seconds.

Wit.ai:

As Wit.ai separates languages based on API keys, this section allows you to combine languages as follows:

You will find a list of currently added languages.

Each language has a corresponding hidden edit field for the API key.

You can edit a key by removing the old one, pasting the new key, and then using the Edit button. Alternatively, you can use the Add button to add a new key.

Select the language matching your application in Wit.ai, paste the key, and click Add.

Repeat these steps for each language you intend to use. After obtaining a key from the Wit.ai site, return to the settings window to add it.

You can delete individual keys or all saved keys associated with this service using the provided buttons.

Choose audio format: This option specifies the file extension used for the segments when converting. Choose ogg or mp3 if your internet connection is slow.

Choosing wav splits the file quickly, but the resulting segments are larger.

Lastly, you can specify the duration of each file segment, ranging from 4 to 20 seconds. Choose the duration that yields the best results.
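To make the segment limits in this section concrete, here is a minimal Python sketch that computes the start and end times of each segment for a file of a given length. The limits are taken from this guide (60 seconds for Google, 30 seconds for OpenAI, 4 to 20 seconds for Wit.ai). The names `SEGMENT_LIMITS` and `segment_bounds` are hypothetical, and the program's real splitting logic is not documented here.

```python
from typing import List, Optional, Tuple

# (minimum, maximum) segment length in seconds, as stated in this guide.
# The guide gives a minimum only for Wit.ai, so the others use None.
SEGMENT_LIMITS = {
    "google": (None, 60),
    "openai": (None, 30),
    "wit": (4, 20),
}


def segment_bounds(
    total_seconds: float, segment_seconds: float, service: str
) -> List[Tuple[float, float]]:
    """Split [0, total_seconds] into consecutive segments of segment_seconds.

    The last segment may be shorter than the others.
    """
    minimum: Optional[float]
    minimum, maximum = SEGMENT_LIMITS[service]
    if segment_seconds <= 0 or segment_seconds > maximum:
        raise ValueError(f"{service}: segment must be at most {maximum} seconds")
    if minimum is not None and segment_seconds < minimum:
        raise ValueError(f"{service}: segment must be at least {minimum} seconds")

    total = float(total_seconds)
    bounds = []
    start = 0.0
    while start < total:
        end = min(start + segment_seconds, total)
        bounds.append((start, end))
        start = end
    return bounds


if __name__ == "__main__":
    # A 150-second file split for the Google service at the 60-second maximum.
    for start, end in segment_bounds(150, 60, "google"):
        print(f"{start:.0f}s to {end:.0f}s")
```

Shorter segments mean more requests to the service, while longer segments mean fewer but larger uploads, which is why the guide suggests trying different durations.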

Press OK when you have finished adjusting the settings.

Keyboard Shortcuts:

Sound Transcriber provides several keyboard shortcuts to enhance speed and ease of use.

How to Convert Files:

To convert files, open Sound Transcriber and either browse for the file by clicking "Browse" or use the shortcut Ctrl+O. Alternatively, you can copy the file from your device and paste it using Ctrl+V.

You can also utilize the option available in the context menu for supported files, the Send To menu in Windows, or simply drag and drop.

Alternatively, you can copy a video link from sites such as Facebook, Twitter (X), YouTube, and SoundCloud, among others, and paste it into the program.

Choose the desired language and service using the provided shortcuts or adjust them in the settings. Press "Start" or use the shortcut Ctrl+Enter to initiate the conversion.

Did you know? You can open Sound Transcriber by pressing Windows + R to open the Run dialog and then typing st.

Notes:

Report Bugs:

If you encounter any bug with Sound Transcriber, you can use the communication methods available in the "Contact Us" menu under the "Help" section. Provide a detailed explanation of the actions that led to the bug. We recommend sharing the Sound Transcriber.log file, which will assist us in understanding and resolving the bug more effectively.

Go to Settings > General and enable logging. Then repeat the steps that led to encountering the error. Don't forget to disable logging after sending the file. Note that keeping the option enabled may result in a large .log file. However, you can choose to keep the log enabled if you wish.

You can find the file in the following path:

AppData\Roaming\tecwindow\SoundTranscriber
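If you prefer to locate the log file from a script, the following Python sketch builds the full path. It assumes that the `APPDATA` environment variable points to your `AppData\Roaming` folder, as it normally does on Windows, and that the file is named `Sound Transcriber.log` as mentioned above. The function name `log_file_path` is hypothetical.

```python
import os
from pathlib import PureWindowsPath
from typing import Optional


def log_file_path(appdata: Optional[str] = None) -> PureWindowsPath:
    """Build the expected location of the Sound Transcriber log file.

    If appdata is not given, the APPDATA environment variable is used.
    """
    base = appdata or os.environ.get("APPDATA", "")
    return (
        PureWindowsPath(base)
        / "tecwindow"
        / "SoundTranscriber"
        / "Sound Transcriber.log"
    )


if __name__ == "__main__":
    print(log_file_path())
```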

Beta Updates:

Sound Transcriber offers a beta update system, allowing you to test new features and assist us in identifying bugs. While activating this feature is straightforward, there are some important points to consider:

To opt into beta updates, navigate to Settings > General, enable the "Include beta versions when checking for updates" option, and then search for updates.

If you wish to revert to stable versions, simply disable the same option and then download and install the latest stable version.

We extend our heartfelt gratitude to everyone who contributes to testing Sound Transcriber, finding bugs, and sharing their insights.

How to Translate:

While Sound Transcriber currently supports only a limited number of languages in its interface options and user guide, it can transcribe speech to text in a wide array of languages.

However, we warmly welcome anyone interested in translating the program into their native language.

Interface translation:

The translation of interface options primarily relies on .po files, which can be edited using the Poedit program. You can download Poedit from its official website, then navigate to the Sound Transcriber repository on GitHub or locate the program folder on your device. Next, locate the messages.pot file and open it in Poedit. From there, you can translate the strings into your preferred language, save the file (which will generate both a .po and a .mo file), and share these files with us.

Test translation:

Sound Transcriber can recognize and accommodate new translations, allowing you to test your translation before submitting it to us. To do this, navigate to the Languages folder, create a folder with the code for your language (the first two letters of the language), then create a subfolder named LC_MESSAGES, and place the .po and .mo files inside. Don't forget to name the files as SoundTranscriber.po and SoundTranscriber.mo.
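The folder layout described above matches the one that the standard gettext tooling expects. As a sketch, the following Python code builds the expected folder path and tries to load a compiled `SoundTranscriber.mo` file with Python's `gettext` module. It only checks that your files are in the right place and readable. It is not necessarily how Sound Transcriber loads translations, and the function names are hypothetical.

```python
import gettext
from pathlib import Path

DOMAIN = "SoundTranscriber"  # base name of the .po and .mo files


def translation_dir(languages_root: str, lang_code: str) -> Path:
    """Return <Languages>/<language code>/LC_MESSAGES for a language."""
    return Path(languages_root) / lang_code / "LC_MESSAGES"


def load_translation(languages_root: str, lang_code: str):
    """Load SoundTranscriber.mo for lang_code.

    With fallback=True, missing files yield a translation object that
    returns every string unchanged instead of raising an error.
    """
    return gettext.translation(
        DOMAIN, localedir=languages_root, languages=[lang_code], fallback=True
    )


if __name__ == "__main__":
    # "fr" is only an example language code.
    print(translation_dir("Languages", "fr") / f"{DOMAIN}.mo")
    translation = load_translation("Languages", "fr")
    print(translation.gettext("Start"))
```

If the second line printed is still the English text, either the string is untranslated or the `.mo` file was not found in the expected folder.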

Documentation translation:

While translation of the update log and user guide is not mandatory, you can translate them into your language using .pot files.

We use .md files to create .pot files, which we then send to translators. After that, we convert the .po files we receive from the translators back into .md files and then into .html files to include with the program.

This allows us to make corrections to any part of the manual and changelog, ensuring they are applied consistently across all supported languages. It also ensures that files in different languages remain in exactly the same format and structure.

To translate files, follow these steps:

Notes:

Sound Transcriber website:

While there is no official website for Sound Transcriber, you can access all necessary resources in the Sound Transcriber repository on GitHub. This repository contains translation files and the latest version of the program.

Note: Sound Transcriber is not open source at this time, and the repository does not contain the source code for the program.

Repository link.

Contact us:

If you are unable to access our contact list within Sound Transcriber, you can reach us via email using the following addresses:

Special thanks: