MotazSabri / Hanami-release

Live translator that captures any audio that comes from a WINDOWS speaker or microphone and translates it to the desired language.

Save #17

Closed rappc87 closed 2 months ago

rappc87 commented 4 months ago

I would like to make some suggestions. If all this can be done, it will be a super app.

1 - We should be able to adjust the size of the translation screen.
2 - It should keep translating in the background and appending new text to the end while the user looks at the history of translated words.
3 - We should be able to look at old translated words by scrolling up and down, not left and right.
4 - The latest translated words should be in a large font and the old ones in a small font.
5 - Instead of Google Translate, the app should use AI-supported translation such as DeepL.
6 - The last selected source language and target language should be remembered.

MotazSabri commented 4 months ago

Thank you for your valuable feedback, @rappc87. I appreciate your suggestions, and I'm pleased to inform you that many of them are addressed in the latest version of Hanami.

Point 1:

As referenced in issue #16, users can adjust the application size and font size via the HanamiGUI\Assets\Interface_structure.json file. For your reference:

"app": {
    "sub_translationFont": 11,  // Sub Text Fonts: Modify the fonts used for subtext in Mixture Mode and ConTrans Mode.
    "translationFont": 12,      // Text Fonts: Adjust the fonts used for the main application services, such as translation and transcriptions.
    "gpt_font": 10,             // GPT Window Font: Change the font in the GPT window, which is smaller than the main app screen and uses a smaller font by default.
    "default_src_language": "Japanese ☆",  // Set the default source language to one of the supported languages (See project page). If an unsupported language is set, Japanese will be used as the default.
    "default_tgt_language": "English ☆",   // Set the default target language to one of the supported languages. If an unsupported language is set, English will be used as the default.
    "sample_rate": 8000,        // Sample Rate: Control the audio quality for transcription and translation services. A higher value means better results but at a slower rate.
    "Compact": {
        "w": 545,
        "h": 185                // Application Resolution: Manage the app's height when the top controls are not visible.
    },
    "Extended": {
        "w": 545,
        "h": 205                // Application Resolution: Manage the app's height when the top controls are visible.
    }
},
"GPT_window": {
    "Extended": {
        "w": 321,               // GPT window Resolution: Manage the GPT window width regardless of application control visibility.
        "h": 500                // GPT window Resolution: Manage the GPT window height regardless of application control visibility.
    }
}
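
The same values could also be changed from a script instead of by hand. The following is only a minimal sketch, assuming the file on disk is plain JSON (the `//` comments above read as annotations, and Python's `json` module would reject them) and that the key names match the snippet above. The function name `update_interface_settings` is hypothetical and not part of Hanami.

```python
import json
from pathlib import Path


def update_interface_settings(path, **changes):
    """Update keys under the "app" section of a settings file and write it back.

    Assumes the file is valid JSON with a top-level "app" object, as in the
    snippet above. Unknown keys are rejected rather than silently added.
    """
    path = Path(path)
    settings = json.loads(path.read_text(encoding="utf-8"))
    app = settings["app"]
    for key, value in changes.items():
        if key not in app:
            raise KeyError(f"unknown setting: {key}")
        app[key] = value
    # ensure_ascii=False keeps characters such as the star in language names readable.
    path.write_text(
        json.dumps(settings, indent=4, ensure_ascii=False), encoding="utf-8"
    )
    return settings
```

For example, `update_interface_settings(r"HanamiGUI\Assets\Interface_structure.json", translationFont=14, sub_translationFont=12)` would enlarge both text sizes, provided the assumptions above hold for your installation.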

Point 2:

The app continues translating in the background while in navigation mode, appending the latest text so you can view it via the navigation controls. Navigating pauses the automatic transition but does not halt backend services.
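
The behavior described here can be illustrated with a small history buffer. This is a sketch of the idea only, not Hanami's actual code; the class and method names (`TranslationHistory`, `back`, `resume`, and so on) are hypothetical.

```python
class TranslationHistory:
    """Stores every translated line and tracks which one is on screen."""

    def __init__(self):
        self.lines = []
        self.position = -1       # index of the line currently shown
        self.navigating = False  # True while the user browses older lines

    def append(self, text):
        # New translations are always stored, even during navigation.
        self.lines.append(text)
        if not self.navigating:
            # Automatic transition: jump to the newest line.
            self.position = len(self.lines) - 1

    def back(self):
        if not self.lines:
            return
        self.navigating = True  # pauses the automatic transition
        self.position = max(0, self.position - 1)

    def forward(self):
        if self.position < len(self.lines) - 1:
            self.position += 1

    def resume(self):
        # Leave navigation mode and return to the newest line.
        self.navigating = False
        self.position = len(self.lines) - 1

    def current(self):
        return self.lines[self.position] if self.lines else ""
```

With this design, a line appended while the user is navigating does not move the view, but it is reachable with `forward()` or shown immediately after `resume()`.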

Point 3:

Thank you for your suggestion. The app's horizontal layout was designed based on user feedback, where sideways navigation was found to be more suitable for our design.

Point 4:

As mentioned in point one, both the ConTrans and Mixture services use two text sizes: old and new. These font sizes are adjustable via the Interface_structure.json file.

Point 5:

The application uses DeepL for languages marked with a ☆, indicating superior translation quality. Languages without a ☆ use Google Translate. We continually monitor and update the translation methods as better quality options become available.
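
As a rough illustration of the rule above, a language label could be mapped to a translation service as follows. This is an assumption-laden sketch: the function name `pick_backend` and the return values are hypothetical, and it looks at a single label only, since the thread does not say how source and target labels are combined.

```python
STAR = "☆"


def pick_backend(language_label):
    """Return "deepl" for labels marked with a star, otherwise "google"."""
    # Labels are assumed to end with the star, as in "Japanese ☆".
    if language_label.strip().endswith(STAR):
        return "deepl"
    return "google"
```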

Point 6:

A configuration has been added to the Interface_structure.json file to remember the default language settings. Ensure the language value is set correctly as shown on the project page.
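
The fallback described for `default_src_language` and `default_tgt_language` can be sketched as below. The function name and the `supported` set passed in are hypothetical, and the fallback labels "Japanese ☆" and "English ☆" are assumed to match the labels used in the configuration snippet; the real list of supported languages is on the project page.

```python
def resolve_default_languages(app_settings, supported):
    """Return (source, target), falling back when a value is not supported."""
    src = app_settings.get("default_src_language")
    tgt = app_settings.get("default_tgt_language")
    if src not in supported:
        src = "Japanese ☆"  # fallback source language (label assumed)
    if tgt not in supported:
        tgt = "English ☆"   # fallback target language (label assumed)
    return src, tgt
```

Because the comparison is an exact string match in this sketch, a value such as "japanese" or "Japanese" without the star would fall back to the default, which is one reason to copy the label exactly as listed.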

I hope this addresses all the points you mentioned. Thank you again for your feedback and for helping improve Hanami.

rappc87 commented 3 months ago

1 - I think it would be much easier and faster if you could add a settings tab and change the font size from there.

2 - Again, if a vertical option were added to the settings alongside the horizontal one, the choice would be left to the user, which would be better.

(If the previous sentences appear above the current sentence, we will not lose track of the conversation and communication will be better.)

3 - It would be great if there was a possibility to show both the translated version and the original text at the same time.

Actually, it would be great if the intended use were like this:

You are in an online meeting and the other participants speak a different language. If there are people in your company who don't speak that language, you can show both the original and the translated version of the text so that users don't miss sentences.

Just like in music lyrics videos, if the current sentence is shown in uppercase while the previous translated sentence is shown in lowercase on the top line, it makes it easier to re-read the missed sentence.

Or, as another option, it could show only the translations but save both the original language and the translation when saving as txt.

MotazSabri commented 2 months ago

Thank you for your detailed feedback, @rappc87. Based on your suggestions, I have implemented a control panel in the latest version to enhance the user interface with the following features:

  1. Default Source Language: Set the default source language from the supported languages list.
  2. Default Target Language: Set the default target language from the supported languages list.
  3. Translation/Transcription Fonts: Adjust the fonts used for the main translation and transcription services.
  4. Sub Text Fonts: Modify the fonts used for subtext in Mixture Mode and CoTrans Mode.
  5. GPT Window Font: Change the font in the GPT window, which is smaller by default than the main application screen.
  6. Layout: Control the app's layout theme, allowing users to choose between dark or light modes for better readability.
  7. History Navigation: Adjust the position of the history bar, enabling users to read previous translation messages within the same session.
  8. Application Resolution: Manage the application's height, with the width remaining fixed.

Additionally, the control panel includes an option to position the history navigation bar vertically or horizontally, addressing your request for user choice in layout.

Regarding your suggestion to display both the translated version and the original text simultaneously, this feature is already available in Hanami for both transcription and translation services (referred to as the Mixture service). Moreover, the CoTrans feature displays both the previous and current translations concurrently. In these modes, the first text's font size is controlled by the sub-translation font setting, while the second text's font size is controlled by the main translation font setting.

For saving text, please note that the 'save to text' function only saves one form of text (either the translation or the transcription). The additional texts displayed in Mixture and CoTrans modes will not be saved. This is because the output text is not formatted and it will be confusing for the reader to distinguish between various modes if the content is large.

I hope these updates improve your experience with Hanami. Thank you again for your valuable feedback.

rappc87 commented 2 months ago

I really appreciate your improvements. Thank you very much for all your hard work.

Everything looks pretty good, I think there's just one more little thing missing. Can you take care of it if you find the time?

I'd like to summarize the problem:

I think it would be better to set the resolution of the program with the mouse instead of entering it manually, just like in normal Windows windows. It would be more useful, and the last resolution set would be saved automatically so that the program opens at that resolution the next time.

In vertical mode, instead of each translation disappearing, it might be better (depending on the resolution of the program) to have the previous translation scroll up in a slightly fainter color. That way we can check the previous sentence, and by scrolling up and down with the middle mouse button we can move between old and new translations without interrupting the ongoing translation.

As an example: https://www.youtube.com/watch?v=S5DxgjkjWYQ

The old translations would keep scrolling upwards as in this video. As you can see, in the video both the current words and the past words are white. In our system, only the latest translation would be white and the old translations would turn black again.

I hope I was able to explain this clearly.

Thank you again

MotazSabri commented 2 months ago

Thanks for your kind follow-up and continued use of Hanami, @rappc87. I'm glad to hear that you appreciate the new features I've implemented.

Regarding your new requests:

  1. Resizing Feature: I understand the importance of this feature. However, the current GUI is not designed to be resizable, and adding this capability would require a complete rebuild of the user interface. While I would be happy to work on this, it will take a significant amount of time.

  2. Lyrics Style Translation: This is indeed a nice feature. However, implementing it would shift the application's primary focus from live translation to more of a dubbing tool.

I will certainly consider both ideas for future releases. In the meantime, I would appreciate it if you could create a new issue for these requests and close this one if you are satisfied with the improvements made regarding your original requests.

Thank you again for your valuable feedback.