osmandapp / OsmAnd


Fix Japanese version of TTS #16861

Closed zipav closed 1 year ago

zipav commented 1 year ago

🚀 feature request

Description

The Japanese voice prompts currently in use are imperfect: they contain grammatical errors, and not all parts of the text can be voiced by the OGG files of the official Japanese voice. This needs to be fixed for more convenient and universal use. Our user is confident that the context and guidance wording in their version is more appropriate (more complete) than in the official Japanese voice.

Describe the solution you'd like

Our user, who is in charge of the official Japanese language for the OsmAnd app, has offered to help: they can briefly check the official Japanese TTS audio with "OsmAnd+ ver4.4.5" and then send us "ja_tts.js".

sonora commented 1 year ago

If any technical questions come up, I can gladly offer assistance or a review.

ZeppekiHinagiku commented 1 year ago

My name is "Zeppeki Hinagiku", and I was recently introduced to this page by email. I would like to talk about the unofficial Japanese voice script "ja_tts.js" that I provide. If I remember correctly, the versions I published around "OsmAnd ver 3.9" were able to play ogg audio as well. Perhaps the specification changed in a later update, because playback stopped working properly; I tried to fix it, but the ogg audio still does not play completely. If I am remembering this wrong, I apologise.

The "ja_tts(ver408-1).zip" I am providing is the version from before I attempted the fix, so I don't think the ogg version can even be played properly with the current OsmAnd. I am now planning to revise the content based on the English script of "OsmAnd ver4.5.0 Nightly Build" and create "ja_tts.js (ver450-1)", removing unnecessary parts while making at least some of the ogg audio playable.

However, I don't know when that will be, so I have uploaded the "ja_tts.js (ver408-1)" file for now.

The attached file "ja_tts.js" was created around the time "OsmAnd ver 4.0.8" was released and has not been updated for about two years. When I drove the other day to a destination 30-40 km away, partly on the motorway, it worked fine with "OsmAnd+ ver4.3.12". I then tested it with the development plugin in "OsmAnd+ ver4.4.5" and "OsmAnd Nightly Build ver4.5.0", and there were no real problems there either.

It is an unofficial version and I, the translator, give you permission to use it freely. Please feel free to merge it into the official version or use it to correct the grammar of the official Japanese audio.

ja-tts(ver408-1).zip

vshcherb commented 1 year ago

Thanks for the feedback. Ogg files are playable in the Android version, but not in iOS, so we prefer to deal with TTS versions.

sonora commented 1 year ago

@ZeppekiHinagiku Ok, thanks for the contribution! I have done quite a bit of work on it and think I got it to work ok, but we will likely need a few iterations, because I cannot fully test.

Some of the things I have fixed, changed, and cleaned up are

It should now work for both the recorded voice and TTS. Can I please ask you to download the updated recorded Japanese voice from within the OsmAnd app, and then test every command via the "Test voice prompts" feature in our Development plugin? Please let me know the number of every command that does not work to your satisfaction, and we can take a look; quite possibly there are still some flaws. Then please do the same testing for TTS: manually copy the ja_tts.js file from the new ja recorded voice to your ja-tts folder, replacing the old file there. Thank you!

ZeppekiHinagiku commented 1 year ago

Thank you for pointing out in detail what was causing it not to work. I checked the new script with the development plugin, and all the ogg audio plays. I will send my version here once I have fixed the unnecessary words that still remain and the parts I have changed since.

Zirochkabila commented 1 year ago

@ZeppekiHinagiku Please provide feedback on @sonora's request. Did you manage to check the voice prompts?

ZeppekiHinagiku commented 1 year ago

Sorry, I am new to GitHub and I don't know how to give feedback. How can I do it? Please tell me.

I have confirmed that I can play the voice prompts. However, some parts are still not grammatically correct Japanese. I have been busy for a while and have not been able to fix them.

sonora commented 1 year ago

There's no rush, don't worry. 😉

It may be tough for me to help, but we could try this: perhaps make a table of the "Test voice prompts" button numbers, and mark which ones already sound ok.

For the ones which do not, please state in 2 columns what you do hear and, in contrast, what it should correctly be. Maybe you can do this both in Japanese characters and in terms of the .ogg file names.

I can then look at it, and we can pick a specific example and investigate how to fix it. It will be challenging, but could be fun! 😉

Zirochkabila commented 1 year ago

@ZeppekiHinagiku Test voice prompts can be found here: Menu > Plugins > OsmAnd development > Settings > Test voice prompts

ZeppekiHinagiku commented 1 year ago

@sonora Sorry I'm late. The original script had been optimised for Japanese grammar, and in some places the grammar broke when the descriptions and ogg definitions I had added were removed from it. As suggested, I have summarised these places in a table.

Column A is the number of the voice prompt.
Column B is whether the grammar is correct in ogg.
Column C is whether the grammar is correct in TTS.
Columns D and E are comments on what is wrong and what should be fixed, respectively.

The Japanese "、" is used for short pauses, much like "," in English. The Japanese "。" ends a sentence, like the English "."; like "、", it was also included with a pause in mind.

Voice_prompts_test(OKorNG).zip

ZeppekiHinagiku commented 1 year ago

@sonora Sorry, I forgot something vital. This is the audio script used in the test above. It's an ogg file renamed to a Japanese name and a script file edited for it.

ja-tts.zip

sonora commented 1 year ago

Ok, thanks! I have tried to make the improvements I think you have in mind. Can you please use the app to download the new ja recorded voice (carrying today's date) and redo the tests? Please test both the recorded voice and TTS; for TTS, use the new ja_tts.js file found there, overwriting the ja_tts.js file which came with your OsmAnd version.

I hope it is improved now, I have specifically done the following:

Overall, I recommend focusing on perfecting the TTS use case, as it is more powerful in terms of what it can announce, is more flexible, and has better intonation. I also guess most users of Japanese voice prompts have a ja-enabled TTS engine on their devices. Hence I suggest treating the recorded ja voice as a fallback only, as it is now, which does not deserve excessive effort.

Zirochkabila commented 1 year ago

@ZeppekiHinagiku We are waiting for your response

ZeppekiHinagiku commented 1 year ago

@sonora thx!

(I know of no good way to put pauses in the recorded voices.)

One way is to create an arbitrary 25-50 msec ogg file and use it in place of "、" and "。" as a pause. I did this back in the "_config.p" days, so I don't know if it still works. But as far as I checked this time, I don't think a pause is needed anywhere in ogg.

pause_files.zip

sonora commented 1 year ago

@ZeppekiHinagiku Updated, thanks! Please re-download and test.

ZeppekiHinagiku commented 1 year ago

@sonora

dictionary["comma"] = tts ? "、" : "zzz_delay_0025msec.ogg";
dictionary["period"] = tts ? "。" : "zzz_delay_0050msec.ogg";

The above method was no good, as playback in ogg was corrupted.

It seems better to use the following method and pause only in TTS: (tts ? "、" : " ")

sonora commented 1 year ago

Ok, maybe try using this:

dictionary["comma"] = tts ? "、" : " " + "zzz_delay_0025msec.ogg" + " ";
dictionary["period"] = tts ? "。" : " " + "zzz_delay_0050msec.ogg" + " ";

If that is not better than before, I will revert to what we had.

So how are we doing regarding all your other points?

ZeppekiHinagiku commented 1 year ago

@sonora Sorry, I didn't explain myself well enough. OGG already has enough pauses, so adding them is unnecessary. Only TTS needs a pause.

I'm checking in between jobs, so the rest of this will have to wait a bit.

sonora commented 1 year ago

Ah, ok, thanks! So I have reverted to what we had before the "comma" and "period" methods.

Yes, please take your time to test all this, there is no rush! I think we have already achieved a lot, thanks to all your help!

If you find any further improvements, we can continue to investigate here. If you have tested code changes that you know work well, you could also make a pull request against the file directly on GitHub. The file is here: https://github.com/osmandapp/OsmAnd-resources/tree/master/voice/ja

ZeppekiHinagiku commented 1 year ago

I found that the reason the Ogg version was not vocalising was the removal of [+ " " +].
I also found that, instead of [+ " " +], it works just as well to put [(tts ? "、" : " ")] or [(tts ? "。" : " ")].
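
The finding above can be sketched roughly as follows. This is only a minimal illustration, assuming the script builds each prompt by joining parts with a separator; `pauseTokens`, `joinPrompt` and the .ogg file names are hypothetical and are not taken from the real ja_tts.js.

```javascript
// Sketch only: pauseTokens and joinPrompt are hypothetical helpers,
// not functions from the real ja_tts.js.
function pauseTokens(tts) {
  return {
    // TTS gets Japanese punctuation so the engine inserts a pause;
    // recorded (ogg) mode gets a plain space, which only separates file names.
    comma: tts ? "、" : " ",
    period: tts ? "。" : " ",
  };
}

function joinPrompt(parts, tts) {
  const pause = pauseTokens(tts);
  // There is always some separator between parts, so file names never run together.
  return (parts.join(pause.comma) + pause.period).trimEnd();
}
```

With these assumptions, `joinPrompt(["右方向", "300メートル"], true)` yields a TTS string with "、" between the parts and "。" at the end, while `joinPrompt(["right.ogg", "300.ogg"], false)` yields the two hypothetical file names separated by a single space.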

The TTS version also has new specifications and branching conditions, so I rewrote almost everything with reference to the latest "en_tts.js", to make sure the syntax and pauses are correct.
I also added and changed some sound sources for TTS and ogg accordingly.
As far as I could check with the plugin, the context and pauses are now perfect in both the TTS and ogg versions.

As for ["exceed_limit"], I looked at the OpenStreetMap specification and concluded that it is not a good idea to add units on our own, so I added a polite ending to the phrase instead of a unit.
https://wiki.openstreetmap.org/wiki/Key:maxspeed
Looking at that specification, however, the unit can be identified from the value:
no unit at the end = "kilometres per hour" (km/h)
"mph" at the end = "miles per hour" (mph)
"knots" at the end = "knots" (kt)

If the units can be identified as above, and if OsmAnd can read them and branch with a switch statement, I think it would be possible to speak each unit accurately.
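
As a rough sketch of that idea, the snippet below splits a raw OSM-style maxspeed value into a number and a unit and branches with a switch statement. It assumes the raw tag value is available as a string, which this thread does not establish for OsmAnd's voice scripts; `parseMaxspeed` is a hypothetical helper and it only covers plain numeric values with an optional "mph" or "knots" suffix.

```javascript
// Sketch only: parseMaxspeed is a hypothetical helper, not part of OsmAnd.
// It handles values such as "50", "50 mph" and "10 knots".
function parseMaxspeed(value) {
  const match = /^(\d+(?:\.\d+)?)\s*(mph|knots)?$/.exec(value.trim());
  if (match === null) {
    return null; // e.g. "none" or "walk", which this sketch does not cover
  }
  const number = Number(match[1]);
  switch (match[2]) {
    case "mph":
      return { number, unit: "miles per hour" };
    case "knots":
      return { number, unit: "knots" };
    default:
      // No suffix: the OSM wiki defines the value as kilometres per hour.
      return { number, unit: "kilometres per hour" };
  }
}
```

A voice script could then pick the spoken unit from the `unit` field instead of guessing it.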

There are too many changes to send as a pull request, and I have also added and changed the ogg files involved, so I am sending everything as an attachment.

OsmAndVoice_JP4.5.0-3.zip

sonora commented 1 year ago

Ok, thank you for the tweaking! I have committed it here: https://github.com/osmandapp/OsmAnd-resources/commit/a27ac1cc2d30f89d9afe1dfa6e2f6f028309f2d2. (As before, the new recorded version is available under "Downloads" immediately; the new ja_tts.js config file will only be embedded in a new app (apk) update. But it is identical to the one contained in the recorded voice, so you could use that for testing.)

Regarding the "speed limit unit" code you have commented out: at first sight this should work; have you tested it? OsmAnd passes the user preference for the unit system (e.g. Driving Region) into var metricConst, so that is available to be used. I have, however, not tested OsmAnd's interpretation, and perhaps numeric conversion, of maxspeed tagging with units specified in OSM. If you get a chance, maybe test it and open a separate issue if you detect a problem.

I will close this issue, but feel free to post further updates here, or better open a Pull Request. It looks like we have come a long way in improving the OsmAnd Japanese Voice announcements, Thank You!!

@vshcherb Probably worth a mention and credit in the v4.5 release notes.

ZeppekiHinagiku commented 1 year ago

@sonora The commented-out code has already been confirmed to work.

But according to the OpenStreetMap specification, if the speed unit in the OsmAnd settings is "kilometres per hour" and "maxspeed=50", then it speaks "50 kilometres per hour" (in Japanese).
If the OsmAnd speed unit is "miles" and "maxspeed=50 mph", then it would speak "50 mph mile per hour". I commented the code out because I was concerned that, with the speed unit set to miles, the unit would be vocalised twice as above.

However, I prefer to have the units vocalised for use in my own country, so I left it in as a deprecated hidden feature.

sonora commented 1 year ago

Ok! Please note that if you follow the en_tts template, it never pronounces a speed limit unit.

The "Driving region" setting in OsmAnd is used to produce the (properly converted, if necessary) numeric value in OsmAnd, i.e.

That numeric value is always spoken without a unit. Example: For a highway which is tagged with maxspeed=75 mph in the United States,

So if you configure ja_tts.js to speak the unit (as your commented-out code should do correctly), I see no risk of the unit being pronounced twice.

ZeppekiHinagiku commented 1 year ago

@sonora thx! I understand now: if "maxspeed" in OsmAnd does not carry a unit, then there seems to be no problem in speaking it with a unit.

However, if "Units of length" is set to "Nautical miles/metres" or "Nautical miles/feet", you get an "undefined" error in plugin test 9.1.
See the attached screenshot for more information.

In that case, can you tell me what value "metricConst" contains (km-m/mi-f/etc...)? I think I can submit an error-free ja_tts.js if you can tell me what values "metricConst" can contain.

OsmAnd+_Screenshot.zip

sonora commented 1 year ago

See here: https://github.com/osmandapp/OsmAnd/blob/a08254c6b600675ef253090509c3c304775908b9/OsmAnd/src/net/osmand/plus/settings/enums/MetricsConstants.java#L7-L14
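
One defensive way to avoid an "undefined" being spoken, whatever the exact set of values turns out to be, is to give the branch a default case. The sketch below is only an illustration: `speedUnitFor` and `exceedLimitText` are hypothetical helpers, the English unit words stand in for the Japanese strings, and only "km-m" and "mi-f" are taken from this thread. Any other value, including whatever the nautical settings produce, should be checked against the linked MetricsConstants.java and falls through to the default here.

```javascript
// Sketch only: hypothetical helpers, not the actual ja_tts.js code.
function speedUnitFor(metricConst) {
  switch (metricConst) {
    case "km-m":
      return "kilometers per hour";
    case "mi-f":
      return "miles per hour";
    default:
      // Unknown or unhandled unit system: return no unit at all,
      // so the prompt never contains the text "undefined".
      return "";
  }
}

function exceedLimitText(limit, metricConst) {
  const unit = speedUnitFor(metricConst);
  return unit === "" ? String(limit) : limit + " " + unit;
}
```

With this shape, an unhandled unit system degrades to the bare number, which matches how the en_tts template is described above as speaking the limit without a unit.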

ZeppekiHinagiku commented 1 year ago

@sonora Thx! I have fixed the branching and improved the grammar and intonation to be more natural. Ogg files were changed and deleted accordingly.

OsmAndVoice_JP4.5.0-4.zip

sonora commented 1 year ago

Thank you - merged!

musover commented 1 year ago

As the author of the previous ja_tts overhaul back in 2018, thank you very much for your work. I assume it was evident that it was not done by a native speaker.

However, I have a few questions about some of the replacement strings:

I sought to eliminate most if not all of the XXに曲がってください as I perceived them to be way too chatty. I'm not sure I understand why they are preferable to 手前XX方向 or 斜めXX方向, and I would like to know why.

Regards

EDIT (sonora): Thank you for your feedback! @ZeppekiHinagiku, perhaps I can ask you to comment on this.

ZeppekiHinagiku commented 1 year ago

First, some background... Around 2013-2014, I decided to create an unofficial Japanese voice for my own use, as the original Japanese voice of OsmAnd was not sufficient. I planned from the start to publish it on the web once it was ready, so I created it from scratch, without reusing the terminology of the original Japanese voice. This was because I knew of past cases where trouble had arisen from English-to-Japanese translations and the reuse (plagiarism) of those resources.

After actually using it, I went through a lot of trial and error and changed the terminology, for example "右左折(usasetsu) and 走行(soukou) are not suitable for walking, so let's change the words". Of course, not everything fitted perfectly, and my understanding of the script was not perfect either, so there were terms for which I could not think of good vocabulary and which I left alone for many years even though they bothered me.

Furthermore, through using it I realised that not everyone concentrating on driving can instantly make out short, simplified words spoken by a navigation system. So I deliberately mixed in long (chatty) strings to make it easier to catch what was said. This is the same reason why, early on, I changed '右折(usetsu)' and '左折(sasetsu)' into '右方向(migi houkou)' and '左方向(hidari houkou)': strings that are longer, but whose difference is easy to grasp instantly.

I could have adopted the good parts of the official terminology, but as mentioned above, I knew there could be trouble if an unofficial version took material from the official one without permission, so I didn't do that. And the unofficial voice is basically tuned to be easy for me to use, so that people who are poor drivers, like me, do not misunderstand the navigation instructions. If you think that a rational, short voice is the best solution for navigation, then I guess our philosophies are incompatible on that point.

I had originally intended to withdraw from translation and just send the Japanese audio and scripts I had made to the OsmAnd people. However, with the help and encouragement of others, I was able to brush up the unofficial voice further and end up with a satisfying result. The unofficial voice I had customised for my own ease of use ended up being adopted as the official voice as it was.

It is intentional that some strings are chatty, but I don't intend to be much involved in the future, so you can change them if you have better ones. In fact, '左手前の道です(hidari temae no michi desu)' is a better Japanese guidance phrase, being more concise and easier to understand than '左後方に曲がってください(hidari kouhou ni magatte kudasai)'. It was only my poor imagination that kept me from coming up with the word '手前(temae)' XD

It's a long story, but please understand that I had a reason for not using the official Japanese strings, and that I intentionally used both chatty and non-chatty strings.

Attached is an early audio script that can attest to this. :)

OsmAndVoice_JP1.7.4-2.zip

musover commented 1 year ago

I understand your reasoning, and our philosophies regarding voicelines may be incompatible after all.

My main use for the app is while driving, and as such I prefer shorter voicelines so that I can have more time to prepare for a turn. My choices for vocabulary align with other GPS systems that have Japanese translation.

However, seeing how other languages have more than one translation (e.g. Hungarian, with 2 TTS styles), an option we can consider is to include both styles of TTS, one casual and one more formal.

That being said, I made some adjustments for my personal use:

I am testing less verbose replacements for まっすぐに進んでください and 道なりに進んでください , currently 直進方向です and 道なりです are sounding good to me.

Due to how exit names are used in other parts of the world, especially Italy which does not use exit numbers in most of their roads, I replaced the final 「へ、入ります」with です in take_exit, as it might be confusing (I would be looking for a road with that name instead of an exit).

If those sound good to you, I can just place them in a PR, and if not, I don't think we are being forced to choose which style is the correct one.

Correct me if I'm wrong, but I think tts files can now be placed in a folder and don't have to be bundled with the apk. If so, that would lift the huge burden of having to assemble new apks for any of us to use different translations.

sonora commented 1 year ago

Yes, you need to find out how to access OsmAnd's data folder on your device (this may be simple or difficult). If in (OsmAnd-data-folder)/voice (next to ja-tts) you then create a new sub-folder like ja-test-tts, place a new version of the config file in it, and name it ja-test_tts.js (where the 'test' portion is arbitrary), it will show up in OsmAnd as a language to choose after the next app start.

If need be, I would have no objections to having 2 ja versions, e.g. ja and ja-chatty. For ease of maintenance it would be good if we kept the 2 .js files as similar as possible, e.g. by developing one as a modified version of the other.
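
The folder and file naming pattern described above (a folder ending in "-tts" holding a config file ending in "_tts.js", as with the ja-tts folder and its ja_tts.js) can be sketched as a small helper. `configFileNameFor` is a hypothetical function for illustration only; it is not part of OsmAnd, and it assumes the pattern holds exactly as stated in this thread.

```javascript
// Sketch only: derives the expected config file name from a voice folder name,
// following the pattern described in this thread. Hypothetical helper.
function configFileNameFor(folderName) {
  const suffix = "-tts";
  if (!folderName.endsWith(suffix)) {
    return null; // not a TTS voice folder under the assumed pattern
  }
  return folderName.slice(0, -suffix.length) + "_tts.js";
}
```

For example, a custom folder named "ja-test-tts" would be expected to contain "ja-test_tts.js" under this pattern.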