bbepis / XUnity.AutoTranslator

MIT License

Newline characters disappear. #199

Open nathan60107 opened 3 years ago

nathan60107 commented 3 years ago

Hi, I'm using XUnity.AutoTranslator 4.13 to play Mini Healer, translating from zh-CN to zh-TW with a custom zhconvert endpoint that I wrote by following "Implementing a Translator". The endpoint works and everything is fine, except that some \r\n sequences disappear.

For example, _AutoGeneratedTranslations.txt contains this entry:

+20 遊俠物理傷害 \r\n+5.9%遊俠攻擊速度 \r\n+17.7%遊俠躲閃率 \r\n<color%3D#6983FFFF>+4 遊俠冰霜傷害</color> \r\n<color%3D#EFD73C>+4 遊俠閃電傷害</color> =+20 遊俠物理傷害+5.9%遊俠攻擊速度+17.7%遊俠躲閃率 \r\n<color%3D#6983FFFF>+4 遊俠冰霜傷害</color> \r\n<color%3D#EFD73C>+4 遊俠閃電傷害</color>

which translates

+20 遊俠物理傷害 
+5.9%遊俠攻擊速度 
+17.7%遊俠躲閃率 
<color%3D#6983FFFF>+4 遊俠冰霜傷害</color> 
<color%3D#EFD73C>+4 遊俠閃電傷害</color>

to

+20 遊俠物理傷害+5.9%遊俠攻擊速度+17.7%遊俠躲閃率 
<color%3D#6983FFFF>+4 遊俠冰霜傷害</color> 
<color%3D#EFD73C>+4 遊俠閃電傷害</color> 

Some of the \r\n sequences have disappeared.

I then went back to the DLL I built and logged the text before and after translation. Three translation requests relate to this entry:

  1. +4 遊俠閃電傷害
  2. +4 遊俠冰霜傷害
  3. +20 遊俠物理傷害+5.9%遊俠攻擊速度+17.7%遊俠躲閃率

They are listed in the order they were written to the txt file. So the original text is split into three separate texts for translation, but for some reason +20 遊俠物理傷害 \r\n+5.9%遊俠攻擊速度 \r\n+17.7%遊俠躲閃率 is not split. Only a \r\n next to a < or > causes a split.

How can I fix this problem, or will you fix it? If there is anything I can do to help you debug, please tell me. Thanks for making XUnity.AutoTranslator.

※Update1: The 4.15 changelog lists "BUG FIX - Fixed issue related to newline handling", but I tried 4.16.2 and the issue still happens.

Before translation: (image) After translation: (image)

※Update2: Setting IgnoreWhitespaceInDialogue=False works around the problem for now, but the newline handling itself still needs to be fixed.

※Another problem: running translations is always 0 or 1, which means translations are not processed in threads or otherwise in parallel. That is fine for an engine like Google, which responds very quickly, but with a slow engine like zhconvert the requests end up waiting in a queue.

gravydevsupreme commented 3 years ago

Hi.

So a lot of preprocessing is happening before a text is sent to the translation endpoint. Reasons for this include:

  1. Most endpoints handle newlines poorly.
  2. Most endpoints handle rich text poorly.

The good news is that all of this is configurable. You've already found one of the configuration parameters, the poorly named IgnoreWhitespaceInDialogue. Setting it to False ensures that newline characters are preserved in the text sent to the endpoint.

The exception to this rule is that when a string can be parsed as containing rich text (markup), it will preserve the whitespace of the parsed string and translate the parsed contents individually, which is why the newlines are preserved within the marked up text.
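
The effect described above can be reproduced with a small sketch. This is not the plugin's actual parser: the class name, the helper StripInnerNewlines, the tag regex, and the shortened sample text are all assumptions made for illustration. It only shows why a line break next to a tag survives while one inside a plain text segment does not.

```csharp
using System;
using System.Text.RegularExpressions;

static class RichTextSketch
{
    // Hypothetical stand-in for the whitespace handling applied to one text
    // segment: outer whitespace is kept, inner line breaks are removed.
    static string StripInnerNewlines(string segment)
    {
        string core = segment.Trim();
        if (core.Length == 0) return segment; // whitespace-only piece: keep as-is

        int leadLength = segment.Length - segment.TrimStart().Length;
        string lead = segment.Substring(0, leadLength);
        string trail = segment.Substring(leadLength + core.Length);

        // Drop each line break together with the whitespace around it.
        string cleaned = Regex.Replace(core, @"\s*\r?\n\s*", "");
        return lead + cleaned + trail;
    }

    static void Main()
    {
        // Shortened, made-up version of the entry from this thread.
        string text = "+20 phys \r\n+5.9% speed \r\n" +
                      "<color=#6983FFFF>+4 frost</color> \r\n" +
                      "<color=#EFD73C>+4 lightning</color>";

        // Split on tags; the capturing group keeps the tags in the result.
        string[] pieces = Regex.Split(text, "(<[^>]+>)");

        for (int i = 0; i < pieces.Length; i++)
            pieces[i] = StripInnerNewlines(pieces[i]);

        // Show line breaks as literal \r\n, like the translation file does.
        Console.WriteLine(string.Concat(pieces).Replace("\r\n", "\\r\\n"));
    }
}
```

In this sketch the break between the two plain lines is removed because both lines fall into one text segment, while the breaks that sit between a segment and a tag are outer whitespace and are kept, which matches the pattern seen in the translation file.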

You can entirely disable markup handling by setting HandleRichText=False. I am not sure you want to set this to false, though. Would probably cause more trouble than it would solve.
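
As a sketch, the two settings would look like this in the plugin's configuration file. The [Behaviour] section name is an assumption based on how the generated config file is usually laid out, so check your own file for the exact location.

```ini
; Assumed section name; verify against your generated config file.
[Behaviour]
; Keep newline characters in the text sent to the endpoint.
IgnoreWhitespaceInDialogue=False
; Leave markup handling on so tags are not sent to the endpoint.
HandleRichText=True
```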

I'm not sure what you're asking in relation to that running translations question. What I can say is that what it means is the following:

nathan60107 commented 3 years ago

I agree with "Most endpoints handle rich text poorly". The settings inside "<>" might get translated, which would mess up the rich text.

About "Most endpoints handle newlines poorly": is this meant to solve the problem of redundant newlines? For example, the game wants to show "XUnity.AutoTranslator is a good software", but it inserts a newline to force a line break, so the text becomes

XUnity.AutoTranslator is a
 good software

which would be treated as two sentences and translated incorrectly. If so, there is no bug here. But I wonder whether IgnoreWhitespaceInDialogue should default to True. Is the situation above common? Removing newlines by default could break the layout. It is hard to decide which default is better.

About running translations: in fact, I'm trying to find a way to raise the number of running translations, so what you said is just what I need 😃. I added public new int MaxConcurrency => 3; to my DLL (VS says the base member is not virtual, abstract, or override, so I cannot use public override int), but running translations is still 0 or 1, which means it still translates one text at a time and I still have to wait in the queue. Here is the full file.

using SimpleJSON;
using System;
using XUnity.AutoTranslator.Plugin.Core.Endpoints;
using XUnity.AutoTranslator.Plugin.Core.Endpoints.Http;
using XUnity.AutoTranslator.Plugin.Core.Utilities;
using XUnity.AutoTranslator.Plugin.Core.Web;

namespace zhConvert
{
    internal class CustomTranslateEndpoint : HttpEndpoint
    {
        private static readonly string targetUrl = "https://api.zhconvert.org/convert?converter=Taiwan&text={0}";

        public override string Id => "zhConvert";

        public override string FriendlyName => "zhConvert(繁化姬)";

        public new int MaxConcurrency => 3;

        public override void Initialize(IInitializationContext context)
        {
            context.DisableCertificateChecksFor("api.zhconvert.org");
        }

        public override void OnCreateRequest(IHttpRequestCreationContext context)
        {
            var request = new XUnityWebRequest(
               string.Format(
                  targetUrl,
                  Uri.EscapeDataString(context.UntranslatedText)));

            context.Complete(request);
        }

        public override void OnExtractTranslation(IHttpTranslationExtractionContext context)
        {
            var data = context.Response.Data;
            var obj = JSON.Parse(data);

            var code = obj.AsObject["code"].ToString();
            if (code != "0") context.Fail("Received bad response code: " + code);

            var token = obj.AsObject["data"]["text"].ToString();
            var translation = JsonHelper.Unescape(token.Substring(1, token.Length - 2));

            if (string.IsNullOrEmpty(translation)) context.Fail("Received no translation.");

            context.Complete(translation);
        }
    }
}

Is there a mistake in it that prevents it from using multiple coroutines to handle translations?

gravydevsupreme commented 3 years ago

About newlines: This behavior is on by default because I found that it helps more than it hurts in most games. And I tried it in a lot of games. And yes, your deduction as to why is correct. This is only a translation-time thing, so the untranslated text output to the translation file is kept intact, and any regexes that may be specified will work on the original text.

About concurrency: You are right. I seem to have disabled allowing concurrency for anything that extends HttpEndpoint or WwwEndpoint. I probably did it this way because I would not want anyone to implement an endpoint that spammed a translation service too severely. Could be changed since it's a developer thing and not a user thing.

Either way, you can always just copy the HttpEndpoint and HttpTranslationContext (and related interfaces) into your own project and use those instead, allowing you to change the MaxConcurrency property to virtual.
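
Why the `new` modifier in the file above has no effect can be shown with a minimal sketch. EndpointBase and CustomEndpoint are stand-ins invented for illustration, not the plugin's real classes, and the sketch assumes the plugin reads the property through a base-class (or interface) reference rather than through the derived type.

```csharp
using System;

// Stand-ins for illustration only; these are NOT the plugin's real classes.
abstract class EndpointBase
{
    // Non-virtual property, like MaxConcurrency as described in this thread.
    public int MaxConcurrency => 1;

    // Virtual property, as it would look after the proposed change.
    public virtual int VirtualMaxConcurrency => 1;
}

class CustomEndpoint : EndpointBase
{
    // 'new' only hides the base member; code holding an EndpointBase
    // reference never sees this value.
    public new int MaxConcurrency => 3;

    // 'override' replaces the base member for every caller.
    public override int VirtualMaxConcurrency => 3;
}

static class Program
{
    static void Main()
    {
        // A host would typically hold the endpoint through its base type.
        EndpointBase endpoint = new CustomEndpoint();

        Console.WriteLine(endpoint.MaxConcurrency);                   // prints 1
        Console.WriteLine(((CustomEndpoint)endpoint).MaxConcurrency); // prints 3
        Console.WriteLine(endpoint.VirtualMaxConcurrency);            // prints 3
    }
}
```

The hidden member is only reachable through the derived type, so a host that works with the base type keeps seeing 1 until the base property is virtual and the endpoint overrides it.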

nathan60107 commented 3 years ago

About concurrency: In fact, with Google Translate running translations is also always 0 or 1, so it seems that even the built-in endpoints use MaxConcurrency=1. In my opinion, you could instead range-check MaxConcurrency: for example, if it is greater than 5, clamp it to 5. Then we could set it to a reasonable value without changing other source files.
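
A minimal sketch of the range check I have in mind. ConcurrencyLimit and Clamp are hypothetical names, the cap of 5 is just the example value from above, and the lower bound of 1 is my own assumption; none of this is existing plugin code.

```csharp
using System;

static class ConcurrencyLimit
{
    // Example cap taken from the suggestion above; not a value from the plugin.
    const int MaxAllowed = 5;

    // Keep the requested value within 1..MaxAllowed.
    public static int Clamp(int requested) =>
        Math.Max(1, Math.Min(requested, MaxAllowed));

    static void Main()
    {
        Console.WriteLine(Clamp(3));  // prints 3
        Console.WriteLine(Clamp(50)); // prints 5
        Console.WriteLine(Clamp(0));  // prints 1
    }
}
```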

I ran into another problem. When debugging my DLL, I used Console.Write to output the text before and after translation. But the Windows 10 console uses code page 932, which is Japanese, not Chinese, so in the end I wrote the text to a txt file instead. (image) The encoding can be changed with the chcp command, but I have no way to enter it in the debug console. Maybe the console encoding should be set to 65001, which is Unicode (UTF-8), so that everyone can debug?

One more problem: some games hide the cursor, which makes it hard to click the XUnity.AutoTranslator GUI. Is it possible to force the cursor to show while the GUI is enabled?
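
One thing that might help from inside the DLL is setting the output encoding in code instead of through chcp. The sketch below uses the standard Console.OutputEncoding property; whether it takes effect in the console the plugin opens is an assumption I have not verified, and the console font still has to contain the glyphs.

```csharp
using System;
using System.Text;

static class Program
{
    static void Main()
    {
        // Switch console output to UTF-8 (code page 65001) before writing
        // non-ASCII text.
        Console.OutputEncoding = Encoding.UTF8;

        Console.WriteLine("+20 遊俠物理傷害");
    }
}
```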

gravydevsupreme commented 3 years ago

Concurrency: I will make the property virtual in the next version, so it can be overridden.

Console: I am aware that the console implementation in this plugin is not great. I would highly recommend using BepInEx instead of self-patching through ReiPatcher. In that way you do not have to modify any files and you get a better console implementation.

Cursor: Probably possible, though not aware how. I usually ensure I enter a menu or similar before opening the UI so I do not have to deal with this problem.