Some halfwidth chars were not properly transliterated

fanweihua commented 7 years ago

I tested the Java API with "ｹﾞｯﾄ"; the expected result is "ゲット". The actual output has two problems. First, ﾞ was not combined with ｹ into a single character. Second, ｯ was not converted at all.
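
For reference, the expected string can be cross-checked without KanaXS. The sketch below uses the JDK's `java.text.Normalizer` with NFKC, which folds halfwidth katakana to fullwidth and composes the voiced sound mark. It is only a comparison point, not the library's method: NFKC also narrows fullwidth ASCII, which `toZenkakuCase` does not want. The class name `HalfwidthCheck` is made up for this example.

```java
import java.text.Normalizer;

// Hypothetical helper class, not part of KanaXS.
final class HalfwidthCheck {
    // NFKC maps halfwidth katakana to fullwidth and composes
    // a kana followed by a voiced sound mark into one character.
    static String nfkc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // Halfwidth KE, voiced sound mark, small TU, TO
        String input = "\uFF79\uFF9E\uFF6F\uFF84";
        // Expected: fullwidth GE, small TU, TO
        System.out.println(nfkc(input).equals("\u30B2\u30C3\u30C8"));
    }
}
```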

andywork commented 6 years ago

One more thing, sorry: 「ャ」, 「ュ」, and 「ョ」 are not converted to halfwidth. This is in the AS3 version.

chengstone commented 5 years ago

Change the function toZenkakuCase to the following code:

public static String toZenkakuCase(String str)
{
    int f = str.length();
    StringBuilder buffer = new StringBuilder(str);

    for(int i=0;i<f;i++)
    {
        char c = str.charAt(i);

        if(H2Z.containsKey(c)){
            buffer.setCharAt(i, H2Z.get(c));
        } else if(c == 0x0020){
            buffer.setCharAt(i, '\u3000');
        } else if(c <= 0x007E && 0x0021 <= c) {
            buffer.setCharAt(i, (char)(c + 0xFEE0));
        }
    }

    str=buffer.toString();
    f = str.length();
    buffer = new StringBuilder(str);
    for(int i=0;i<f;i++)
    {
        char c = str.charAt(i);
        if ((0x304B <= c && c <= 0x3062 && (c % 2 == 1)) ||
                (0x30AB <= c && c <= 0x30C2 && (c % 2 == 1)) ||
                (0x3064 <= c && c <= 0x3069 && (c % 2 == 0)) ||
                (0x30C4 <= c && c <= 0x30C9 && (c % 2 == 0))) {
            char d = buffer.charAt(i + 1);
            buffer.setCharAt(i, (char) (c + ((d == '\u309B') ? 1 : 0)));
            if (c != buffer.charAt(i)) {
                buffer = buffer.deleteCharAt(i + 1);
                f--;
            }
            continue;
        }

        if ((0x306F <= c && c <= 0x307D && (c % 3 == 0)) ||
                (0x30CF <= c && c <= 0x30DD && (c % 3 == 0))) {
            char d = buffer.charAt(i + 1);
            buffer.setCharAt(i, (char) (c + ((d == '\u309B') ? 1 : ((d == '\u309C') ? 2 : 0))));
            if (c != buffer.charAt(i)) {
                buffer = buffer.deleteCharAt(i + 1);
                f--;
            }

            continue;
        }
    }

    return buffer.toString();
};

andywork commented 5 years ago

Thanks for the correction!

chengstone commented 5 years ago

Sorry, after modifying the code last week I found a new bug. I have fixed it again:

public static String toZenkakuCase(String str)
{
    int f = str.length();
    StringBuilder buffer = new StringBuilder(str);

    for(int i=0;i<f;i++)
    {
        char c = str.charAt(i);

        if(H2Z.containsKey(c)){
            buffer.setCharAt(i, H2Z.get(c));
        } else if(c == 0x0020){
            buffer.setCharAt(i, '\u3000');
        } else if(c <= 0x007E && 0x0021 <= c) {
            buffer.setCharAt(i, (char)(c + 0xFEE0));
        }
    }

    str=buffer.toString();
    f = str.length();
    buffer = new StringBuilder(str);
    for(int i=0;i<f;i++)
    {
        // Read from buffer, not str: once a mark has been deleted, str indices no longer line up.
        char c = buffer.charAt(i);
        if ((0x304B <= c && c <= 0x3062 && (c % 2 == 1)) ||
                (0x30AB <= c && c <= 0x30C2 && (c % 2 == 1)) ||
                (0x3064 <= c && c <= 0x3069 && (c % 2 == 0)) ||
                (0x30C4 <= c && c <= 0x30C9 && (c % 2 == 0))) {
            if(i + 1 < buffer.length()){
                char d = buffer.charAt(i + 1);
                buffer.setCharAt(i, (char) (c + ((d == '\u309B') ? 1 : 0)));
                if (c != buffer.charAt(i)) {
                    buffer = buffer.deleteCharAt(i + 1);
                    f--;
                }
                continue;
            }
        }

        if ((0x306F <= c && c <= 0x307D && (c % 3 == 0)) ||
                (0x30CF <= c && c <= 0x30DD && (c % 3 == 0))) {
            if(i + 1 < buffer.length()){
                char d = buffer.charAt(i + 1);
                buffer.setCharAt(i, (char) (c + ((d == '\u309B') ? 1 : ((d == '\u309C') ? 2 : 0))));
                if (c != buffer.charAt(i)) {
                    buffer = buffer.deleteCharAt(i + 1);
                    f--;
                }

                continue;
            }
        }
    }

    return buffer.toString();
};

andywork commented 5 years ago

Thank you. I will use the new version.

chengstone commented 5 years ago

I added “if(i + 1 < buffer.length()){”.
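
The guard matters when a kana that can take a mark is the last character, since there is then no character at `i + 1` to inspect. A minimal standalone sketch of the combining pass follows, assuming the first pass has already produced fullwidth kana with standalone U+309B / U+309C marks. `VoicedMarkSketch` and its methods are hypothetical names, not KanaXS code. It reads both characters from the buffer, so indices stay valid after a deletion.

```java
// Hypothetical sketch, not the KanaXS implementation.
final class VoicedMarkSketch {
    // ha-row kana (hiragana and katakana) sit three code points apart
    static boolean isHaRow(char c) {
        return ((0x306F <= c && c <= 0x307D) || (0x30CF <= c && c <= 0x30DD)) && c % 3 == 0;
    }

    // kana whose voiced form is the next code point
    static boolean takesDakuten(char c) {
        return (0x304B <= c && c <= 0x3062 && c % 2 == 1)
            || (0x30AB <= c && c <= 0x30C2 && c % 2 == 1)
            || (0x3064 <= c && c <= 0x3069 && c % 2 == 0)
            || (0x30C4 <= c && c <= 0x30C9 && c % 2 == 0)
            || isHaRow(c);
    }

    static String combineMarks(String s) {
        StringBuilder buffer = new StringBuilder(s);
        // i + 1 < length is the bounds check: the last char has nothing after it
        for (int i = 0; i + 1 < buffer.length(); i++) {
            char c = buffer.charAt(i);
            char d = buffer.charAt(i + 1);
            int shift = 0;
            if (d == '\u309B' && takesDakuten(c)) {
                shift = 1;          // voiced form
            } else if (d == '\u309C' && isHaRow(c)) {
                shift = 2;          // semi-voiced form
            }
            if (shift != 0) {
                buffer.setCharAt(i, (char) (c + shift));
                buffer.deleteCharAt(i + 1); // drop the consumed mark
            }
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        // KA + mark, KA + mark
        System.out.println(combineMarks("\u30AB\u309B\u30AB\u309B").equals("\u30AC\u30AC"));
    }
}
```

Because the loop condition uses the buffer's current length, a lone trailing kana such as "カ" passes through unchanged instead of throwing.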

shogo4405 / KanaXS

Some halfwidth chars were not properly transliterated #1