rime / rime-cantonese

Rime Cantonese input schema | 粵語拼音輸入方案
https://jyutping.net/
Creative Commons Attribution 4.0 International

Using google input tools api #89

g0rdonL closed this issue 4 years ago

g0rdonL commented 4 years ago

I really appreciate the effort put into this project. However, typing has to be very precise to get the intended Chinese characters, whereas Google's implementation is looser and can guess the words the user is trying to type.

I would like to integrate Google Input Tools' Cantonese into rime (or fork the project to do so), and help would be appreciated.

There is an "API" for Google Input Tools. For example, calling it with the input neiho:

https://inputtools.google.com/request?text=neiho&itc=yue-hant-t-i0-und&num=5&cp=0&cs=1&ie=utf-8&oe=utf-8&app=test

this returns:

[
  "SUCCESS",
  [
    [
      "neiho",
      ["你好", "你可", "您可", "您好", "妳好"],
      [],
      {
        "annotation": ["nei hou", "nei ho", "nei ho", "nei hou", "nei hou"],
        "candidate_type": [0, 0, 0, 0, 0],
        "lc": ["69 69", "69 69", "69 69", "69 69", "69 69"]
      }
    ]
  ]
]

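
For anyone scripting against this endpoint, here is a minimal Python sketch of building the request URL and unpacking a response of the shape shown above. The helper names `build_url` and `parse_candidates` are made up for illustration, the parameter values are copied from the URL in this comment, and the response layout is an assumption based on this single sample, so treat it as a starting point rather than a reference client.

```python
import json
from urllib.parse import urlencode

BASE = "https://inputtools.google.com/request"


def build_url(text, num=5):
    """Build a request URL; parameter values are copied from the example URL."""
    params = {
        "text": text,
        "itc": "yue-hant-t-i0-und",
        "num": num,
        "cp": 0,
        "cs": 1,
        "ie": "utf-8",
        "oe": "utf-8",
        "app": "test",
    }
    return BASE + "?" + urlencode(params)


def parse_candidates(body):
    """Return (candidate, annotation) pairs, or [] if the layout is unexpected."""
    data = json.loads(body)
    if not isinstance(data, list) or len(data) < 2 or data[0] != "SUCCESS":
        return []
    try:
        entry = data[1][0]
        candidates = entry[1]
        # The fourth element (metadata) is assumed to be optional.
        meta = entry[3] if len(entry) > 3 and isinstance(entry[3], dict) else {}
    except (IndexError, TypeError):
        return []
    annotations = list(meta.get("annotation", []))
    # Pad annotations so that zip never drops a candidate.
    annotations += [""] * (len(candidates) - len(annotations))
    return list(zip(candidates, annotations))


# Shortened copy of the sample response, used instead of a live request.
SAMPLE = '["SUCCESS", [["neiho", ["你好", "你可"], [], {"annotation": ["nei hou", "nei ho"]}]]]'

if __name__ == "__main__":
    print(build_url("neiho"))
    for candidate, annotation in parse_candidates(SAMPLE):
        print(candidate, annotation)
```

The sketch deliberately does not send the request; the URL from `build_url` can be fetched with any HTTP client and the body passed to `parse_candidates`.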
ayaka14732 commented 4 years ago

Thank you for pointing out the API.

Typing has to be very precise because one of the goals of our project is to let users know what the correct Jyutping is. However, you can always customize the regex in jyut6ping3.schema.yaml to get fuzzier results, such as accepting lei for 你 (nei -> lei).
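
To make the nei -> lei idea concrete, here is a rough Python emulation of what a derive-style spelling rule does: it keeps the original spelling and also accepts an alternative produced by a regex substitution. The rule string `derive/^n(?!g)/l/` and the helper `apply_derive` are assumptions for illustration only; the rules the project actually recommends are in its README, and this is not Rime's own implementation.

```python
import re


def apply_derive(rule, spellings):
    """Emulate a 'derive/<pattern>/<replacement>/' rule.

    Returns a dict mapping each accepted (typed) spelling to the set of
    canonical spellings it can stand for. Patterns containing '/' are not
    supported by this simplified parser.
    """
    op, pattern, replacement = rule.rstrip("/").split("/")
    if op != "derive":
        raise ValueError("only derive rules are emulated here")
    accepted = {}
    for canonical in spellings:
        # The precise spelling always remains valid.
        accepted.setdefault(canonical, set()).add(canonical)
        alternative = re.sub(pattern, replacement, canonical)
        if alternative != canonical:
            accepted.setdefault(alternative, set()).add(canonical)
    return accepted


if __name__ == "__main__":
    # Hypothetical rule: initial n- may also be typed as l-, but ng- is left alone.
    table = apply_derive("derive/^n(?!g)/l/", ["nei", "lei", "ngo"])
    for typed, canonical in sorted(table.items()):
        print(typed, "->", sorted(canonical))
```

Note how typing lei then matches both nei and lei, which illustrates why fuzzy rules enlarge the candidate list.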

While I am concerned that the API does not use correct Jyutping, it could be quite useful for typing rare words, new words and long sentences that are not present in the rime-cantonese dictionary. Calling the API from rime is also possible; see hchunhui/librime-cloud.

laubonghaudoi commented 4 years ago

Thanks for your appreciation. As for "typing has to be very precise to get the character", we made it so deliberately. Yes, we are prescriptive, and we believe that an input method is the most efficient tool for helping people learn the correct and precise spellings. The overly broad support for fuzzy input in Gboard Cantonese is actually a downside and one of our biggest complaints, because it might mislead people into incorrect or irregular spellings and pronunciations. Excessive loose matching also puts too many candidate words in the menu, which significantly lowers typing speed. So by default you need to type the precise, fully correct spelling to get the character; this is by design.

But as described by @ayaka14732 above, you can always change the regex in jyut6ping3.schema.yaml file to support fuzzy input. Instructions are in the README doc. Please let us know if you still have any difficulties or issues.

As for using the Google API, may we ask what the purpose of integrating it into rime is? What are you trying to achieve with it?

ayaka14732 commented 4 years ago

Sample lua script:

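-- Requires a JSON module and LuaSocket (socket.http, socket.url, ltn12) on the Lua path.
local json = require("json")
local http = require("socket.http")
local socket_url = require("socket.url")
local ltn12 = require("ltn12")

local function translator(input, seg)
   -- Percent-encode the input before placing it in the query string.
   local url = 'http://inputtools.google.com/request?text=' .. socket_url.escape(input) .. '&itc=yue-hant-t-i0-und&num=1&cp=0&cs=1&ie=utf-8&oe=utf-8&app=test'
   local res = {}
   -- Blocking HTTP request; the response body is collected into res.
   local _ = http.request{url=url, sink=ltn12.sink.table(res)}
   res = table.concat(res)
   -- pcall guards against an empty or malformed response body.
   local success, j = pcall(json.decode, res)
   if success and type(j) == "table" and j[1] == "SUCCESS" and j[2] and j[2][1] and j[2][1][2] and j[2][1][2][1] then
      -- Use matched_length when the API reports it; otherwise cover the whole segment.
      local meta = j[2][1][4]
      local _e
      if meta and meta.matched_length then
         _e = seg.start + meta.matched_length[1]
      else
         _e = seg._end
      end

      -- Offer only the first candidate, labelled so its origin is visible in the menu.
      local c = Candidate("simple", seg.start, _e, j[2][1][2][1], "(Google Cloud)")
      c.quality = 2
      yield(c)
   end
end

return translator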
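
For readers who do not use Lua, the end-position logic of the script can be restated as a small Python sketch. It assumes, as the script does, that the fourth element of the response entry may carry a `matched_length` list saying how many input characters the first candidate consumed; `candidate_end` is a hypothetical helper name, not part of any library.

```python
def candidate_end(seg_start, seg_end, meta):
    """Return the end offset of the first candidate within the input.

    meta is the optional metadata dict from the response. If it reports
    matched_length, the candidate covers only that many characters from
    seg_start; otherwise it covers the whole segment. An empty list is
    treated the same as a missing field.
    """
    lengths = (meta or {}).get("matched_length")
    if lengths:
        return seg_start + lengths[0]
    return seg_end


if __name__ == "__main__":
    print(candidate_end(0, 8, {"matched_length": [5]}))
    print(candidate_end(0, 8, {}))
```

Reporting a shorter end offset lets the input method keep the unmatched tail of the input for further composition instead of discarding it.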

Result (the first candidate):

However, I have tried many cases, and in most cases the API gives unexpected results due to its fuzziness. Perhaps the API would not be as useful as I had expected.

laubonghaudoi commented 4 years ago

@ayaka14732 I remember someone did this cloud-input API thing before (was it you?), but I think it did not use Google; it integrated Baidu cloud input into rime.

ayaka14732 commented 4 years ago

@g0rdonL Do you know whether there is a stricter version (using correct Jyutping while performing basic correction) of the API?

g0rdonL commented 4 years ago

@ayaka14732 Unfortunately, no. I got the link by inspecting the network calls made by the Google Input Tools Chrome extension.

ayaka14732 commented 4 years ago

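> @ayaka14732 I remember someone did this cloud-input API thing before (was it you?), but I think it did not use Google; it integrated Baidu cloud input into rime.

That is hchunhui/librime-cloud, but it only supports Mandarin.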

laubonghaudoi commented 4 years ago

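> > @ayaka14732 I remember someone did this cloud-input API thing before (was it you?), but I think it did not use Google; it integrated Baidu cloud input into rime.
>
> That is hchunhui/librime-cloud, but it only supports Mandarin.

Actually, I still do not know what the original poster wants us to do. If the request is to add Google cloud input, we can simply mark it as won't fix; we are going to contact Google later anyway and ask them to add Jyutping.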