hjnhjn123 / opencc

Automatically exported from code.google.com/p/opencc

Support UTF-8 text files that contain a BOM #26

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What is the problem? How to reproduce the problem?

http://i.imgur.com/8sDyM.png

http://i.imgur.com/nHZ8j.png

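(The images at the URLs above are also included in the archive attached to this report.)

After using OpenCC for a long time,
I found that some phrases were clearly present in my custom phrase dictionary file
but still were not converted correctly.

I can't read the source code,
so I had to narrow things down slowly, ruling out possibilities one by one.
It took a long time, and in the end I found that the first field of the dictionary has no effect?

I have reduced the environment where the problem occurs to a minimal case.
Could you check whether this is a bug or a problem with my configuration?
Thank you.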

What version of the product are you using? On what operating system?

opencc-0.3.0-win32
win7 x64

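Please provide any additional information below.

The archive contains:

novel.txt      original test file
novel2.txt     the file after conversion by opencc
test.ini       dictionary configuration file
testbug1.png   screenshot showing that the first field has no effect
testbug2.png   screenshot showing the result after avoiding the first field
testtest1.txt  dictionary file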

Original issue reported on code.google.com by roy.yu...@gmail.com on 12 Nov 2012 at 3:02

GoogleCodeExporter commented 8 years ago
testtest1.txt starts with an extra BOM character, so the text to be matched on the first line becomes "‹BOM›一國".
See the screenshot.
Compare testtest1-no-bom.txt with testtest1.txt.
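
A minimal Python sketch of the effect described above. It is not OpenCC code. The tab-separated "key<TAB>value" layout and the two sample entries are assumptions made for illustration. Decoding with the plain "utf-8" codec keeps the BOM as U+FEFF, so it becomes part of the first key.

```python
import codecs

# Hypothetical dictionary content, saved with a leading UTF-8 BOM.
# The "key<TAB>value" layout and the entries are assumptions for illustration.
raw = codecs.BOM_UTF8 + "一國\t一国\n兩制\t两制\n".encode("utf-8")


def parse(text):
    """Parse tab-separated key/value lines into a dict."""
    entries = {}
    for line in text.splitlines():
        if line:
            key, value = line.split("\t", 1)
            entries[key] = value
    return entries


# The plain "utf-8" codec does not strip the BOM; it decodes to U+FEFF.
entries = parse(raw.decode("utf-8"))

first_key = next(iter(entries))
print(first_key.startswith("\ufeff"))  # the first key carries the BOM
print("一國" in entries)                # the BOM-free lookup misses the first entry
print("兩制" in entries)                # entries on later lines are unaffected
```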

Original comment by chen....@gmail.com on 13 Nov 2012 at 3:43

GoogleCodeExporter commented 8 years ago
Reference: http://www.auiou.com/relevant/00000470.jsp

Original comment by chen....@gmail.com on 13 Nov 2012 at 3:45

GoogleCodeExporter commented 8 years ago
I see, so that's what it was.
Saving the dictionary file as UTF-8 without a BOM makes it work.
How silly of me.
Thanks for the reply!

Original comment by roy.yu...@gmail.com on 13 Nov 2012 at 4:22

GoogleCodeExporter commented 8 years ago
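A small aside: neither of the two programs mentioned in the link in Comment #2 is free, open-source, or freeware software.
The free software Notepad++ can also remove the BOM from plain-text files on Windows.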
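
For anyone who prefers a script to an editor, the same clean-up can be sketched in a few lines of Python. The function name `strip_utf8_bom` is hypothetical, and the sketch assumes the file is small enough to read into memory.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(path):
    """Rewrite the file without a leading UTF-8 BOM.

    Returns True if a BOM was found and removed, False if the file
    had no BOM and was left untouched.
    """
    p = Path(path)
    data = p.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    p.write_bytes(data[len(codecs.BOM_UTF8):])
    return True
```

Calling `strip_utf8_bom("testtest1.txt")` would rewrite the dictionary file in place, so keep a backup if the original matters.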

Original comment by damage3...@gmail.com on 13 Nov 2012 at 4:25

GoogleCodeExporter commented 8 years ago

Original comment by byvo...@gmail.com on 15 Nov 2012 at 6:31

GoogleCodeExporter commented 8 years ago
BOMs are still fairly common on Windows, so I will try to solve this problem in the near future.

Original comment by damage3...@gmail.com on 15 Nov 2012 at 6:39
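
One general way for a reader to tolerate a BOM is to skip it while decoding. The Python sketch below illustrates the idea only and is not the change made in OpenCC. The function name `load_dictionary` and the tab-separated layout are assumptions. Python's "utf-8-sig" codec drops a leading BOM if one is present and otherwise behaves like "utf-8".

```python
def load_dictionary(data: bytes) -> dict:
    """Parse tab-separated entries, ignoring a leading UTF-8 BOM if present."""
    # "utf-8-sig" strips one leading BOM; input without a BOM decodes as usual.
    text = data.decode("utf-8-sig")
    entries = {}
    for line in text.splitlines():
        if not line:
            continue
        key, value = line.split("\t", 1)
        entries[key] = value
    return entries
```

With this approach a dictionary saved with or without a BOM yields the same keys, so users do not need to re-save their files.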

GoogleCodeExporter commented 8 years ago
I made a pull request for this issue, please review it.
https://github.com/BYVoid/OpenCC/pull/13

Original comment by damage3...@gmail.com on 16 Nov 2012 at 5:17

GoogleCodeExporter commented 8 years ago
Fixed by 
https://github.com/BYVoid/OpenCC/commit/b8d6c994037878fbc5fa7f21941f8bee08f94e39

Original comment by damage3...@gmail.com on 20 Feb 2013 at 8:42