Aegisub / Aegisub

Cross-platform advanced subtitle editor
http://devel.aegisub.org/

Characters encoded at code points beyond `0xFFFF` lead to an error #157

Open niauah opened 4 years ago

niauah commented 4 years ago

I imported an existing srt file and started editing it. When I click on a certain line, an error message like this appears:

[Screenshot 2019-10-24 19-09-31] [Screenshot 2019-10-24 19-09-47]

The line is as follows:

87
00:08:27,900 --> 00:08:31,350
鹿小姐𪜶兩个其中愛佗一个
Lo̍k sió-tsiá in nn̄g ê kî-tiong ài tó tsi̍t ê

This line contains the Hanji character 𪜶 (U+2A736). After removing it, the error is no longer raised.

In my experience handling Hanji, characters at code points beyond 0xFFFF, which need four bytes in UTF-8 and a surrogate pair (two 16-bit code units) in UTF-16, often cause problems. Further discussion on how to fix this would be appreciated.
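
To make the encoding point concrete, here is a minimal Python sketch of how U+2A736 is represented. It is only an illustration of the encodings, not Aegisub code, and the helper name `utf16_surrogates` is made up for this example. Any code that assumes one 16-bit unit per character will see two units for this character, which is a plausible (but unconfirmed) source of the error.

```python
def utf16_surrogates(cp):
    """Split a code point above U+FFFF into its UTF-16 surrogate pair.

    Hypothetical helper for illustration; it follows the standard
    UTF-16 rule: subtract 0x10000, then split the 20-bit result
    into a high 10 bits and a low 10 bits.
    """
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)   # low (trail) surrogate
    return high, low


ch = "\U0002A736"  # the character from the subtitle line above

high, low = utf16_surrogates(ord(ch))
utf8_bytes = ch.encode("utf-8")
utf16_units = len(ch.encode("utf-16-le")) // 2  # number of 16-bit code units

print(f"code point : U+{ord(ch):04X}")
print(f"UTF-8 bytes: {utf8_bytes.hex(' ')} ({len(utf8_bytes)} bytes)")
print(f"UTF-16     : {high:04X} {low:04X} ({utf16_units} code units)")
```

A character in the Basic Multilingual Plane, such as 兩 (U+5169) from the same line, takes a single UTF-16 code unit, so only 𪜶 hits the surrogate-pair path.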

Thanks!

wangqr commented 4 years ago

Possibly related to #121