honghaoz / Ji

Ji (戟) is an XML/HTML parser for Swift
MIT License
824 stars 65 forks

Content is completely garbled after initializing Ji with a GBK-encoded HTML string #22

Closed zixun closed 8 years ago

zixun commented 8 years ago

The web page is GBK-encoded, and I initialize Ji like this:

 let NSGBKStringEncoding = CFStringConvertEncodingToNSStringEncoding(CFStringEncoding(CFStringEncodings.GB_18030_2000.rawValue))
 var htmlString = String(data: data!, encoding: NSGBKStringEncoding)
 let ji = Ji(htmlString: htmlString!, encoding: NSGBKStringEncoding)

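After this conversion, all the Chinese text in Ji's content is garbled, and I don't know why.

The site is the Fantasy Westward Journey (梦幻西游) forum: http://my.netease.com/forum.php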

honghaoz commented 8 years ago

@zixun Thanks for reporting this, I'll take a look at it. I'm not sure whether it's related to this issue: https://github.com/honghaoz/Ji/issues/5#issuecomment-135741737

zixun commented 8 years ago

@honghaoz OK, thank you~~ I'll wait to hear from you~

zixun commented 8 years ago

@honghaoz I found the reason!! It's not Ji's mistake, but mine. The HTML string from the web is GBK-encoded, but I changed some of the HTML code with

 stringByReplacingOccurrencesOfString: withString:

and this API returns an NSUTF8StringEncoding string.

If I tell Ji that the string encoding is UTF8, everything works well. If I don't call the string API above and tell Ji the encoding is GBK, it also works well~
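
For anyone landing here later, a minimal Foundation-only sketch of the general principle behind this finding: the encoding you declare has to match the bytes that actually get parsed. It does not reproduce Ji's internals, it assumes an Apple platform where the CoreFoundation encoding helpers are available, and the HTML snippet and variable names are made up for illustration.

```swift
import Foundation

// GB 18030 (a superset of GBK), built the same way as in the report above.
let gbk = String.Encoding(rawValue: CFStringConvertEncodingToNSStringEncoding(
    CFStringEncoding(CFStringEncodings.GB_18030_2000.rawValue)))

// Hypothetical stand-in for the downloaded page: GBK bytes of a tiny snippet.
let original = "<p>梦幻西游</p>"
let downloaded = original.data(using: gbk)!

// Decode with the page's real encoding, then edit the markup.
let decoded = String(data: downloaded, encoding: gbk)!
let edited = decoded
    .replacingOccurrences(of: "<p>", with: "<div>")
    .replacingOccurrences(of: "</p>", with: "</div>")

// Matching pairs of bytes and declared encoding round-trip cleanly.
let utf8Bytes = edited.data(using: .utf8)!
let gbkBytes = edited.data(using: gbk)!
let utf8RoundTrip = String(data: utf8Bytes, encoding: .utf8)
let gbkRoundTrip = String(data: gbkBytes, encoding: gbk)

// A mismatched pair (UTF-8 bytes read as GBK) gives garbled text or nil.
let misread = String(data: utf8Bytes, encoding: gbk)

print(utf8RoundTrip == edited, gbkRoundTrip == edited, misread == edited)
```

With Ji, the same idea means passing `Ji(htmlString:encoding:)` an encoding that agrees with how the string's bytes will be interpreted, which matches the two working combinations described above.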

honghaoz commented 8 years ago

@zixun Ahaha, cool, thanks for your response! I believe this issue can be closed.