jaysonng closed 2 years ago
After a bit more debugging, I've found that the problem is with the encoding.
.utf8 doesn't work but .ascii does.
I fixed the problem by changing the function
HTML(html: Data, url: String? = nil, encoding: String.Encoding, option: ParseOption = kDefaultHtmlParseOption)
to
// NSData
public func HTML(html: Data, url: String? = nil, encoding: String.Encoding, option: ParseOption = kDefaultHtmlParseOption) throws -> HTMLDocument {
    if let htmlStr = String(data: html, encoding: encoding) {
        return try HTML(html: htmlStr, url: url, encoding: encoding, option: option)
    } else if let htmlStr = String(data: html, encoding: .ascii) {
        return try HTML(html: htmlStr, url: url, encoding: encoding, option: option)
    } else {
        throw ParseError.EncodingMismatch
    }
}
I created a PR for this issue.
It seems that the page at that URL uses a charset other than UTF-8. (The response header states that the charset is UTF-8, but that doesn't appear to be correct.)
I think you need to handle this case in your codebase, not in the library (Kanna).
You can see that charset is not UTF-8 by following the steps below.
$ curl -L https://www.philstar.com/headlines/2021/11/26/2144018/philippines-intently-monitoring-new-covid-19-variant-detected-south-africa > ./source.html
$ file --mime ./source.html
./source.html: text/html; charset=unknown-8bit
Got it.
Thanks for checking.
So the fix I opened a PR for won't be merged? I need to move the logic into my library?
Yes, please move the logic into your code.
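For anyone landing here later, one way to keep this logic on the caller's side is to decode the bytes yourself before handing a String to Kanna. The sketch below is only an illustration: `decodeHTML` is a hypothetical helper name, not part of Kanna, and the fallback encodings (.windowsCP1252, .isoLatin1) are assumptions. The right fallbacks depend on the sites you parse.

```swift
import Foundation

// Hypothetical helper (not part of Kanna): decode raw HTML bytes by trying
// the declared encoding first, then each fallback in order.
// The default fallback list is an assumption; adjust it for your sources.
func decodeHTML(
    _ data: Data,
    declared: String.Encoding = .utf8,
    fallbacks: [String.Encoding] = [.windowsCP1252, .isoLatin1]
) -> (text: String, encoding: String.Encoding)? {
    for candidate in [declared] + fallbacks {
        // String(data:encoding:) returns nil when the bytes are not valid
        // in the candidate encoding, so move on to the next one.
        if let text = String(data: data, encoding: candidate) {
            return (text, candidate)
        }
    }
    // No candidate could decode the bytes.
    return nil
}
```

The decoded text can then go to the String overload that the patch above already calls, e.g. HTML(html: decoded.text, url: nil, encoding: .utf8, option: kDefaultHtmlParseOption). Returning the encoding that actually worked makes it easy to log which pages mislabel their charset.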
Thanks
Description:
I'm building an app that parses Open Graph data from HTML. For this particular news site, parsing its articles fails with the error
The operation couldn’t be completed. (Kanna.ParseError error 1.)
I'm hoping it's something we can fix in the Kanna XML parser (I'm no XML expert, so I can't go further than knowing I don't get back an HTML document). Or is this a website issue?
The link is
this article
If it helps, here is the URL Header response.
thanks,
Installation method:
Kanna version (or commit hash):
5.2.7
swift --version
Swift 5.5
Xcode version (optional):
13.1