nmdias / FeedKit

An RSS, Atom and JSON Feed parser written in Swift
MIT License
1.19k stars, 174 forks

Character encoding #62

Closed: cortnum closed this issue 5 years ago

cortnum commented 6 years ago

Hi,

I really like your pod, it's working great except for one issue.

I'm having problems with parsing a Danish RSS feed. The strings are not coming back with correct encoding and therefore weird characters are showing up in the app.

Help would be appreciated :)

nmdias commented 6 years ago

Hi @cortnum,

I have a few questions. What version of FeedKit are you using right now? Is this a UTF-8 encoded feed? Also, is this a public feed? If so, could you provide a link so I can check it out?

Thanks

cortnum commented 6 years ago

Hi @nmdias,

Thx for the quick response!

It is public and it is XML. Here is the link: https://www.bold.dk/feed/rss_by_tag/7976. I see the encoding is set to ISO-8859-1. I am using the newest version: 8.0.0.

/Mikkel

nmdias commented 6 years ago

@cortnum would you mind trying out version 7.1.0? I think that one will work for you.

Versions 7.1.1 and above automatically convert the feed to UTF-8. My understanding now is that FeedKit should decode using the encoding specified in the XML file instead of forcibly converting it to UTF-8. I'll get to it eventually. For now, please try 7.1.0 if you don't mind. The output looks good here, but I don't speak Danish, so I'm not sure whether all the characters are correct.
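To illustrate what "handling the encoding specified in the XML file" could look like, here is a minimal sketch that reads the encoding name from the XML declaration and maps it to a `String.Encoding`. The helpers `declaredEncodingName(in:)` and `stringEncoding(named:)` are hypothetical names made up for this example and are not part of FeedKit. The sketch also assumes the declaration is stored in an ASCII-compatible encoding, so it would not work for UTF-16 feeds.

```swift
import Foundation

/// Hypothetical helper (not FeedKit API): returns the encoding name found in
/// the XML declaration, or nil when no encoding is declared.
func declaredEncodingName(in data: Data) -> String? {
    // ISO-8859-1 accepts any byte, so it is a safe way to read the
    // ASCII-only declaration at the start of the document.
    guard let head = String(data: data.prefix(200), encoding: .isoLatin1) else { return nil }
    let pattern = "<\\?xml[^>]*encoding=[\"']([A-Za-z0-9._-]+)[\"']"
    guard let regex = try? NSRegularExpression(pattern: pattern),
          let match = regex.firstMatch(in: head, range: NSRange(head.startIndex..., in: head)),
          let range = Range(match.range(at: 1), in: head) else { return nil }
    return String(head[range])
}

/// Maps a few common encoding names to String.Encoding.
/// The list is illustrative, not complete.
func stringEncoding(named name: String) -> String.Encoding? {
    switch name.lowercased() {
    case "utf-8": return .utf8
    case "iso-8859-1", "latin1": return .isoLatin1
    case "windows-1252": return .windowsCP1252
    default: return nil
    }
}
```

With the encoding known, the raw bytes could be decoded with `String(data:encoding:)` instead of being treated as UTF-8 unconditionally.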

Thanks

cortnum commented 6 years ago

I tried with version 7.1.0, but now the initializer for the FeedParser returns nil.

With version 8.0.0, I made a small hack to convert the characters to their Danish equivalents.
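The hack itself is not shown in the thread. As one possible workaround, assuming the feed bytes really are ISO-8859-1 as declared, the data could be transcoded to UTF-8 before it reaches the parser. The function name `transcodeLatin1ToUTF8` is hypothetical, and the sketch only rewrites a double-quoted `encoding="ISO-8859-1"` attribute.

```swift
import Foundation

/// Hypothetical helper (not FeedKit API): re-encodes an ISO-8859-1 feed as
/// UTF-8 and updates the encoding named in the XML declaration to match.
func transcodeLatin1ToUTF8(_ data: Data) -> Data? {
    // Decode the raw bytes using the encoding the feed declares.
    guard let text = String(data: data, encoding: .isoLatin1) else { return nil }
    // Make the declaration agree with the new byte encoding.
    // Assumption: the attribute uses double quotes, as most feeds do.
    let fixed = text.replacingOccurrences(
        of: "encoding=\"ISO-8859-1\"",
        with: "encoding=\"UTF-8\"",
        options: .caseInsensitive
    )
    // Re-encode the corrected text as UTF-8 bytes.
    return fixed.data(using: .utf8)
}
```

The resulting `Data` could then be handed to a data-based parser initializer instead of the feed URL. Whether that avoids the problem in a given FeedKit version is untested here.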

Will watch for updates - again awesome library you have made and I can't wait to check out your DefaultsKit, looks really cool too.

Thanks again for the quick response :)

/Mikkel

nmdias commented 5 years ago

Hi, @cortnum

It's been a while, but here goes. After some banging my head against the wall, I don't see FeedKit adopting this right now. It uses Apple's XMLParser, and as soon as I provide another encoding such as ISO-8859-1, it breaks in ways that are outside of my control.

An interesting solution would be to adopt libxml2 and rewrite FeedKit's parser, but that would be some heavy development right now. Considering you found an alternative, I'll close this for now.

Thanks