kiwix / kiwix-apple

Kiwix for offline access on iOS and macOS
https://apple.kiwix.org
GNU Lesser General Public License v3.0

ZIM file opens incorrectly on iOS 9 and 11 #73

Closed Popolechien closed 6 years ago

Popolechien commented 6 years ago

I'm copy-pasting the exchange I had with the user (+screenshot):

"I have tried on an iPad 2 running 9.3.5 (downloaded with jailbreak running) and I cannot open the zim file I downloaded from the app. It did initially open but it looked like it was the css code , I did reboot into 9.35 no jailbreak and it simply does not open. Would this be because I am using a 32 bit device perhaps ? I have not attempted this on an iPhone 6 with current iOS 11 but will try later I'm sure. It will not download any other files and just crashes although I have several gb free space."

And then...

"Well I transferred the Wikipedia zim file to an iPhone 6 updated to the new iOS and see the same as what I saw once on the iPad ...." bug

automactic commented 6 years ago

Which zim file are you using?

Popolechien commented 6 years ago

Here's the reply (if it makes sense - if not I'll ask to clarify):

"The question of what file maybe the issue. I did also download the small 6mb dictionary which I think works fine, on iPhone 6 iOS 11. The original was the Wikipedia 0.8 feb 15,2011 .
Observation shows also an icon beside the Wiktionary (Jan 15,2017 ), but not the Wikipedia .. On the iPhone Local folder I cannot download any to the iPad 2, iOS 9.3.5 as it still crashes when I try now, I did download the original file onto it. I haven't had the chance to copy that ,6mb dictionary over via iTunes to the iPad.
Possible break in a larger download ?"

automactic commented 6 years ago

Without knowing the name of the specific ZIM file in question, we cannot investigate further. Also, please try re-downloading the latest version of Kiwix from the App Store on the device running iOS 11.

Popolechien commented 6 years ago

File name is wikipedia_en_wp1-0.8_orig_2010-12.zim (waiting for feedback from re-download)

Popolechien commented 6 years ago

"Both devices have reinstalled.
On iPad 2,2 iOS 9.35 won’t open files but downloaded them. On iPhone 6 iOS 11. Has downloaded three different ones and all worked. ( I’ll note the “browse” dictionary by letter did not work but that’s about as far as I went after downloading the three. )"

automactic commented 6 years ago

This is likely a ZIM file issue. Closed due to lack of activity.

kelson42 commented 6 years ago

@automactic Can you confirm it works for you?

automactic commented 6 years ago

This issue is not easy for me to reproduce, since I have no idea where to get the ZIM file in question (wikipedia_en_wp1-0.8_orig_2010-12.zim).

Also the user mentioned:

"On iPhone 6 iOS 11. Has downloaded three different ones and all worked."

Also 9.3.5 is no longer supported.

I don't see the point of keeping this open.

kelson42 commented 6 years ago

@automactic We need at least to verify the newest version of Kiwix does not have the problem. The file is available here https://download.kiwix.org/zim/wikipedia/wikipedia_en_wp1-0.8_orig_2010-12.zim

kelson42 commented 6 years ago

I confirm that the problem is still there with 1.9.

automactic commented 6 years ago

I did some investigation. I found this problem is caused by string encoding.

The GET request for the main page of wikipedia_en_wp1-0.8_orig_2010-12.zim is as follows:

URL: kiwix://4dd18dea-9afd-080a-50c6-8d62072f5675/A/Main%20Page.html
MIME: text/html; charset=utf-8
Data decoded as UTF-8 by the web view: data.txt

If you take a look at, for example, line 77 of data.txt, you can find strings like –, which are not UTF-8 encoded and look more like ISO-8859-1. (reference) And since the MIME type explicitly says the content is UTF-8 encoded, it's likely the web view refused to load the page.
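To illustrate the kind of mismatch described here, the following is a minimal Swift sketch, not code from the Kiwix app. `decodeLeniently` is a hypothetical helper and the sample bytes are made up for the example. It tries a strict UTF-8 decode first and falls back to ISO-8859-1 only when the bytes are not valid UTF-8.

```swift
import Foundation

// Hypothetical helper (not part of Kiwix): decode as strict UTF-8 first,
// then fall back to ISO-8859-1, and report which encoding was used.
func decodeLeniently(_ data: Data) -> (text: String, encoding: String.Encoding)? {
    // String(data:encoding:) returns nil when the bytes are not valid UTF-8.
    if let text = String(data: data, encoding: .utf8) {
        return (text, .utf8)
    }
    // ISO-8859-1 maps every byte to a code point, so this is the fallback.
    if let text = String(data: data, encoding: .isoLatin1) {
        return (text, .isoLatin1)
    }
    return nil
}

// Made-up sample: "café" encoded as ISO-8859-1. The trailing byte 0xE9
// is not a complete UTF-8 sequence, so the strict decode fails.
let latin1Bytes = Data([0x63, 0x61, 0x66, 0xE9])
if let result = decodeLeniently(latin1Bytes) {
    print(result.text, result.encoding == .isoLatin1)
}
```

A check like this only shows whether a payload matches its declared charset. It does not by itself explain why the web view displayed the page source.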

kelson42 commented 6 years ago

@automactic What is the exact problem? What should be delivered instead of that?

automactic commented 6 years ago

@kelson42 I do not know

automactic commented 6 years ago

@kelson42 Another interesting thing I found is that in pretty much all other ZIM files, the MIME type of an HTML page is text/html, without the charset part. So I forced the content type to be just text/html for all HTML pages in the problematic ZIM file as well. The page loads!

Also, I noticed that the HTML pages in the problematic ZIM file have <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">, whereas other, normal ZIM files all start with an <html> tag. So it could also be an HTML5 / HTML 4 thing.
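The content-type workaround described in this comment (serving text/html without the charset part) can be sketched as below. This is an assumption-laden illustration rather than the actual change made in the app: `splitContentType` is a hypothetical helper that separates the bare MIME type from an optional charset parameter.

```swift
import Foundation

// Hypothetical helper (not part of Kiwix): split a Content-Type value such as
// "text/html; charset=utf-8" into the bare MIME type and the charset, if any.
func splitContentType(_ value: String) -> (mimeType: String, charset: String?) {
    // Parameters are separated from the media type by semicolons.
    let parts = value.split(separator: ";").map {
        $0.trimmingCharacters(in: .whitespaces)
    }
    let mimeType = parts.first?.lowercased() ?? ""
    var charset: String? = nil
    for parameter in parts.dropFirst() {
        let pair = parameter.split(separator: "=", maxSplits: 1).map { String($0) }
        let name = pair.first?.trimmingCharacters(in: .whitespaces).lowercased()
        if pair.count == 2, name == "charset" {
            // Drop optional surrounding quotes and whitespace around the value.
            charset = pair[1]
                .trimmingCharacters(in: .whitespaces)
                .trimmingCharacters(in: CharacterSet(charactersIn: "\""))
        }
    }
    return (mimeType, charset)
}

let parsed = splitContentType("text/html; charset=utf-8")
print(parsed.mimeType, parsed.charset ?? "none")
```

One possible use is to pass the two parts separately as the `mimeType` and `textEncodingName` arguments of `URLResponse`, instead of discarding the charset. Whether that avoids the rendering problem is not confirmed anywhere in this thread.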

kelson42 commented 6 years ago

@automactic I'm not surprised about the problem with the mime-type. If it works well with your approach, then this was probably the best thing to do :) Thx. We can close the ticket then I think.

automactic commented 6 years ago

Are you saying I should always replace text/html; charset=utf-8 with text/html in the mime? I don't think we should.

kelson42 commented 6 years ago

@automactic You have a better alternative?

automactic commented 6 years ago

No, but we should not have this kind of hardcoding in the app. What if someday we do indeed need text/html; charset=utf-8 to correctly display the articles?

kelson42 commented 6 years ago

@automactic As far as I know, text/html; charset=utf-8 is a correct MIME type. So there is no reason to avoid having it in the ZIM file or to have libzim/libkiwix "fix it". The problem seems to be that, for some reason, the iOS HTML viewer has trouble with it, so this needs to be fixed in the iOS app. This is how I see things so far.

automactic commented 6 years ago

Mmm, yes. This does seem to be an iOS-only issue.

I have also tested setting the MIME type of all HTML pages to text/html; charset=utf-8 in other ZIM files: every page failed to load and showed the plain HTML code instead. 🤦🏻‍♂️

Will add the fix in the next beta.

kelson42 commented 6 years ago

As far as I can see, this bug is fixed in 1.9 (7) 👍

kelson42 commented 6 years ago

@automactic Sorry, but the bug seems to have reappeared (both for ICS and Wikipedia 0.8 file).

automactic commented 6 years ago

OK, I think I have found a better fix for this issue. Instead of using the designated initializer of URLResponse, which is a more convenient way of assembling a response, I now use HTTPURLResponse, a subclass of URLResponse, to directly set headers, body, etc.
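A minimal sketch of building a response this way, assuming a plain 200 response with the content type carried as an HTTP header. The `makeResponse` function, the header choice, and the content length are illustrative assumptions; the actual Kiwix implementation is not shown in this thread.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // needed for HTTPURLResponse on non-Apple platforms
#endif

// Hypothetical helper (not the app's real code): build an HTTPURLResponse
// whose content type is carried in the header fields.
func makeResponse(url: URL, contentType: String, contentLength: Int) -> HTTPURLResponse? {
    let headers = [
        "Content-Type": contentType,
        "Content-Length": String(contentLength),
    ]
    // This initializer is failable, so callers must handle nil.
    return HTTPURLResponse(url: url,
                           statusCode: 200,
                           httpVersion: "HTTP/1.1",
                           headerFields: headers)
}

// URL taken from the request quoted earlier in the thread; the length is made up.
let pageURL = URL(string: "kiwix://4dd18dea-9afd-080a-50c6-8d62072f5675/A/Main%20Page.html")!
if let response = makeResponse(url: pageURL,
                               contentType: "text/html; charset=utf-8",
                               contentLength: 1024) {
    print(response.statusCode, response.mimeType ?? "unknown")
}
```

With this approach the full Content-Type value, charset included, travels as a header rather than being passed as the `mimeType` argument of the `URLResponse` initializer.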

Please check in 1.9 (11)

kelson42 commented 6 years ago

Works again