geometer / FBReaderJ

Official FBReaderJ project repository
http://www.fbreader.org/FBReaderJ/

FBReaderJ crashes on non-ISO 8859-1 epubs with JPG cover images #74

Closed dennis-sheil closed 11 years ago

dennis-sheil commented 11 years ago

The crash happens on the ice-cream-sandwich branch.

When the Android device or emulator is set to a language that does not use ISO 8859-1, for example an ISO 8859-2 language such as Hungarian or Czech, and you open an epub in that language that contains a cover image, the program crashes. It crashes because, while handling the JPG cover image embedded in the epub, the code checks whether the character encoding is windows-1252. The check assumes the encoding is ISO 8859-1 before it even completes, and crashes if that is not the case. I will post more about that in subsequent comments.

This is on the ice-cream-sandwich branch, in a Jellybean (4.1) emulator in Czech language mode. I downloaded one of the epub books the app cannot handle - R.U.R. by Karel Čapek. You can download that epub here ( http://www.gutenberg.org/ebooks/13083.epub ).

Czech epubs without a JPEG cover image loaded fine. Ones like this, with a JPEG cover image, crashed during load.

Hungarian epubs with JPEG cover images also crash the app in similar circumstances: with the emulator set to the Hungarian language and a Hungarian-language (Magyar) book with a JPG cover image loaded, the app crashes as well.

Here is the crash output from when I try to load the book. The crash is repeatable; I've reproduced it several times:

W/System.err( 588): using plugin: ePub/NATIVE

W/dalvikvm( 588): JNI WARNING: input is not valid Modified UTF-8: illegal start byte 0xff

W/dalvikvm( 588): string: 'ÿØÿà'

W/dalvikvm( 588): in Lorg/geometerplus/fbreader/formats/NativeFormatPlugin;.readModelNative:(Lorg/geometerplus/fbreader/bookmodel/BookModel;)Z (NewStringUTF)

I/dalvikvm( 588): "Thread-80" prio=5 tid=11 NATIVE

I/dalvikvm( 588): | group="main" sCount=0 dsCount=0 obj=0x41565ff0 self=0x2a314ad8

I/dalvikvm( 588): | sysTid=664 nice=0 sched=0/0 cgrp=apps handle=707939960

I/dalvikvm( 588): | schedstat=( 288679729 2049960715 86 ) utm=24 stm=4 core=0

I/dalvikvm( 588): #00 pc 00001260 /system/lib/libcorkscrew.so (unwind_backtrace_thread+27)

I/dalvikvm( 588): #01 pc 0005f664 /system/lib/libdvm.so (dvmDumpNativeStack(DebugOutputTarget const*, int)+35)

I/dalvikvm( 588): #02 pc 00053518 /system/lib/libdvm.so (dvmDumpThreadEx(DebugOutputTarget const, Thread, bool)+303)

I/dalvikvm( 588): #03 pc 000535b2 /system/lib/libdvm.so (dvmDumpThread(Thread*, bool)+25)

I/dalvikvm( 588): #04 pc 00038cfa /system/lib/libdvm.so

I/dalvikvm( 588): #05 pc 0003a0ac /system/lib/libdvm.so

I/dalvikvm( 588): #06 pc 0003c30a /system/lib/libdvm.so

I/dalvikvm( 588): #07 pc 000377d8 /data/data/org.geometerplus.zlibrary.ui.android/lib/libNativeFormats-v2.so (AndroidUtil::createJavaString(_JNIEnv*, std::string const&)+15)

I/dalvikvm( 588): #08 pc 00046b50 /data/data/org.geometerplus.zlibrary.ui.android/lib/libNativeFormats-v2.so (ZLUnicodeUtil::toLower(std::string const&)+35)

I/dalvikvm( 588): #09 pc 00047ea8 /data/data/org.geometerplus.zlibrary.ui.android/lib/libNativeFormats-v2.so (ZLXMLReader::readDocument(shared_ptr)+195)

I/dalvikvm( 588): #10 pc 00047f9c /data/data/org.geometerplus.zlibrary.ui.android/lib/libNativeFormats-v2.so (ZLXMLReader::readDocument(ZLFile const&)+15)

I/dalvikvm( 588): #11 pc 0005e658 /data/data/org.geometerplus.zlibrary.ui.android/lib/libNativeFormats-v2.so (XHTMLImageFinder::readImage(ZLFile const&)+87)

I/dalvikvm( 588): #12 pc 0005baba /data/data/org.geometerplus.zlibrary.ui.android/lib/libNativeFormats-v2.so (OEBBookReader::startElementHandler(char const, char const*)+1777)

I/dalvikvm( 588): #13 pc 00048f72 /data/data/org.geometerplus.zlibrary.ui.android/lib/libNativeFormats-v2.so (ZLXMLReaderInternal::fStartElementHandler(void, char const, char const**)+309)

I/dalvikvm( 588): #14 pc 00079402 /data/data/org.geometerplus.zlibrary.ui.android/lib/libNativeFormats-v2.so

I/dalvikvm( 588): #15 pc 00079b64 /data/data/org.geometerplus.zlibrary.ui.android/lib/libNativeFormats-v2.so

I/dalvikvm( 588): #16 pc 000788e6 /data/data/org.geometerplus.zlibrary.ui.android/lib/libNativeFormats-v2.so

I/dalvikvm( 588): #17 pc 00078e12 /data/data/org.geometerplus.zlibrary.ui.android/lib/libNativeFormats-v2.so

I/dalvikvm( 588): #18 pc 0007aaf2 /data/data/org.geometerplus.zlibrary.ui.android/lib/libNativeFormats-v2.so (XML_ParseBuffer+57)

I/dalvikvm( 588): #19 pc 00048012 /data/data/org.geometerplus.zlibrary.ui.android/lib/libNativeFormats-v2.so (ZLXMLReaderInternal::parseBuffer(char const*, unsigned int)+5)

I/dalvikvm( 588): #20 pc 00047f14 /data/data/org.geometerplus.zlibrary.ui.android/lib/libNativeFormats-v2.so (ZLXMLReader::readDocument(shared_ptr)+303)

I/dalvikvm( 588): #21 pc 00047f9c /data/data/org.geometerplus.zlibrary.ui.android/lib/libNativeFormats-v2.so (ZLXMLReader::readDocument(ZLFile const&)+15)

I/dalvikvm( 588): #22 pc 0005ae8a /data/data/org.geometerplus.zlibrary.ui.android/lib/libNativeFormats-v2.so (OEBBookReader::readBook(ZLFile const&)+189)

I/dalvikvm( 588): #23 pc 0005dd86 /data/data/org.geometerplus.zlibrary.ui.android/lib/libNativeFormats-v2.so (OEBPlugin::readModel(BookModel&) const+57)

I/dalvikvm( 588): #24 pc 00036f4a /data/data/org.geometerplus.zlibrary.ui.android/lib/libNativeFormats-v2.so (Java_org_geometerplus_fbreader_formats_NativeFormatPlugin_readModelNative+149)

I/dalvikvm( 588): #25 pc 0001de30 /system/lib/libdvm.so (dvmPlatformInvoke+112)

I/dalvikvm( 588): #26 pc 0004ce66 /system/lib/libdvm.so (dvmCallJNIMethod(unsigned int const, JValue, Method const, Thread)+389)

I/dalvikvm( 588): #27 pc 00038d84 /system/lib/libdvm.so (dvmCheckCallJNIMethod(unsigned int const, JValue, Method const, Thread)+7)

I/dalvikvm( 588): #28 pc 00027260 /system/lib/libdvm.so

I/dalvikvm( 588): #29 pc 0002bb2c /system/lib/libdvm.so (dvmInterpret(Thread, Method const, JValue*)+180)

I/dalvikvm( 588): #30 pc 0005f590 /system/lib/libdvm.so (dvmCallMethodV(Thread, Method const, Object, bool, JValue, std::__va_list)+271)

I/dalvikvm( 588): #31 pc 0005f5ba /system/lib/libdvm.so (dvmCallMethod(Thread, Method const, Object, JValue, ...)+19)

I/dalvikvm( 588): at org.geometerplus.fbreader.formats.NativeFormatPlugin.readModelNative(Native Method)

I/dalvikvm( 588): at org.geometerplus.fbreader.formats.NativeFormatPlugin.readModel(NativeFormatPlugin.java:63)

I/dalvikvm( 588): at org.geometerplus.fbreader.formats.oeb.OEBNativePlugin.readModel(OEBNativePlugin.java:47)

I/dalvikvm( 588): at org.geometerplus.fbreader.bookmodel.BookModel.createModel(BookModel.java:47)

I/dalvikvm( 588): at org.geometerplus.fbreader.fbreader.FBReaderApp.openBookInternal(FBReaderApp.java:259)

I/dalvikvm( 588): at org.geometerplus.fbreader.fbreader.FBReaderApp$1.run(FBReaderApp.java:157)

I/dalvikvm( 588): at org.geometerplus.android.util.UIUtil$3$1.run(UIUtil.java:120)

I/dalvikvm( 588):

E/dalvikvm( 588): VM aborting

F/libc ( 588): Fatal signal 11 (SIGSEGV) at 0xdeadd00d (code=1), thread 664 (Thread-80)

I/DEBUG ( 34): * * * * * * * * * * * * * * * *

I/DEBUG ( 34): Build fingerprint: 'generic/sdk/generic:4.1/JRN83C/391408:eng/test-keys'

I/DEBUG ( 34): pid: 588, tid: 664, name: UNKNOWN >>> org.geometerplus.zlibrary.ui.android <<<

I/DEBUG ( 34): signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr deadd00d

dennis-sheil commented 11 years ago

With the emulator set to Hungarian (Magyar), I see the same behavior. Hungarian epubs without a JPEG cover image are OK, but if the epub does contain one, FBReaderJ crashes. An example of one it crashes on is the book "A bihari remete" ( http://www.gutenberg.org/ebooks/20169.epub ).

dennis-sheil commented 11 years ago

Investigating this line further:

W/dalvikvm( 705): string: 'ÿØÿà'

I see that 'ÿØÿà', or FF D8 FF E0 in hex, is where the JPG image starts in both crashing files. The epubs that work contain no images, while the ones that crash all seem to have cover images. JFIF (JPEG) files begin with the bytes FF D8 (the start-of-image marker) followed by FF E0 (the APP0 marker), and that is exactly the string the program keeps crashing on.
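To illustrate why those four bytes trip the JNI warning in the log above, here is a minimal sketch, not code from FBReaderJ. The function names `looksLikeJpeg` and `isValidModifiedUtf8` are hypothetical. It assumes the usual Modified UTF-8 rules (lead bytes are 0xxxxxxx, 110xxxxx or 1110xxxx, so 0xFF can never start a sequence) and leaves out the special handling of NUL.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if the buffer starts with the JPEG
// start-of-image marker (FF D8) followed by the first byte of
// another marker (FF).
bool looksLikeJpeg(const std::string &data) {
    return data.size() >= 3 &&
           static_cast<unsigned char>(data[0]) == 0xFF &&
           static_cast<unsigned char>(data[1]) == 0xD8 &&
           static_cast<unsigned char>(data[2]) == 0xFF;
}

// Hypothetical helper: simplified structural check for Modified UTF-8.
// Accepts 1-, 2- and 3-byte sequences only; any other lead byte
// (including 0xFF) is rejected. NUL handling is deliberately omitted.
bool isValidModifiedUtf8(const std::string &s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;              // continuation bytes expected
        if (b < 0x80) {
            extra = 0;                      // 0xxxxxxx
        } else if ((b & 0xE0) == 0xC0) {
            extra = 1;                      // 110xxxxx
        } else if ((b & 0xF0) == 0xE0) {
            extra = 2;                      // 1110xxxx
        } else {
            return false;                   // illegal start byte
        }
        if (s.size() - i <= extra) {
            return false;                   // sequence is truncated
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) {
                return false;               // not a 10xxxxxx continuation byte
            }
        }
        i += extra + 1;
    }
    return true;
}
```

Under these assumptions, the buffer FF D8 FF E0 is recognised as a JPEG header and rejected as a string at its very first byte, which matches the "illegal start byte 0xff" message.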

dennis-sheil commented 11 years ago

I set my emulator to English and have been opening English-language epubs with JPEG cover images ( http://www.gutenberg.org/ebooks/11.epub ) without a problem. So this seems to affect only epubs with JPG cover images in certain languages, such as Czech and Hungarian, especially when the Android device/emulator is set to that language.

dennis-sheil commented 11 years ago

I made a very rough kludge to prevent the crash: skip the Windows-1252 character encoding check in the readDocument method of ZLXMLReader. Within readDocument, I commented out the entire if statement that tests whether index is greater than zero.

As this pertains to iso-8859-1, I think somewhere the code assumes it is being fed iso-8859-1 data, when in fact, in this case, it is being fed iso-8859-2 data (Czech, Hungarian... I assume Polish would fail as well).

Anyhow, the scope of the problem is being narrowed down. I'm not completely sure what the purpose of all of this code is, but some of it becomes clearer as I read through it.

geometer commented 11 years ago

Hi,

FBReader does not fail on the book from the first message. Could you please test my newest build? (Please let me know where to send the apk.)

Regards,

-- Nikolay

geometer commented 11 years ago

Does not fail on my device, I mean.

dennis-sheil commented 11 years ago

I am testing against the latest ice-cream-sandwich branch build (September 17th, 6790a1937165210230a6ac038ed23b286328b2e6 ). I don't see any pushes to any branch more recent than three days ago.

As I said, the emulator's language was set to Czech. I don't know whether that is a necessary part of reproducing it.

The code doesn't look like it's acting correctly. It is fed a JPG image and then checks whether the JPG is Windows-1252 encoded. It then fails if the book and/or device encoding is not ISO 8859-1.

I improved my kludge a little. XHTMLImageFinder::readImage now passes a boolean parameter to ZLXMLReader::readDocument to indicate that I am looking for an image. If that boolean is true, I skip the Windows-1252 test. It is kludgey but solves the problem for me - Windows-1252 is still checked for everything except images, which don't need that check.
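A rough sketch of that idea, reconstructed only from the description in this thread and not taken from the FBReaderJ source. The function name `shouldUseWindows1252`, the parameter `lookingForImage`, and the exact `"iso-8859-1"` substring test are all assumptions. The sketch also lowercases with plain ASCII rules instead of going through ZLUnicodeUtil::toLower, which is a swapped-in simplification.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical stand-in for the encoding check described above.
// `head` holds the first bytes of the stream; `lookingForImage` is the
// extra boolean the caller sets when it is scanning for an image.
bool shouldUseWindows1252(const std::string &head, bool lookingForImage) {
    if (lookingForImage) {
        return false;  // skip the check entirely for image lookups
    }
    // Only inspect the text up to the first '>' (the XML declaration).
    const std::string::size_type end = head.find('>');
    if (end == std::string::npos || end == 0) {
        return false;
    }
    std::string decl = head.substr(0, end);
    // ASCII-only lowercasing: bytes >= 0x80 pass through unchanged in the
    // default "C" locale, so binary data is never handed to a string API.
    std::transform(decl.begin(), decl.end(), decl.begin(),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });
    // Assumed test: treat a declared iso-8859-1 encoding as windows-1252.
    return decl.find("\"iso-8859-1\"") != std::string::npos;
}
```

With this shape, a caller such as the image finder would pass `true` and never reach the lowercasing step, while ordinary XML documents keep the existing behaviour.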

So the problem is now fixed for me in my codebase. My solution is probably kludgey and not elegant though.

As I said, I'm using a Jellybean emulator in Czech language mode. That is what crashes. The Hungarian books with cover images crash when I'm in Hungarian language mode. I have a Samsung ICS tablet, but it does not have an ISO 8859-2 language such as Hungarian or Czech as a language option.

dennis-sheil commented 11 years ago

Regarding "Please let me know where to send the apk", you can e-mail me if you have any builds to send. Thanks.

geometer commented 11 years ago

Ok, I'll try to understand your comments tomorrow. :) Anyway please test the build I sent. I do not really expect it will work, but... :)

geometer commented 11 years ago

Fixed in 1.6