w3c / epubcheck

The conformance checker for EPUB publications
https://www.w3.org/publishing/epubcheck/
BSD 3-Clause "New" or "Revised" License

EPUBCheck crash with StackOverflowError on Windows 10 #940

Closed brianrusnica closed 5 years ago

brianrusnica commented 5 years ago

I recently tested EPUBCHECK 4.2.0 Alpha 1 on my Windows 10 machine and received the attached errors when running EPUBCHECK. I realize this is a slightly outdated version of Win 10 (due to corporate infrastructure management) but it ran all previous EPUBCHECK versions w/o issue.

Machine Specs:

To Repro:

Expected:

Actual:

rdeltour commented 5 years ago

Thanks for the report! Would you be able to share the EPUB which made it crash? (If it's copyrighted material, you can send it privately over email and/or we could sign an NDA if needed).

garconvacher commented 5 years ago

Same here with W10 Pro 1803 (build 17134.523) and Java 8 (1.8.0_191-b12), with all EPUBs (Moby Dick included).

brianrusnica commented 5 years ago

Thanks @rdeltour - as @garconvacher mentions, it appears to happen with all EPUBs I've tried, including a version of Moby Dick I just DL'd from this site: http://www.gutenberg.org/ebooks/2701

mattgarrish commented 5 years ago

I'm running the same Windows setup as @garconvacher but with Java 10.0.1 (JRE 18.3) and not having any issues.

TzviyaSiegman commented 5 years ago

I would love to test it but I can't even download it for "security" reasons. :(

rdeltour commented 5 years ago

@brianrusnica ok, thanks for the clarification. I'll try to reproduce on my Windows box. In the meantime, could you please tell me which version of Java you're using? It might be the reason why it's working on Matt's setup and not yours…

brianrusnica commented 5 years ago

@rdeltour sure thing, it is included in my original ticket: Java Version 8 Update 45 (build 1.8.0_45-b14)

rdeltour commented 5 years ago

> sure thing, it is included in my original ticket

doh, of course 🤦‍♂️ 😄

mattgarrish commented 5 years ago

If it helps, I tried downgrading to Java 1.8.0_45-b14 and I get the same behaviour. The latest Java 8 release runs fine, though. Definitely something in Java changed, but don't look to me to figure out what... :)

rdeltour commented 5 years ago

@brianrusnica @garconvacher could you please retry running EPUBCheck with the option -Xss512k added to the java command?

For instance:

> java -Xss512k -jar epubcheck.jar ..\moby-dick.epub

It seems to be a limitation of Jing (the RelaxNG engine), and the Nu HTML Checker recommends adjusting the thread stack size in that manner to work around it.

If you confirm it works, we could document that somewhere (or see if a better default can be set programmatically).

Edited: to remove the -XX:ThreadStackSize=2048 option recommendation, as -Xss and -XX:ThreadStackSize have the same functionality
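For anyone scripting the check rather than typing the command by hand, the same workaround can be sketched in Python. This is a minimal sketch and not part of EPUBCheck: the function names, parameters, and default paths are hypothetical, and only `-Xss512k` and `-jar epubcheck.jar` are taken from the command above.

```python
import subprocess


def build_epubcheck_command(epub_path, jar_path="epubcheck.jar",
                            java_path="java", stack_size="512k"):
    """Return the argument list for running EPUBCheck with a larger thread stack.

    Hypothetical helper: names and defaults are assumptions for illustration.
    JVM options such as -Xss must come before -jar; anything placed after the
    JAR path is passed to the application instead of the JVM.
    """
    return [java_path, f"-Xss{stack_size}", "-jar", jar_path, epub_path]


def run_epubcheck(epub_path, **kwargs):
    """Run EPUBCheck and return its exit code (needs Java and the JAR locally)."""
    command = build_epubcheck_command(epub_path, **kwargs)
    return subprocess.run(command).returncode
```

Keeping the command construction separate from the call makes it easy to try another value (for example `stack_size="1024k"`) without touching the rest of a script.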

rdeltour commented 5 years ago

(btw, I could reproduce the issue locally, using an up-to-date Windows 10 version 1803 and Java 1.8.0_161. Using the options mentioned did the trick in my case.)

rdeltour commented 5 years ago

Well, as mentioned in the documentation of the stackSize option of Java's Thread constructor, this configuration really belongs to the JRE:

Due to the platform-dependent nature of the behavior of this constructor, extreme care should be exercised in its use. The thread stack size necessary to perform a given computation will likely vary from one JRE implementation to another. In light of this variation, careful tuning of the stack size parameter may be required, and the tuning may need to be repeated for each JRE implementation on which an application is to run.

rdeltour commented 5 years ago

I added a new "Troubleshooting" section to the command line documentation.

I will close this issue once I get confirmation this solution works for others.

garconvacher commented 5 years ago

@rdeltour java -Xss512k works fine!

brianrusnica commented 5 years ago

@rdeltour adding the "-Xss512k" option worked for me as well - I was able to run EPUBCHECK without any further issue. Does this mean I should use this option for all future runs as well?

rdeltour commented 5 years ago

Thank you @garconvacher and @brianrusnica for the confirmation.

> Does this mean I should use this option for all future runs as well?

Yes, correct. The new schemas are apparently a bit more demanding on the thread stack size, and unless we bundle a launching script with EPUBCheck in the future, users will have to set the option themselves.

Also, the issue may not show up on more recent JVMs, which likely use a higher default thread stack size. There might be a difference between 32-bit and 64-bit JVMs.
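One way to see what default a given JVM actually uses is to ask it. As a hedged sketch: HotSpot-based JVMs accept `java -XX:+PrintFlagsFinal -version` and list a `ThreadStackSize` flag (in KB) among the output, and the helper below parses that line. The function names are hypothetical, the exact line layout is an assumption that may differ between JVM vendors and versions, and a reported value of 0 generally means the JVM defers to a platform default.

```python
import re
import subprocess

# Matches "ThreadStackSize = 1024" or "ThreadStackSize := 1024", but not
# CompilerThreadStackSize or VMThreadStackSize (no word boundary before "Thread").
_FLAG_LINE = re.compile(r"\bThreadStackSize\s*:?=\s*(\d+)")


def parse_thread_stack_size_kb(flags_output):
    """Extract the ThreadStackSize value (in KB) from PrintFlagsFinal output.

    Returns None if the flag is not listed. The layout matched here
    (name, '=' or ':=', integer) is an assumption about the JVM's output.
    """
    match = _FLAG_LINE.search(flags_output)
    return int(match.group(1)) if match else None


def default_thread_stack_size_kb(java_path="java"):
    """Ask the JVM at java_path for its default thread stack size in KB."""
    result = subprocess.run(
        [java_path, "-XX:+PrintFlagsFinal", "-version"],
        capture_output=True, text=True, check=True,
    )
    return parse_thread_stack_size_kb(result.stdout)
```

Comparing the value across the JVMs mentioned in this thread would show whether the default really differs between them.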

rdeltour commented 5 years ago

Closing this issue now. Feel free to continue the discussion in the comments, and thanks again for the report and feedback!

Wim-Stijnman commented 1 year ago

Encountered the same problem on June 16, 2023 on a new Windows 11 64-bit machine with all up-to-date software. Applied the suggested change in the epubcheck file plugin.py like this:

#----------------------------------------------------------------------
# define epubcheck command line parameters
# (-Xss1024k raises the JVM thread stack size; after this change both
# branches pass the same arguments)
#----------------------------------------------------------------------
if is32bit:
    args = [java_path, '-Dfile.encoding=UTF8', '-Xss1024k', '-jar', epc_path, epub_path, '-q', '--json', '-']
else:
    args = [java_path, '-Dfile.encoding=UTF8', '-Xss1024k', '-jar', epc_path, epub_path, '-q', '--json', '-']

Epubcheck now runs OK.

jszabo98 commented 1 year ago

-Xss512k is also needed with OpenJDK 1.8.0_242 on Windows 10 21H1. I would put that in the README. It crashes around 196000k. Actually, it was the 32-bit Java that crashed; the 64-bit Java worked perfectly.
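Since the outcome here depended on whether the 32-bit or the 64-bit Java was picked up, it can help to check which one a script is actually launching. A rough sketch, under stated assumptions: `java -version` writes its banner to stderr, and 64-bit HotSpot/OpenJDK builds typically include the text `64-Bit` in the VM line. The function names are hypothetical and the string check is a heuristic, not a documented interface.

```python
import subprocess


def looks_like_64bit_jvm(version_banner):
    """Heuristic: True if the `java -version` banner mentions a 64-bit VM."""
    return "64-bit" in version_banner.lower()


def read_version_banner(java_path="java"):
    """Return the banner printed by `java -version` (it is written to stderr)."""
    result = subprocess.run([java_path, "-version"],
                            capture_output=True, text=True)
    return result.stderr
```

Passing `-Xss` unconditionally, as in the plugin.py change above, remains the simpler choice; a check like this is mainly useful for diagnosing why one machine crashes and another does not.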