Open josteinaj opened 8 months ago
Is it a bug in epubcheck? Or in Pipeline? Will -Xss4096k
resolve the issue in all cases, or does it depend on the size of the input EPUB?
I'd say it's a bug in epubcheck. The exception occurs when the NCX is really big. I think a linked list is used internally in epubcheck to represent the reading order when parsing the NCX, and Java hits some recursion limit when -Xss
is too low.
I bet you could construct an EPUB that breaks epubcheck even at -Xss4096k
. But since NCX is deprecated in EPUB, and this many entries in an NCX is rare, I suppose it might not be worth the development effort to fix the epubcheck bug. In practice, this could probably be solved by increasing -Xss
. I don't know the downsides of increasing -Xss
, but just increasing it to -Xss2048k
is probably enough.
I read somewhere that a downside is that if you have a lot of threads, a large -Xss
value could result in a OutOfMemoryError.
@rdeltour What do you think? A good idea to change the default value, or leave it up to users to set the JAVA_OPTS
environment variable?
@josteinaj the recursive algo that would causing OOME was replace by an iteration, so that particular issue should have beenfixed since v5.0.0 (see https://github.com/w3c/epubcheck/pull/1434). Do you still encounter OOME with recent versions of EPUBCheck, and were you able to clearly identify this an issue with NCX reading order checks? If yes, would you mind opening an issue on w3c/epubcheck
?
I read somewhere that a downside is that if you have a lot of threads, a large -Xss value could result in a OutOfMemoryError.
I wasn't aware of that. Do you think our use of threading would make that an issue for the Pipeline?
@rdeltour What do you think? A good idea to change the default value, or leave it up to users to set the JAVA_OPTS environment variable
If increasing the default reduces the probability of memory issues, and can still be reverted by the user (with manual configuration), I think it does make sense.
Do you think our use of threading would make that an issue for the Pipeline?
We don't create that much threads, so I don't think so.
I think it's fine to add the -Xss
. But perhaps we need to revert it when we have updated to epubcheck 5.0.0?
I have not tried with epubcheck v5.0.0. We are setting -Xss on our end now, so this is not an issue for us anymore.
I faced this epubcheck issue, which Romain described a solution for: https://github.com/w3c/epubcheck/issues/1094
It failed with
-Xss1024k
, which is the default, but it succeeded with-Xss2048k
. I've now set JAVA_OPTS to-Xss4096k
for all our Pipeline 2 instances, just to be safe.Could the default be increased? I don't know what the best way would be. Maybe set it in
DEFAULT_JAVA_OPTS
inbin/pipeline2
?