nlbdev / nordic-epub3-dtbook-migrator

Tools for converting between a strict subset of DTBook and EPUB3.
http://nlbdev.github.io/nordic-epub3-dtbook-migrator/
GNU Lesser General Public License v2.1
8 stars 7 forks source link

epubcheck not working in 1.5.0: p:xslt returned non-XML result in epubcheck-report-to-pipeline-report.xsl #512

Closed josteinaj closed 1 year ago

josteinaj commented 2 years ago

When validating EPUBs (both 2015-1 and 2020-1), there's an error when parsing epubcheck reports.

The scripts can still be used if epubcheck validation is disabled.

2020-1 stacktrace:

p:xslt returned non-XML result
    at {http://www.w3.org/ns/xproc}xslt(epub3-validate.step.xpl:130)
    at epub3-validate.step.xpl:111
    at epub3-validate.step.xpl:110
    at epub3-validate.step.xpl:77
    at epub3-validate.step.xpl:73
    at {http://www.daisy.org/ns/pipeline/xproc}nordic-epub3-validate.step.2020-1(epub3-validate.xpl:181)
    at epub3-validate.xpl:176
    at epub3-validate.xpl:116

2015-1 stacktrace:

p:xslt returned non-XML result
    at {http://www.w3.org/ns/xproc}xslt(epub3-validate.step.xpl:120)
    at epub3-validate.step.xpl:101
    at epub3-validate.step.xpl:100
    at epub3-validate.step.xpl:67
    at epub3-validate.step.xpl:63
    at {http://www.daisy.org/ns/pipeline/xproc}nordic-epub3-validate.step(epub3-validate.xpl:125)
    at epub3-validate.xpl:120
    at epub3-validate.xpl:116
martinpub commented 2 years ago

Hi @josteinaj,

Is this when EPUBCheck fails or regardless? And are guidelines-revision/date releases and v1.5.0 using the same version of Pipeline?

josteinaj commented 2 years ago

It fails regardless. It has to do with parsing the report document from epubcheck.

I suspect there's some versioning trouble, yes. Since the tests are successful, but the resulting build fails.

josteinaj commented 2 years ago

@martinpub is this critical to have ready before May 1st? I have a lot on my table and I'm not sure if I'll be able to look at this in time. This could also be something @kalaspuffar could have a look at at some point, if he has time?

(In the future, if we want to use the nordic validator outside of Pipeline 2, we'd have to run epubcheck and ace separately anyway.)

martinpub commented 2 years ago

I see @josteinaj. I think that it simplifies communication if we can say that the Nordic Migrator validation bundles an EPUBCheck validation which is run at the beginning.

I don't know what you say @AndersEkl?

I will check internally if this dev work is something MTM could support, will get back.

AndersEkl commented 2 years ago

If we can't solve this issue before the work samples are sent out to the bidders I think we can instruct them to run EPUBCheck separately.

martinpub commented 2 years ago

If we can't solve this issue before the work samples are sent out to the bidders I think we can instruct them to run EPUBCheck separately.

Yes, it is possible but I still want to exhaust the possibilities of us supporting the fix first. It is much preferable I think. But we'll see. If not possible it should be clearly noted in the release that the functionality is lacking.

AndersEkl commented 2 years ago

Yes, of course.

kalaspuffar commented 2 years ago

@martinpub IMHO bundling them into the migrator would give you more headaches and mix concerns. Then again, writing a small runner that could run each separately and merging the results are straightforward. That is pretty much what our validator and pipeline do.

The issue mentioned here is that when validating any EPUBs with EPUBCheck and if so what version of EPUBCheck are you running? 4.2.5?

josteinaj commented 2 years ago

The JAR file in the Docker image is org.w3c.epubcheck-4.2.2.jar. This is from Pipeline 2 v1.14.1. I don't use the most recent Pipeline 2 since there were some build problems last time I tried.

The pom.xml refers to 4.2.5, so maybe that's why it works when running the tests.

Is this related to 3dc9e482e5b79b30dfa2199234bd960652d6ccfa?

josteinaj commented 2 years ago

I managed to upgrade to Pipeline 2 v1.14.2, which includes epubcheck v4.2.5. So version 1.5.1 of the nordic migrator should hopefully now work. The docker image is still building so I haven't tested it in practice yet.

josteinaj commented 2 years ago

It seems that Pipeline 2 v1.14.2 doesn't have epubcheck v4.2.5 either, it still uses v4.2.2, so updating to that didn't work :(

martinpub commented 2 years ago

@martinpub IMHO bundling them into the migrator would give you more headaches and mix concerns. Then again, writing a small runner that could run each separately and merging the results are straightforward. That is pretty much what our validator and pipeline do.

I see, and yeah, probably in the future we will look into this, but currently we are trying to get the present release working with minimal changes.

This is working in the guidelines-revision branch/unofficial date releases. Or at least it is working in 20210901, which we use in production. It is using Pipeline v1.14.3 I think, as the Docker file defines "latest" for pipeline-assembly, and v.1.14.3 was published in July 2021, and the release after that not until end of September. Did you try building with v.1.14.3 @josteinaj or only with later versions?

josteinaj commented 2 years ago

I think we were building using an older version of Pipeline 2, and creating the Docker image using a newer version of Pipeline 2. We could do that again I suppose, and see if that works. But ideally we should run the tests against the same version of Pipeline 2 that we plan to use in production.

josteinaj commented 2 years ago

I released a Docker-only version (tagged as 1.5.2, but it will show up in the web interface and API as 1.5.2-SNAPSHOT) to Docker Hub. It seems to work.

@martinpub @AndersEkl could you give it a try? For Docker, the 1.5.2 version would have to be used, but the 1.5.1 JAR can be used for other installation methods as long as it's based on Pipeline 2 v1.14.3.

martinpub commented 2 years ago

Wohoo, that's great, thanks @josteinaj! Will test right away and report back.

martinpub commented 2 years ago

@josteinaj I can confirm that the docker release v.1.5.2 is working, including EPUBCheck, great fix.

Should it be clarified in https://github.com/nlbdev/nordic-epub3-dtbook-migrator/releases/tag/v1.5.1 that the Docker image is not working? Maybe remove it from Docker Hub if possible?

Also, will v.1.5.2 have a release page?

josteinaj commented 2 years ago

I haven't planned a release page for 1.5.2, no. We should fix the issue with the Pipeline 2 versions used for building/testing and make a release that properly fixes this.

I haven't updated the updater descriptor yet either, so using the updater mechanism won't work yet.

martinpub commented 2 years ago

OK, sounds good. Will you be able to look into a proper fix?

Currently, it's noted on the web page that only Docker images are available for 1.5.2.

josteinaj commented 2 years ago

I don't know when I'll have time to look at doing a proper fix. From the little I tried it seemed to be changes in Pipeline 2 that caused the errors, so I don't know how much work it will be to fix.

josteinaj commented 2 years ago

I tried looking into a proper fix today (upgrading to modules-parent:1.14.3 in pom.xml) but I'm faced with an error I don't know how to fix. I've created an issue for it here in the official Pipeline 2 repository:

martinpub commented 2 years ago

Thanks for looking into this further @josteinaj!

oscarlcarlsson commented 1 year ago

Epubchecks seem to be working in the latest version.