freme-project / e-Publishing


Generating e-Pub fails on remote server #19

Open ghsnd opened 9 years ago

ghsnd commented 9 years ago

When sending the file files.zip to the e-Publishing service, the resulting epub file is not complete. In the epub, only a part of the file OEPBS/content.opf is written.

On the FREME server, it works with version 0.4 of the Broker, but not with the dev version.

When running on a local machine there is no problem using the dev version; we cannot reproduce it.

The command is

curl --form "htmlZip=@files.zip" --form metadata='{"titles":["Semantic Book"],"creators":[{"firstName": "Joske","lastName": "Vermeulen","roles":["author"]}],"subjects":["news","world"],"language":"en","identifier":{"value":"urn:1235-568-78910"},"tableOfContents":[{"title":"Introduction","resource":"introduction.html"},{"title":"Etymology","resource":"etymology.html"},{"title":"History","resource":"history.html"}]}' http://api-dev.freme-project.eu/current/e-publishing/html > semantic_book.epub

The epub file is just a zip file, so it is easy to see what's in it.
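A minimal sketch of that inspection in Java, using only `java.util.zip` from the standard library. The class name `EpubLister` and the method `listEntries` are made up for this illustration and are not part of the e-Publishing service. It counts the bytes actually read for each entry, so a cut-off `content.opf` would show up as an unexpectedly small size.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of FREME: lists the entries of an epub (zip) file.
public class EpubLister {

    /** Returns entry name -> number of uncompressed bytes, in archive order. */
    public static Map<String, Long> listEntries(InputStream epub) throws IOException {
        Map<String, Long> sizes = new LinkedHashMap<>();
        byte[] buffer = new byte[8192];
        try (ZipInputStream zip = new ZipInputStream(epub)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Count bytes by reading; entry.getSize() can be -1 for streamed archives.
                long size = 0;
                int n;
                while ((n = zip.read(buffer)) != -1) {
                    size += n;
                }
                sizes.put(entry.getName(), size);
            }
        }
        return sizes;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java EpubLister semantic_book.epub
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            listEntries(in).forEach((name, size) -> System.out.println(size + "\t" + name));
        }
    }
}
```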

So the first questions ( @jnehring ) are:

jnehring commented 9 years ago

Has anything changed on the FREME server that could limit the size of the response?

I have configured very large request size limits, I think 2 GB. But I never tested large requests. How large is your request?

Is it possible for us to see the log files on the dev server (can we get a limited account, or are the logs published somewhere)?

I will send you credentials. You can find the freme-dev log files in /opt/freme/logs

Which exact JVM is used?

$ java -version
java version "1.8.0_66"
Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)

It is the same on freme-dev and freme-live. Does this answer your question?

ghsnd commented 9 years ago

Yes, thanks.

The file is only about 2 MB, so that won't be the problem.

jnehring commented 8 years ago

In broker.log you can find:

ERROR   2015-12-09 15:51:54,582 [http-nio-8084-exec-1] nl.siegmann.epublib.epub.Epub3Writer  -
org.xml.sax.SAXParseException; lineNumber: 150; columnNumber: 23; The entity "nbsp" was referenced, but not declared.
        at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
        at nl.siegmann.epublib.epub.Epub3PackageDocumentWriter.writeItem(Epub3PackageDocumentWriter.java:151)
        at nl.siegmann.epublib.epub.Epub3PackageDocumentWriter.writeManifest(Epub3PackageDocumentWriter.java:95)
        at nl.siegmann.epublib.epub.Epub3PackageDocumentWriter.write(Epub3PackageDocumentWriter.java:51)
        at nl.siegmann.epublib.epub.Epub3Writer.writePackageDocument(Epub3Writer.java:72)
        at nl.siegmann.epublib.epub.Epub3Writer.write(Epub3Writer.java:45)
        at eu.freme.eservices.epublishing.EPubCreatorImpl.onEnd(EPubCreatorImpl.java:262)
        at eu.freme.eservices.epublishing.EPublishingService.createEPUB(EPublishingService.java:67)
        at eu.freme.broker.eservices.EPublishing.htmlToEPub(EPublishing.java:71)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62

Unfortunately this is not new information to @pheyvaer
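For context on the exception above: XML predefines only five entities (`&lt;`, `&gt;`, `&amp;`, `&apos;`, `&quot;`), so `&nbsp;` is an error for an XML parser unless a DTD declares it. The sketch below reproduces that with the JDK's own JAXP parser. The class `NbspRepro` and its method are illustrative names only; this is not the code path epublib uses, just the same kind of parse.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical reproduction, not FREME or epublib code.
public class NbspRepro {

    /** Returns null if the markup parses as XML, otherwise the parser's error message. */
    public static String parseError(String markup) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors instead of printing them to stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new ByteArrayInputStream(markup.getBytes(StandardCharsets.UTF_8)));
            return null;
        } catch (SAXParseException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // Named HTML entity without a DTD: rejected by the XML parser.
        System.out.println(parseError("<p>a&nbsp;b</p>"));
        // Numeric character reference for the same character: accepted, prints null.
        System.out.println(parseError("<p>a&#160;b</p>"));
    }
}
```

Replacing `&nbsp;` with `&#160;` in the source HTML is one way to keep such a file parseable as XML.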

pheyvaer commented 8 years ago

@jnehring As discussed at the F2F, you were going to check this, right?

jnehring commented 8 years ago

Thanks for the reminder. I will take a look at the issue.

jnehring commented 8 years ago

I tried to reproduce the bug. I used your curl request and your files.zip. I cannot see any error messages in the log file on freme-dev. I can also open the ebook with Readium. The file OEPBS/content.opf looks like a valid XML file to me.

I renamed content.opf to content.txt and attached it to this issue. @pheyvaer can you please check if the file is ok?

content.txt

pheyvaer commented 8 years ago

It just happened again ...

pheyvaer commented 8 years ago

But now it also happens locally.

jnehring commented 8 years ago

I could not reproduce it on my Windows 8 machine.

pheyvaer commented 8 years ago

It might be caused by a specific html file on our side. We are looking into it.

pheyvaer commented 8 years ago

UPDATE: It works with the example at the beginning of this issue, but not with our current example (the Semantic Book). There is no error message whatsoever; content.opf is just incomplete.

curl --form "htmlZip=@files.zip" --form metadata='{"titles":["Semantic Book"],"creators":[{"firstName": "Frank","lastName": "Salliau","roles":["author"]}],"subjects":["news","world"],"language":"en","identifier":{"value":"urn:1235-568-78910"},"tableOfContents":[{"title":"Search","resource":"search.xhtml"},{"title":"Introduction","resource":"introduction.html"},{"title":"Etymology","resource":"etymology.html"},{"title":"History","resource":"history.html"},{"title":"Geography","resource":"geography.html"},{"title":"Administration","resource":"administration.html"},{"title":"Cityscape","resource":"cityscape.html"},{"title":"Economy","resource":"economy.html"},{"title":"Demographics","resource":"demographics.html"},{"title":"Culture and Contemporary Life","resource":"culture_and_contemporary_life.html"},{"title":"Education","resource":"education.html"},{"title":"Environment","resource":"environment.html"},{"title":"Olympic Games","resource":"olympic_games.html"},{"title":"Special Olympics","resource":"special_olympics.html"},{"title":"International Relations","resource":"international_relations.html"},{"title":"Other Locations Named After Athens","resource":"other_locations_named_after_athens.html"},{"title":"See Also","resource":"see_also.html"},{"title":"References","resource":"references.html"},{"title":"External Links","resource":"external_links.html"}]}' http://api-dev.freme-project.eu/current/e-publishing/html > semantic_book.epub

files.zip

pheyvaer commented 8 years ago

OK, so it fails if one of the submitted files is not valid.

jnehring commented 8 years ago

Maybe we can add some validation and return an error message instead of an invalid ebook?
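One possible shape for such a check, as a rough sketch only. It assumes that "valid" means "parses as well-formed XML", which is what the stack trace earlier in this thread suggests but is not confirmed here. `HtmlZipValidator`, `findInvalidEntries` and `zipOf` are hypothetical names, not existing FREME or epublib APIs. The idea is to scan the uploaded zip before building the ebook and collect the names of files that fail to parse, so the service could return them in an error response.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical pre-check, not part of FREME: finds .html/.xhtml entries that are not well-formed XML.
public class HtmlZipValidator {

    /** Returns the names of all .html/.xhtml entries that fail to parse as XML. */
    public static List<String> findInvalidEntries(InputStream zipStream) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        List<String> invalid = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                if (entry.isDirectory() || !(name.endsWith(".html") || name.endsWith(".xhtml"))) {
                    continue;
                }
                // Copy the entry first: the parser may close the stream it is given.
                byte[] content = readCurrentEntry(zip);
                DocumentBuilder builder = factory.newDocumentBuilder();
                builder.setErrorHandler(new DefaultHandler()); // rethrow instead of printing
                try {
                    // Note: files with a DOCTYPE may make the parser resolve the external DTD;
                    // this sketch does not address that.
                    builder.parse(new ByteArrayInputStream(content));
                } catch (SAXException e) {
                    invalid.add(name);
                }
            }
        }
        return invalid;
    }

    private static byte[] readCurrentEntry(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    /** Demo helper: builds an in-memory zip from alternating name/content arguments. */
    public static byte[] zipOf(String... namesAndContents) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (int i = 0; i + 1 < namesAndContents.length; i += 2) {
                zip.putNextEntry(new ZipEntry(namesAndContents[i]));
                zip.write(namesAndContents[i + 1].getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] zip = zipOf("good.html", "<p>ok</p>", "bad.html", "<p>a&nbsp;b</p>");
        System.out.println(findInvalidEntries(new ByteArrayInputStream(zip))); // prints [bad.html]
    }
}
```

If the returned list is non-empty, the endpoint could respond with an HTTP error that names the offending files instead of streaming back a partial epub.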