WebCuratorTool / webcurator

The root of the webcurator tool project, containing all modules needed to run a fully functional webcurator tool.
Apache License 2.0

Harvested crawls not displayed #67

Closed: troloff closed this issue 2 years ago

troloff commented 2 years ago

I'm currently setting up WebCurator and have a test environment ready.

Thanks to your great tutorial, I've set up the first crawls. After some time, when I come back to the dashboard, it reports 4 target instances "ready for review" (screenshot: dashboard).

However, if I click on "harvested", only the first one is displayed (screenshot: Target Instances_1).

I checked the WCT Store directory, and all the crawls I started are there:

root@webarchive:/usr/local/wct/store# find . -name *.warc
./12/1/IAH-20220328150009474-00000-813~webarchive.net.adk.de~8443.warc
./33/1/IAH-20220325105013639-00000-158564~webarchive.net.adk.de~8443.warc
./29/1/IAH-20220325103842200-00000-158564~webarchive.net.adk.de~8443.warc
./37/1/IAH-20220325135817400-00000-158564~webarchive.net.adk.de~8443.warc
./26/1/IAH-20220330125535725-00000-813~webarchive.net.adk.de~8443.warc
./28/1/IAH-20220325103517901-00000-158564~webarchive.net.adk.de~8443.warc
./31/1/IAH-20220330154537316-00000-813~webarchive.net.adk.de~8443.warc
./35/1/IAH-20220330164504871-00000-813~webarchive.net.adk.de~8443.warc

If I enter another crawl ID manually in the search form, the respective crawl is displayed (the URL is clickable and shows details about the crawl) (screenshot: Target Instances_2).

I would expect all finished crawls to be displayed in the table. What am I doing wrong here?

Thanks, Torsten

troloff commented 2 years ago

The page doesn't seem to be rendered completely; the HTML just stops at:

<td class="tableRowLite">28/03/2022 17:00:01</td>
<td class="tableRowLite">Harvested</td>
<td class="tableRowLite">T. Roloff</td>
<td class="tableRowLite">00:01:08:10&nbsp;</td>
<td class="tableRowLite">54,22 MB&nbsp;</td>
<td class="tableRowLite">3795&nbsp;</td>

HTML source ends here.

I found something in the logs:

2022-03-31 13:33:40.297 +0200 ERROR [http-nio-8080-exec-86] o.a.j.l.DirectJDKLog (DirectJDKLog.java:175) - Servlet.service() for servlet [dispatcherServlet] in context with path [/wct] threw exception [Request processing failed; nested exception is org.apache.tiles.request.render.CannotRenderException: JSPException including path '/jsp/qa-queue.jsp'.] with root cause java.lang.NumberFormatException: For input string: "0,0"

Seems to be an i18n issue? I'll look into /jsp/qa-queue.jsp to see if I can find the value causing trouble.
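
For anyone who wants to reproduce the exception outside of WCT, here is a minimal sketch in plain Java. It assumes the string "0,0" was produced by formatting a number under a German locale, which the log does not confirm; the class and method names (`LocaleDecimalDemo`, `formatWith`, `tryParse`) are made up for illustration and are not part of WCT.

```java
import java.util.Locale;

// Hypothetical demo class, not WCT code.
final class LocaleDecimalDemo {

    // Formats a value with one decimal place using the given locale.
    static String formatWith(Locale locale, double value) {
        return String.format(locale, "%.1f", value);
    }

    // Tries the locale-independent parser and reports what happened.
    static String tryParse(String text) {
        try {
            return "parsed " + Double.parseDouble(text);
        } catch (NumberFormatException e) {
            return "NumberFormatException: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        String german = formatWith(Locale.GERMANY, 0.0);
        String english = formatWith(Locale.UK, 0.0);
        System.out.println(german);            // prints 0,0
        System.out.println(english);           // prints 0.0
        System.out.println(tryParse(english)); // prints parsed 0.0
        System.out.println(tryParse(german));  // prints NumberFormatException: For input string: "0,0"
    }
}
```

The last line matches the root cause in the log: `Double.parseDouble` only accepts a dot as the decimal separator, regardless of the locale the JVM was started with.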

troloff commented 2 years ago

Ok, managed to fix it on my own: I had a locale de_DE set on my server, and the wct-webapp service was started with this locale. Now I'm passing "LANG=en_US" before starting the service, and everything seems to be ok.

The problem was the wrong locale when a number is tested (line 896 of jsp/qa-queue.jsp):

<c:when test="${intFailed < 100.0}">

This test fails when the wct-webapp service is started with the "wrong" locale.
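
As a possible alternative to forcing LANG, a value that may contain a decimal comma could be parsed with an explicit locale before it is compared. This is only a sketch in plain Java under that assumption, not how qa-queue.jsp is written; `LocaleAwareParse`, `parseLocalized` and `belowThreshold` are hypothetical names.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical helper class, not WCT code.
final class LocaleAwareParse {

    // Parses text using the decimal separator of the given locale.
    static double parseLocalized(String text, Locale locale) throws ParseException {
        return NumberFormat.getInstance(locale).parse(text).doubleValue();
    }

    // Mirrors a "value < threshold" test on a localized string.
    static boolean belowThreshold(String text, Locale locale, double threshold)
            throws ParseException {
        return parseLocalized(text, locale) < threshold;
    }

    public static void main(String[] args) throws ParseException {
        System.out.println(parseLocalized("0,0", Locale.GERMANY));        // prints 0.0
        System.out.println(parseLocalized("54,22", Locale.GERMANY));      // prints 54.22
        System.out.println(belowThreshold("0,0", Locale.GERMANY, 100.0)); // prints true
    }
}
```

Note that `NumberFormat.parse` is lenient and stops at the first character it cannot interpret, so the locale passed in has to match the one that produced the string.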

Strangely enough, as far as I understand, the locale is set on line 739 with <fmt:setLocale value="en_GB" />, but that seems to be ignored?