Closed GoogleCodeExporter closed 9 years ago
Per team discussion 6/1/09, will be in the next release.
Original comment by rc...@library.rochester.edu
on 1 Jun 2009 at 3:58
When a harvester tries to harvest the records from the XC's OAI Toolkit server,
and
if that server's mySQL connection was idle for more than 2 hours, we observe
that
the harvest fails at the first time as mySQL throws a "Communications Link
Failure"
exception.
This causes the XC's OAI Toolkit to return a null resumption token, and
therefore
then the harvest stops.
Although, when you try to harvest immediately after that for the second time,
the
harvest runs properly and successfully. This complements that the mySQL
connection
was now active, and therefore it works well.
Original comment by sva...@library.rochester.edu
on 2 Jun 2009 at 2:10
A detailed real-time error one would get when trying to harvest would be
something
like as:
These logs are generated when the MST Harvester was trying to harvest from OAI
Toolkit.
I scheduled a harvest with "oai_dc" format and "Bibliographic records" as
set. It harvested 10000 records. But I see following exception in HarvestIn
log.
26 May 2009 09:55:48,312 DEBUG [Thread-18] - doHarvest() from: null
26 May 2009 09:55:48,312 DEBUG [Thread-18] - harvestAll is true
26 May 2009 09:55:48,343 DEBUG [Thread-18] - A request for ListRecords has
been made. Establishing connection with the data provider...
26 May 2009 09:55:48,343 DEBUG [Thread-18] - Sending the OAI request:
http://128.151.244.132:8080/OAIToolkit/oai-request.do?
verb=ListRecords&metadataPrefix=oai_dc&set=bib
26 May 2009 09:55:56,796 INFO [Thread-18] - Time taken to get a response
from the server 8453
26 May 2009 09:55:56,796 INFO [Thread-18] - Total time taken to get a
response from the server 8453
26 May 2009 09:56:30,875 INFO [Thread-18] - Time taken between placing the
request and begining to process the reply 34079
26 May 2009 09:57:26,343 INFO [Thread-18] - The resumption string is null|5000
26 May 2009 09:57:26,343 INFO [Thread-18] - Time to clear the resumption
token 55468
26 May 2009 09:57:26,343 INFO [Thread-18] - Total time to clear resumption
tokens 55468
26 May 2009 09:57:26,343 DEBUG [Thread-18] - A request for ListRecords with
resumptionToken null|5000 has been made. Establishing connection with the
data provider...
26 May 2009 09:57:26,343 DEBUG [Thread-18] - Sending the OAI request:
http://128.151.244.132:8080/OAIToolkit/oai-request.do?
verb=ListRecords&resumptionToken=null%7C5000
26 May 2009 09:57:26,859 INFO [Thread-18] - Time taken to get a response
from the server 516
26 May 2009 09:57:26,859 INFO [Thread-18] - Total time taken to get a
response from the server 8969
26 May 2009 09:58:03,750 INFO [Thread-18] - Time taken between placing the
request and begining to process the reply 36891
26 May 2009 09:58:58,500 INFO [Thread-18] - The resumption string is
null|10000
26 May 2009 09:58:58,500 INFO [Thread-18] - Time to clear the resumption
token 54750
26 May 2009 09:58:58,500 INFO [Thread-18] - Total time to clear resumption
tokens 110218
26 May 2009 09:58:58,500 DEBUG [Thread-18] - A request for ListRecords with
resumptionToken null|10000 has been made. Establishing connection with the
data provider...
26 May 2009 09:58:58,500 DEBUG [Thread-18] - Sending the OAI request:
http://128.151.244.132:8080/OAIToolkit/oai-request.do?
verb=ListRecords&resumptionToken=null%7C10000
26 May 2009 10:03:59,078 ERROR [Thread-18] - Error getting the HTML document.
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
at org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:78)
at org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:106)
at
org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1116)
at
org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionA
dapte
r.readLine(MultiThreadedHttpConnectionManager.java:1413)
at
org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:
1973)
at
org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:17
35)
at
org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1098)
at
org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry
(HttpMethodDirector.java:398)
at
org.apache.commons.httpclient.HttpMethodDirector.executeMethod
(HttpMethodDirector.java:171)
at org.apache.commons.httpclient.HttpClient.executeMethod
(HttpClient.java:397)
at org.apache.commons.httpclient.HttpClient.executeMethod
(HttpClient.java:323)
at xc.mst.harvester.Harvester.getDoc(Harvester.java:1219)
at xc.mst.harvester.Harvester.doHarvest(Harvester.java:641)
at xc.mst.harvester.Harvester.harvest(Harvester.java:343)
at xc.mst.harvester.HarvestRunner.runHarvestStep(HarvestRunner.java:227)
at xc.mst.harvester.HarvestRunner.runHarvest(HarvestRunner.java:163)
at xc.mst.scheduling.HarvesterWorkerThread.run(HarvesterWorkerThread.java:57)
26 May 2009 10:03:59,281 ERROR [Thread-18] - A Hexeption occurred while
harvesting http://128.151.244.132:8080/OAIToolkit/oai-request.do
xc.mst.harvester.Hexception: The request for data resulted in an invalid
response from the provider. Error: Read timed out
at xc.mst.harvester.Harvester.getDoc(Harvester.java:1308)
at xc.mst.harvester.Harvester.doHarvest(Harvester.java:641)
at xc.mst.harvester.Harvester.harvest(Harvester.java:343)
at xc.mst.harvester.HarvestRunner.runHarvestStep(HarvestRunner.java:227)
at xc.mst.harvester.HarvestRunner.runHarvest(HarvestRunner.java:163)
at xc.mst.scheduling.HarvesterWorkerThread.run(HarvesterWorkerThread.java:57)
26 May 2009 10:04:01,421 ERROR [Thread-18] - An error occurred while
harvesting http://128.151.244.132:8080/OAIToolkit/oai-request.do
xc.mst.harvester.Hexception: The request for data resulted in an invalid
response from the provider. Error: Read timed out
, request was
http://128.151.244.132:8080/OAIToolkit/oai-request.do?
verb=ListRecords&resumptionToken=null%7C10000
at xc.mst.harvester.Harvester.doHarvest(Harvester.java:678)
at xc.mst.harvester.Harvester.harvest(Harvester.java:343)
at xc.mst.harvester.HarvestRunner.runHarvestStep(HarvestRunner.java:227)
at xc.mst.harvester.HarvestRunner.runHarvest(HarvestRunner.java:163)
at xc.mst.scheduling.HarvesterWorkerThread.run(HarvesterWorkerThread.java:57)
It was reported by Sharmila, working on the MST
Original comment by sva...@library.rochester.edu
on 8 Jun 2009 at 2:18
Work on this issue has started. Seems that whenever the mySQL server times out,
which is mentioned in the configuration file of it, the connection dies out.
When someone then tries to harvest, it returns null, as the connection is lost
and
it has not been killed.
The solution I am trying out is to reset the mySQL server when the request
comes,
and if the mySQL server has timed out.
Original comment by sva...@library.rochester.edu
on 31 Jul 2009 at 3:58
The bug has been solved as mentioned the way mentioned in last comment.
Changed Source code has been committed to the SVN repository.
Original comment by sva...@library.rochester.edu
on 4 Aug 2009 at 1:42
Incorporated in the 0.6.1 bug-fixed version.
Original comment by sva...@library.rochester.edu
on 4 Aug 2009 at 9:14
Issue was not solved. My understanding is that the fix was partial. The goal
was to
detect that the connection was timed out, establish a new one and then use the
new
one, and that the last step of using the new connection didn't happen.
Original comment by rc...@library.rochester.edu
on 10 Aug 2009 at 5:27
Now it is included in the 0.6.2 version, so moving it to Done.
Original comment by sva...@library.rochester.edu
on 27 Aug 2009 at 6:56
Original issue reported on code.google.com by
rc...@library.rochester.edu
on 1 Jun 2009 at 3:56