eXtensibleCatalog / OAI-Toolkit

Synchronize MARC metadata in most ILSs with XC
5 stars 1 forks source link

Error when run convertload.sh #112

Closed patrickzurek closed 8 years ago

patrickzurek commented 8 years ago

JIRA issue created by: rcook Originally opened: 2012-06-26 09:27 AM

Issue body: GC issue http://code.google.com/p/xcoaitoolkit/issues/detail?id=89 Note that there are many attachments in the GC issue.

Reported by atkins245, Jun 8, 2012

What steps will reproduce the problem?

  1. Following the installation from xc-OAIToolkit
  2. Copy any.mrc to marc folder
  3. Run convertload.sh

What is the expected output? What do you see instead? Hang during the convertload

What version of the product are you using? On what operating system? OAIToolkit 1.08 Linux - openSUSE 12.1 64bit Java - Sun JDK1.6_u32 Tomcat 6.33 mySQL 5.3.23

Please provide any additional information below. I can go to http://localhost:8080/OAIToolkit/index.jsp and then click OAI Sample Request/.... I saw the OAI page but at OAI Identify settings ([tomcat]/bin/OAIToolkit.server.properties) - I saw it not copy from the [tomcat]/bin/OAIToolkit.server.properties.

I coppied all error log here :-)

librarian_convert.log

7.1 KB View Download
librarian_load.log 13.8 KB View Download
lucene_dbStatistics.log 0 bytes Download
programmer.log 94.7 KB View Download
toolkit.log 126 KB View Download
Delete comment Comment 1 by project member cedelis@uillinois.edu, Jun 11, 2012

Hello! It appears that you may be running into two different issues: one with the convertload.sh script and the other with getting the OAI server up and running correctly (with the correct parameters). Let's try to figure out the first issue, if we can, first. I notice that the logs end abruptly (with no reasonable explanation as to 'why'). In my experience, I've been able to obtain more information by running these convert/load scripts via 'nohup', as it tends to log additional information when/if a fatal exception were to occur. Would you be willing to re-run your script as follows?

nohup ./convertload.sh &

Then, if you encounter the same issue, there may be potentially more (useful) error reporting in the nohup.out file (which wasn't reported in the toolkit's error logs.

Delete comment Comment 2 by atkins245, Jun 11, 2012

I followed your instruction and below is the screen capture:

library-test:/OAIToolkit # nohup ./convertload.sh & [1] 6181 library-test:/OAIToolkit # nohup: ignoring input and appending output to `nohup.out' []

It still hang. For about 2 min, I saw a marc file from marc folder move to marc_dest folder. All log files are not updated.

Delete comment Comment 3 by project member cedelis@uillinois.edu, Jun 11, 2012

Can you send me the nohup.out log file? Perhaps it reveals something (maybe an out-of-memory java error, or something like that).. Chris

Delete comment Comment 4 by atkins245, Jun 11, 2012

Hi Chris, The problem is I don't see the nohup.out file in /OAIToolkit/log Do you know where the file locate? I attach the logs in /OAIToolkit/log

librarian_convert.log

4.4 KB View Download
librarian_load.log 7.2 KB View Download
lucene_dbStatistics.log 0 bytes Download
programmer.log 50.6 KB View Download
toolkit.log 68.2 KB View Download
Delete comment Comment 5 by atkins245, Jun 11, 2012

Hi, Thanks for your information. Here is my nohup.out

nohup.out

79.7 KB View Download
Delete comment Comment 6 by project member cedelis@uillinois.edu, Jun 11, 2012

At the end of the nohup.out log, you will notice an OutOfMemory Error (see below). That means you need to add more RAM to the java VM. You can add more RAM by editing convertload.sh and adding the following -Xmx parameter:

-Xmx4g

so that it looks something like:

java -d64 -Xmx4g -cp "lib/xercesImpl.jar:lib/xml-apis.jar" -jar lib/OAIToolkit-1.0.8.jar [etc.]

This assumes, of course, you have enough RAM on your server. Try adding setting this parameter and re-running your script.

Chris

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded at java.lang.StringCoding$StringEncoder.encode(StringCoding.java:232) at java.lang.StringCoding.encode(StringCoding.java:272) at java.lang.String.getBytes(String.java:946) at info.extensiblecatalog.OAIToolkit.utils.XMLValidator.validate(XMLValidator.java:106) at info.extensiblecatalog.OAIToolkit.importer.importers.LuceneImporter.importRecord(LuceneImporter.java:127) at info.extensiblecatalog.OAIToolkit.api.Importer.load(Importer.java:690) at info.extensiblecatalog.OAIToolkit.api.Importer.execute(Importer.java:162) at info.extensiblecatalog.OAIToolkit.api.Importer.main(Importer.java:1024)

Delete comment Comment 7 by atkins245, Jun 12, 2012

Hi, it took over 3 hours to run with almost 100% CPU, 100% memory and 50% swap memory - wow. Here is the nohup file. Now, how can I fix for web access?

nohup.out.zip

1.9 MB Download
Delete comment Comment 8 by project member cedelis@uillinois.edu, Jun 12, 2012

Something went wrong in the load process (the convert process seemed OK):

2012-06-11 16:07:21,526 main INFO - [PRG] Storage type: Lucene 2012-06-11 16:07:21,542 main INFO - [PRG] LuceneIndex: lucene_index 2012-06-11 16:07:22,636 main ERROR - [PRG] The record hasn't got identifier (field 001) 2012-06-11 16:07:22,648 main INFO - [PRG] XML error file is: /OAIToolkit/error_xml/error_records_in_bibs1_30000.xml 2012-06-11 16:07:22,663 main INFO - [LIB] XML error file is: /OAIToolkit/error_xml/error_records_in_bibs1_30000.xml 2012-06-11 16:07:22,927 main ERROR - [PRG] The record hasn't got identifier (field 001) 2012-06-11 16:07:22,934 main ERROR - [PRG] The record hasn't got identifier (field 001) 2012-06-11 16:07:22,961 main ERROR - [PRG] The record hasn't got identifier (field 001) 2012-06-11 16:07:23,062 main ERROR - [PRG] The record hasn't got identifier (field 001) .org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/OAIToolkit/lucene_index/write.lock

This would explain some of the awful resource usage (btw, what hardware are you using? CPU/RAM?). It appears that you re-ran the convertload.sh process on a locked Lucene index. The toolkit should probably behave better during such a scenario (quit right away, instead of trying to continue). In order to solve your issue, you would need to either start over:

http://code.google.com/p/xcoaitoolkit/wiki/StartingOver

Or, you can remove the lock file, and re-run your convertload.sh script (it will pick up where it last left off):

rm /OAIToolkit/lucene_index/write.lock ./convertload.sh

If I were you, I'd probably just start over from scratch. Another useful thing to do, the next time you run convertload.sh, is to look at the nohup.out log while it is processing (or check up on it every once in a while):

tail -f nohup.out

In your scenario above, you would've noticed that it was complaining over and over about not being able to open the Lucene index for writing (locked), and you could've killed the process much earlier on.

Chris

Delete comment Comment 9 by project member cedelis@uillinois.edu, Jun 12, 2012

Also: do most of your MARC records have 001 identifiers? Because the OAI Toolkit currently only supports MARC records with a 001 identifier. Chris

Delete comment Comment 10 by atkins245, Jun 12, 2012

Hi Chris, I StartOver with only sample as installation session. All the logs are OK (ConvertLoad is success). Here are the error I got when I go to http://localhost:8080/OAIToolkit/ then I click OAISample/... Link then click ListRecords

I attach all OAIToolkit log as well as Tomcat log too.

Thanks

HTTP Status 500 -

type Exception report

message

description The server encountered an internal error () that prevented it from fulfilling this request.

exception

javax.servlet.ServletException: Servlet execution threw an exception

root cause

java.lang.ExceptionInInitializerError info.extensiblecatalog.OAIToolkit.struts.action.OaiRequestAction.execute(OaiRequestAction.java:58) org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:419) org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:224) org.apache.struts.action.ActionServlet.process(ActionServlet.java:1194) org.apache.struts.action.ActionServlet.doGet(ActionServlet.java:414) javax.servlet.http.HttpServlet.service(HttpServlet.java:617) javax.servlet.http.HttpServlet.service(HttpServlet.java:717)

root cause

java.lang.NullPointerException info.extensiblecatalog.OAIToolkit.api.Facade.(Facade.java:159) info.extensiblecatalog.OAIToolkit.struts.action.OaiRequestAction.execute(OaiRequestAction.java:58) org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:419) org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:224) org.apache.struts.action.ActionServlet.process(ActionServlet.java:1194) org.apache.struts.action.ActionServlet.doGet(ActionServlet.java:414) javax.servlet.http.HttpServlet.service(HttpServlet.java:617) javax.servlet.http.HttpServlet.service(HttpServlet.java:717)

note The full stack trace of the root cause is available in the Apache Tomcat/6.0.33 logs.

librarian_convert.log

551 bytes View Download
librarian_load.log 824 bytes View Download
programmer.log 6.0 KB View Download
toolkit.log 8.3 KB View Download
catalina.2012-06-12.log 202 KB View Download
catalina.out 2.4 MB View Download
host-manager.2012-06-12.log 0 bytes Download
localhost.2012-06-12.log 29.6 KB View Download
manager.2012-06-12.log 1015 bytes View Download
Delete comment Comment 11 by atkins245, Jun 12, 2012

Now, I will convertload 250,000 records to see how it work

Delete comment Comment 12 by atkins245, Jun 12, 2012

ConvertLoad for 250,000 records is working now. However, I found about 150 records does not have 001. For second error: can't not display list record (Error http 500): I found at /var/log/tomcat6/catalina.out file an error as below: basePropertiesFileName: /srv/tomcat6/bin/OAIToolkit.directory.properties java.lang.Exception: Inexistent configuration file: /srv/tomcat6/bin/OAIToolkit.directory.properties

I know I set up CATALINA_HOME is /usr/share/tomcat6 so I don't know how basePropertiesFileName get information from?

I copied all 5 OAIxxxx properties files from /usr/share/tomcat6/bin to /srv/tomcat6/bin/ and restart server and now I found other error:

INFO: Deploying web application archive OAIToolkit.war log4j:WARN No appenders could be found for logger (org.apache.struts.util.PropertyMessageResources). log4j:WARN Please initialize the log4j system properly. ApplInfo::init(/srv/tomcat6/bin) ApplInfo::init(OAIToolkit) reading application properties basePropertiesFileName: /srv/tomcat6/bin/OAIToolkit.directory.properties java.io.IOException: Permission denied at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createNewFile(File.java:883) at info.extensiblecatalog.OAIToolkit.utils.ConfigUtil.load(ConfigUtil.java:60) at info.extensiblecatalog.OAIToolkit.utils.ApplInfo.initApplication(ApplInfo.java:250) at info.extensiblecatalog.OAIToolkit.utils.ApplInfo.init(ApplInfo.java:153) at info.extensiblecatalog.OAIToolkit.struts.InitializerServlet.init(InitializerServlet.java:104) at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1173) at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:993) at org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:4421) at org.apache.catalina.core.StandardContext.start(StandardContext.java:4734) at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:799) at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:779) at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:601) at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:943) at org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:778) at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:504) at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1317) at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:324) at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:142) at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1065) at org.apache.catalina.core.StandardHost.start(StandardHost.java:840) at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1057) at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:463) at org.apache.catalina.core.StandardService.start(StandardService.java:525) at org.apache.catalina.core.StandardServer.start(StandardServer.java:754) at org.apache.catalina.startup.Catalina.start(Catalina.java:595) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289) at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:414) Jun 12, 2012 4:40:03 PM org.apache.catalina.startup.HostConfig deployDirectory INFO: Deploying web application directory ROOT Jun 12, 2012 4:40:03 PM org.apache.catalina.startup.HostConfig deployDirectory INFO: Deploying web application directory sample Jun 12, 2012 4:40:03 PM org.apache.catalina.startup.HostConfig deployDirectory INFO: Deploying web application directory examples Jun 12, 2012 4:40:03 PM org.apache.coyote.http11.Http11AprProtocol start INFO: Starting Coyote HTTP/1.1 on http-8080 Jun 12, 2012 4:40:03 PM org.apache.coyote.ajp.AjpAprProtocol start INFO: Starting Coyote AJP/1.3 on ajp-8009 Jun 12, 2012 4:40:03 PM org.apache.catalina.startup.Catalina start

Delete comment Comment 13 by project member cedelis@uillinois.edu, Yesterday (17 hours ago)

Are you sure that the tomcat server has read and write privileges for the *.properties files?

Delete comment Comment 14 by atkins245, Yesterday (17 hours ago)

All OAItoolkit*.properties (total 5) in /usr/share/tomcat6/bin are own by root. I found at /var/log/tomcat6/catalina.out file an error as below: basePropertiesFileName: /srv/tomcat6/bin/OAIToolkit.directory.properties java.lang.Exception: Inexistent configuration file: /srv/tomcat6/bin/OAIToolkit.directory.properties

I know I set up CATALINA_HOME is /usr/share/tomcat6 so I don't know how basePropertiesFileName get information: /srv/tomcat6/bin/OAIToolkit.directory.properties?