ajs6f / fcrepo3-rdf-extractor

A utility to extract RDF triples from Fedora Commons 3 Akubra-based persistence stores.

Question: extractor stops outputting but doesn't quit. #8

Open whikloj opened 6 years ago

whikloj commented 6 years ago

This is weird. I started the tool running against my production objectStore and it was kicking along all nice and easy.

Once it had filled five 20MB log files and written 20GB of triples, it just stopped outputting anything.

The process is still active (in a screened session), but it is not doing anything I can see.

Could it be that my logback.xml is causing this? It is weird that I set the RollingFileAppender to <maxIndex>5</maxIndex> and that's exactly when the process stops.

I tried deleting the log files, but that had no effect.

Thoughts?

whikloj commented 6 years ago

I bumped the logback config to 6 files and it seemed to run a bit further. Then I removed the --logback option; it ran, but stopped again. Not sure what to say: I have 4 * 5.2GB files of quads, but the application did not exit, so I am pretty sure it is not done.

ajs6f commented 6 years ago

Hey, sorry I've been neglectful of this. Quick question: when you write "20GB of triples" and write "4 * 5.2GB files of quads", are these actually the same amount of output, in the same form, or no?

whikloj commented 6 years ago

No worries. The first time, as I recall, the 4 files were about 5.0-5.1GB each, and the second time they were 5.2GB each for sure. So I would say not the same amount of output, but ls -lh is not really a scientific comparison.

Is there something I can do to see what it thinks it's doing when it gets like this?

whikloj commented 6 years ago

Maybe it's my use of screen. You mentioned before how you background a process. How do you do that?

ajs6f commented 6 years ago

I would just use nohup java -jar blah blah &, redirect SYS[OUT|ERR] to taste.
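A minimal sketch of that pattern, for illustration only. The `sh -c '...'` job is a hypothetical stand-in for the real `java -jar ...` invocation, and the `LOG` path and `JOB_PID` variable are assumptions, not anything from the tool itself.

```shell
#!/bin/sh
# Hypothetical log location; pick whatever suits your system.
LOG="${TMPDIR:-/tmp}/extractor-demo.log"

# nohup  : the job ignores SIGHUP, so it keeps running after logout.
# > "$LOG" : stdout goes to the log file.
# 2>&1   : stderr is then pointed at the same place as stdout
#          (note the order: 2>&1, not 2&>1).
# &      : run the job in the background.
# The sh -c '...' part stands in for: java -jar <extractor jar> <options>
nohup sh -c 'echo "progress message"; echo "error message" >&2; sleep 1' \
  > "$LOG" 2>&1 &

# $! holds the PID of the most recent background job.
JOB_PID=$!
echo "started background job $JOB_PID, logging to $LOG"
```

With both streams in one file, `tail -f "$LOG"` shows progress and errors together, and the saved PID can be handed to other tools later.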

whikloj commented 6 years ago

Ok I just started it again with the command

java -jar /opt/fcrepo3-rdf-extractor/target/fcrepo3-rdf-extractor-0.0.1-SNAPSHOT.jar -a /usr/local/fedora/server/config/spring/akubra-llstore.xml -g 'ca.umanitoba.dam.fedora#ri' -o /var/indexes/triples --skipEmptyLiterals > /home/u5/whikloj/rdf-extractor.log 2>&1

We'll see what happens.

whikloj commented 6 years ago

Sorry got busy and left this till now.

Apparently it ran for 2.5 hours and then froze. The process is still there, no exception in the log...just hung.

The log file

[whikloj@jujo]~% tail -f rdf-extractor.log 
INFO 16:57:07.023 (edu.si.fcrepo.ObjectProcessor) Operating on object URI: info:fedora/uofm:2128199
INFO 16:57:07.042 (edu.si.fcrepo.ObjectProcessor) Operating on object URI: info:fedora/uofm:1388187
INFO 16:57:07.043 (edu.si.fcrepo.ObjectProcessor) Operating on object URI: info:fedora/uofm:2900096
INFO 16:57:07.047 (edu.si.fcrepo.ObjectProcessor) Operating on object URI: info:fedora/uofm:2864682
INFO 16:57:07.061 (edu.si.fcrepo.ObjectProcessor) Operating on object URI: info:fedora/uofm:2386909
INFO 16:57:07.066 (edu.si.fcrepo.ObjectProcessor) Operating on object URI: info:fedora/uofm:2076462
INFO 16:57:07.068 (edu.si.fcrepo.ObjectProcessor) Operating on object URI: info:fedora/uofm:2916221
INFO 16:57:07.076 (edu.si.fcrepo.ObjectProcessor) Operating on object URI: info:fedora/uofm:1896895
INFO 16:57:07.086 (edu.si.fcrepo.ObjectProcessor) Operating on object URI: info:fedora/uofm:2847973
INFO 16:57:07.097 (edu.si.fcrepo.ObjectProcessor) Operating on object URI: info:fedora/uofm:2644389

The triple directory.

[whikloj@jujo]~% ll -h /var/indexes/triples
total 20G
-rw-r--r-- 1 whikloj games 5.0G Jun 12 17:57 quads0.nq
-rw-r--r-- 1 whikloj games 5.0G Jun 12 17:57 quads1.nq
-rw-r--r-- 1 whikloj games 5.0G Jun 12 17:57 quads2.nq
-rw-r--r-- 1 whikloj games 5.0G Jun 12 17:57 quads3.nq

All the threads appear to be sleeping in htop.

Not sure what to do next.