SemanticComputing / fuseki-docker

Apache Jena Fuseki with SeCo extensions
MIT License

Manual reindexing inside a container #23

Closed: rjalexa closed this issue 1 year ago

rjalexa commented 1 year ago

Dear friends, I am a grateful user of your image :) I now need to reindex my dataset. I tried running java -cp fuseki-server.jar jena.textindexer --desc=/fuseki-base/configuration/mema.ttl -v from within the running Docker container, but of course the lock is denied, since the running Fuseki process already holds it.

How can I achieve this?

I cannot just stop the container and run this on the host, since I switched to a Docker volume and the data is therefore not available on the host filesystem.

Have a good day. Thank you

rjalexa commented 1 year ago

Never mind, I found a solution. I am leaving it here in case it is useful to anyone in the future.

Problem: I did not have a proper DATASETNAME.ttl file in the /fuseki-base/configuration directory and had already inserted 16M triples. So I needed a way to reindex after fixing the Lucene part of the ttl file.

Solution: I cloned the original project and modified the Dockerfile so that, instead of running its last command, it runs a simple tail -f /dev/null to keep the container up. I rebuilt the image and added it to a docker-compose.yml, mounting the Docker volume used by the regular container.

So now, after stopping the regular "production" container and starting this new one, I could finally run: java -cp fuseki-server.jar jena.textindexer --desc=../fuseki-base/configuration/DATASETNAME (where DATASETNAME of course needs to be substituted with the real one). It works well, and in under a minute my 16M triples were indexed as expected.

Now stop this "indexing container" and restart the production one.
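
For anyone who wants to script the swap, the steps above could be sketched roughly as follows. This is only a sketch under assumptions, not something taken from the project. The container names fuseki and fuseki-indexer are hypothetical placeholders. It uses an absolute path to DATASETNAME.ttl under /fuseki-base/configuration. It also assumes that docker exec starts in the directory containing fuseki-server.jar, as an interactive shell in the container apparently did. By default it only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch of the container swap described above.
# All names below are hypothetical placeholders; adjust them to your setup.
PROD_CONTAINER="${PROD_CONTAINER:-fuseki}"            # assumed name of the production container
INDEX_CONTAINER="${INDEX_CONTAINER:-fuseki-indexer}"  # assumed name of the idle "indexing" container
# Replace DATASETNAME with the real dataset name.
DESC="${DESC:-/fuseki-base/configuration/DATASETNAME.ttl}"

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# The chain stops at the first failing step, so a failed indexing run
# leaves the indexing container up for inspection.
reindex() {
    run docker stop "$PROD_CONTAINER" &&
    run docker start "$INDEX_CONTAINER" &&
    run docker exec "$INDEX_CONTAINER" \
        java -cp fuseki-server.jar jena.textindexer --desc="$DESC" &&
    run docker stop "$INDEX_CONTAINER" &&
    run docker start "$PROD_CONTAINER"
}

reindex
```

Run it once as-is to check the printed commands, then again with DRY_RUN=0. Both containers must mount the same Docker volume, and the production container has to be stopped first so that the indexer can take the lock.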

HTH. Take care