IHTSDO / snowstorm

Scalable SNOMED CT Terminology Server using Elasticsearch

Exit process after import completes #18

Closed nhnicwaller closed 5 years ago

nhnicwaller commented 5 years ago

Please consider adding a command line option that causes snowstorm to exit gracefully after finishing an import.

Scenario

I'm running Snowstorm in two ways.

1) I have snowstorm running as a long-lived, supervised process (webserver mode) that serves responses to client requests.

2) I have a short-lived snowstorm process that starts up when I need to [re]import the SNOMED CT concept database into Elasticsearch. That process runs with options --delete-indices and --import [file].

I would like the second process to exit after the import has successfully completed in order to free up system memory, but currently it just continues running indefinitely.

Workaround

Currently my workaround is to set a timeout on the import process so that it is killed after a number of hours, but that's less optimal compared to snowstorm exiting gracefully as soon as the work is done. It means I'm using memory longer than I need to, and it means there's a slight risk the process could be killed before import completes.

Suggestion

I've seen other software use options like --once or --exit. Maybe one of those would fit here?

kaicode commented 5 years ago

Thanks for describing your use case, I always like to hear how people are using Snowstorm. Nice idea. I think the --exit flag fits as you suggest. It's a quick one, I will add that to develop today.

kaicode commented 5 years ago

@nhnicwaller Have you considered importing the RF2 Delta of the new SNOMED CT release using the REST interface? The --delete-indices and --import flags on the command line are meant as a quickstart to get the content in the first time. After that, the REST API can be used to import new releases of SNOMED CT on top of the original import. With the DELTA import type only the new content needs to be imported, which speeds things up. There is no downtime with this option, and the import happens in a single transaction.

nhnicwaller commented 5 years ago

@kaicode Awesome, thanks for adding this right away!

Yes, I was originally experimenting with the REST interface for imports. If I remember correctly, I ran into a problem where the API call to list active imports didn't work for me. Since I was unable to monitor the status of an import, I decided to spawn a new process instead and that has worked out pretty well so far. I might give the REST interface another try.

I think this can be closed now.

kaicode commented 5 years ago

@nhnicwaller when you make the POST to create the RF2 import there will be a header called Location in the response with the URL of the new import. Use this URL and add /archive to upload the RF2 zip file. Then the URL of the import can be used to monitor the status of the process.
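The flow described above could be sketched roughly as follows in Python. This is only an illustration: the helper names are hypothetical, the terminal status values `COMPLETED` and `FAILED` are assumptions rather than something stated in this thread, and the HTTP calls themselves are left to a caller-supplied function so that no particular endpoint path or response shape is assumed.

```python
import time
from typing import Callable, Iterable


def archive_url(import_url: str) -> str:
    """Build the upload URL by appending /archive to the import URL.

    import_url is the value of the Location header returned by the
    POST that created the import.
    """
    return import_url.rstrip("/") + "/archive"


def wait_for_import(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    max_polls: int = 1000,
    terminal: Iterable[str] = ("COMPLETED", "FAILED"),  # assumed status names
) -> str:
    """Poll until the import reports a terminal status and return it.

    get_status is a caller-supplied function that GETs the import URL
    and extracts the status string from the response.
    """
    terminal_set = set(terminal)
    for _ in range(max_polls):
        status = get_status()
        if status in terminal_set:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("import did not reach a terminal status in time")
```

A caller would keep the `Location` value from the create request, upload the RF2 zip to `archive_url(location)`, and then pass a small function that fetches `location` to `wait_for_import`.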

kaicode commented 5 years ago

The --exit flag is included in release 2.2.0.
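For the short-lived import process from the original scenario, a wrapper along these lines could replace the timeout workaround. It is a minimal sketch under some assumptions: the jar path and RF2 zip path are placeholders, the `--import=<file>` spelling is assumed, and the thread does not say whether a failed import produces a non-zero exit code, so the return code is passed back without interpretation.

```python
import subprocess
from typing import List


def build_import_command(jar_path: str, rf2_zip: str,
                         delete_indices: bool = True) -> List[str]:
    """Assemble the command line for a one-shot import run.

    jar_path and rf2_zip are placeholders supplied by the caller.
    """
    cmd = ["java", "-jar", jar_path]
    if delete_indices:
        cmd.append("--delete-indices")
    cmd.append(f"--import={rf2_zip}")  # assumed --import=<file> form
    cmd.append("--exit")               # exit once the import has finished
    return cmd


def run_import(cmd: List[str]) -> int:
    """Run the import and block until the process exits on its own."""
    return subprocess.run(cmd).returncode
```

With `--exit` the process ends as soon as the import is done, so the supervising script no longer needs a kill timer sized in hours.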