sgsinclair / Voyant

GNU General Public License v3.0

Automatically loading corpora from the command line or during build #438

Closed by senderle 5 years ago

senderle commented 5 years ago

I'm trying to create a docker image / compose file that automatically sets up a headless Voyant server with custom corpora pre-loaded. I've got the image working nicely, sitting behind an SSL-enabled reverse proxy (using Caddy). But I haven't been able to find an obvious way to load corpora without going through the web interface.

I imagine I'm missing something -- apologies in that case! But after looking at a lot of documentation (and source code!) I am stuck. Is it possible to do this? If so, how? If it's easy, could it be added to the documentation?

sgsinclair commented 5 years ago

It is indeed possible (though not obvious) to load a corpus without the UI. But I wonder if you couldn't get the same result by making a local call once the server is running, with something like this in a shell script:

curl "https://localhost/?input=url"

where url would be the URL of a file copied beforehand to a location the local server can access (such as somewhere in the resources folder). See what I mean?
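A minimal sketch of that idea, assuming the server answers at https://localhost as in the command above. The function names, the VOYANT_BASE variable, and the retry count are hypothetical and only for illustration; the only Voyant-specific piece is the input parameter already shown.

```shell
#!/bin/sh
# Assumption: base URL of the running Voyant server; override via the environment.
VOYANT_BASE="${VOYANT_BASE:-https://localhost}"

# Poll the server until it responds, giving up after $1 attempts (default 30).
wait_for_voyant() {
    max_tries="${1:-30}"
    tries=0
    until curl --silent --fail --output /dev/null "$VOYANT_BASE/"; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$max_tries" ]; then
            return 1
        fi
        sleep 2
    done
    return 0
}

# Ask the server to load the document or archive found at URL $1.
# --get with --data-urlencode makes curl append input=<encoded URL>
# to the query string, so special characters in the URL are escaped.
load_corpus() {
    curl --silent --show-error --fail --get \
         --data-urlencode "input=$1" \
         "$VOYANT_BASE/"
}

# Example call, commented out so that sourcing this file has no side effects:
# wait_for_voyant && load_corpus "https://localhost/resources/mycorpus.zip"
```

Letting curl encode the input value avoids hand-escaping the nested URL. The file path in the example call is a placeholder, not a location Voyant is known to serve.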

senderle commented 5 years ago

Ah yes, I think I see. So could I just pass a standard URI for a local file?

It would also be easy enough to configure a static folder containing zip archives.

sgsinclair commented 5 years ago

Yes, though the running server expects a URL, not a local file path. Just so you have options, you could also load a corpus BEFORE launching the server with something like this, run from the working directory of VoyantServer.jar:

java -Xmx4g -classpath "./_app/WEB-INF/classes:./_app/WEB-INF/lib/*" org.voyanttools.trombone.Controller storage=file file=LOCALFILEPATHOFSOURCEHERE tool=corpus.CorpusMetadata luceneIndexingTimeout=3600

The last option (luceneIndexingTimeout) should only be necessary for very large corpora. The memory setting at the start (-Xmx4g) may not be necessary for reasonably sized corpora.
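For a build step that has several source files, the command above could be wrapped in a loop. This is only a sketch: CORPUS_DIR and the function names are hypothetical, it has to be run from the directory containing VoyantServer.jar, and it leaves out -Xmx4g and luceneIndexingTimeout on the assumption that the corpora are reasonably sized.

```shell
#!/bin/sh
# Assumption: a folder holding the source files or zip archives to index.
CORPUS_DIR="${CORPUS_DIR:-./corpora}"

# Index one local file using the Trombone command quoted in this thread.
preload_corpus() {
    java -classpath "./_app/WEB-INF/classes:./_app/WEB-INF/lib/*" \
        org.voyanttools.trombone.Controller \
        storage=file file="$1" tool=corpus.CorpusMetadata
}

# Index every regular file in CORPUS_DIR, stopping at the first failure.
preload_all() {
    for f in "$CORPUS_DIR"/*; do
        # Skip subdirectories and the unexpanded glob of an empty folder.
        [ -f "$f" ] || continue
        preload_corpus "$f" || return 1
    done
}

# Example call, commented out so that sourcing this file has no side effects:
# preload_all
```

Each file is indexed as its own corpus in this sketch. Add the memory and timeout settings back to the java line if indexing a large corpus runs out of memory or times out.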

senderle commented 5 years ago

That all makes sense, and it seems reasonable to close now. Thanks for the help!