timathom / marc-schema

JSON description of the MARC Authority, Bibliographic, and Holdings schemata
Apache License 2.0

Automatically execute scraping #3

nichtich closed this issue 11 months ago

nichtich commented 11 months ago

It should be possible to run the scraping with a single command so that it can be done regularly without manual intervention. The BaseX command-line interface should be able to do this:

java -cp BaseX.jar org.basex.BaseX

then, at the BaseX prompt (surely this can be passed on the command line somehow):

RUN "run-scraper.xq"
RUN "run-parser.xq"

Unfortunately the result is an error:

[file:io-error] Resource "/marc21_json_schema.json (no authorization)" not found.

Removing the / in line 7 of run-parser.xq fixes this.

timathom commented 11 months ago

Thanks, @nichtich! The intent was for the user to supply a directory path on line 6, but it's probably better to remove the slash so that it works either way.

timathom commented 11 months ago

Good point--you can run the queries from the BaseX install directory like this:

bin/basex -c "RUN marc-schema/run-scraper.xq; RUN marc-schema/run-parser.xq"

In theory, it should also be possible to make the output directory variable ($ms:DIR) external so that its value can be passed on the command line, but I haven't gotten that to work yet.

timathom commented 11 months ago

Okay, this syntax works:

bin/basex -Q marc-schema/run-scraper.xq -b ms:DIR="/Users/Abc/Desktop/" -Q marc-schema/run-parser.xq
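For running this regularly without a person at the keyboard, the command above could be wrapped in a small shell function. This is only a sketch: BASEX_HOME, OUT_DIR, DRY_RUN and run_marc_schema are hypothetical names that are not part of marc-schema, and it assumes the marc-schema directory sits inside the BaseX install directory, as in the commands above.

```shell
#!/bin/sh
# Sketch of a non-interactive wrapper around the command from this thread.
# BASEX_HOME, OUT_DIR, DRY_RUN and run_marc_schema are hypothetical names.

BASEX_HOME="${BASEX_HOME:-$PWD}"           # assumed: BaseX install dir containing marc-schema/
OUT_DIR="${OUT_DIR:-/Users/Abc/Desktop/}"  # output directory bound to $ms:DIR

run_marc_schema() {
  # Build the argument list once so the dry run and the real run stay identical.
  set -- bin/basex \
    -Q marc-schema/run-scraper.xq \
    -b "ms:DIR=$OUT_DIR" \
    -Q marc-schema/run-parser.xq

  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"                     # print the command instead of running it
  else
    ( cd "$BASEX_HOME" && "$@" )           # run from the install dir, as in the thread
  fi
}
```

Setting DRY_RUN=1 before calling run_marc_schema prints the command without starting BaseX, which is a cheap way to check the paths before handing the function to a scheduler such as cron.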