rism-digital / muscat

🗂️ A Rails application for the inventory of handwritten and printed music scores
http://muscat-project.org

import_from_marc.rb should allow more command-line options #1157

fjorba closed 10 months ago

fjorba commented 2 years ago

Currently, import_from_marc.rb only allows three positional arguments: filename, record type (or model) and an optional record offset.

The script would be more useful in two respects: first, it should accept conventional command-line flags and parameters; second, it should support different record-importing scenarios. Here is my draft proposal, which keeps backward compatibility with the current syntax and default values:

$ rails runner housekeeping/import/import_from_marc.rb --help
Import marcxml records into Muscat database

Usage:  bin/rails runner housekeeping/import/import_from_marc.rb [options]

Options:
 -f, --file        input file name, required
 -t, --type        record type; choose one of: Source, Publication, Holding, Person, Institution, Work
 -m, --from        first record to import
 -v, --versioning  update records version
 -u, --authorities create (scaffold) related Marc authorities records
 -i, --insert      only add new records, skip duplicates
 -r, --replace     only overwrite existing records, skip new ones
 -a, --append      only append non-existing tags to existing records
 -n, --dry-run     only simulate, do not update database
 -h, --help        this help

This script can also be run with positional arguments:

 bin/rails runner housekeeping/import/import_from_marc.rb filename type [from]
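
For illustration, here is a minimal sketch of how the help text above could map onto Ruby's standard `OptionParser`. It is not the code from the eventual PR. The helper name `parse_import_args`, the keys of the options hash, and the treatment of `-i`/`-r`/`-a` as mutually exclusive modes (last one wins) are assumptions; how the arguments reach the script under `rails runner` is not addressed here.

```ruby
require "optparse"

RECORD_TYPES = %w[Source Publication Holding Person Institution Work].freeze

# Hypothetical helper: parse the proposed flags, falling back to the
# current positional form "filename type [from]".
def parse_import_args(argv)
  options = { versioning: false, authorities: false, mode: nil, dry_run: false }

  parser = OptionParser.new do |opts|
    opts.banner = "Usage:  bin/rails runner housekeeping/import/import_from_marc.rb [options]"
    opts.on("-f", "--file FILE", "input file name, required") { |v| options[:file] = v }
    # Passing an Array makes OptionParser reject values outside the list.
    opts.on("-t", "--type TYPE", RECORD_TYPES, "record type") { |v| options[:type] = v }
    opts.on("-m", "--from N", Integer, "first record to import") { |v| options[:from] = v }
    opts.on("-v", "--versioning", "update records version") { options[:versioning] = true }
    opts.on("-u", "--authorities", "create related authority records") { options[:authorities] = true }
    # Assumption: the three modes are mutually exclusive; the last flag wins.
    opts.on("-i", "--insert", "only add new records") { options[:mode] = :insert }
    opts.on("-r", "--replace", "only overwrite existing records") { options[:mode] = :replace }
    opts.on("-a", "--append", "only append non-existing tags") { options[:mode] = :append }
    opts.on("-n", "--dry-run", "only simulate") { options[:dry_run] = true }
    # -h / --help is provided by OptionParser itself.
  end

  # parse returns whatever is left over after the flags are consumed.
  rest = parser.parse(argv)

  # Backward-compatible positional form: filename type [from]
  options[:file] ||= rest[0]
  options[:type] ||= rest[1]
  options[:from] ||= Integer(rest[2]) if rest[2]

  raise OptionParser::MissingArgument, "--file" unless options[:file]
  unless options[:type].nil? || RECORD_TYPES.include?(options[:type])
    raise OptionParser::InvalidArgument, options[:type]
  end

  options
end
```

With this sketch, `parse_import_args(%w[-f records.xml -t Source -m 10])` and `parse_import_args(%w[records.xml Source 10])` yield the same file, type and offset, which is the backward compatibility the proposal asks for.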

At this point I already have the parameter parsing mechanism up and running, but I haven't yet implemented the behaviour of the new options. Most of them are inspired by the ones we have in classical Invenio, and we depend on them for our regular importing tasks, as we import a great deal of our records.
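
As a rough sketch of what the not-yet-implemented behaviour might look like, the three import scenarios can be expressed as a small decision table. Everything here is hypothetical: the names `plan_action` and `append_missing_tags`, the action symbols, the representation of a record as a hash of tag to fields, and in particular the default used when no mode flag is given (create or overwrite) are assumptions, not a description of what Muscat currently does.

```ruby
# Hypothetical decision table for the proposed -i / -r / -a modes.
#   mode:   :insert, :replace, :append, or nil when no mode flag was given
#   exists: true if a record with the same id is already in the database
def plan_action(mode, exists)
  case mode
  when :insert  then exists ? :skip : :create                # -i: skip duplicates
  when :replace then exists ? :overwrite : :skip             # -r: skip new records
  when :append  then exists ? :append_missing_tags : :skip   # -a: existing records only
  else exists ? :overwrite : :create                         # assumed default, not confirmed
  end
end

# Hypothetical --append merge: keep every tag already present in the
# existing record and add only the tags it does not have yet.
# Records are modelled here as plain hashes of tag => array of fields.
def append_missing_tags(existing, incoming)
  existing.merge(incoming) { |_tag, old_fields, _new_fields| old_fields }
end
```

Keeping the decision separate from the database write would also let a `--dry-run` report the planned action per record, although that alone does not cover side effects triggered elsewhere in the application.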

I'd appreciate your comments and ideas before opening the corresponding PR. Thanks

xhero commented 2 years ago

Yes! A proper command-line interface would be welcome. I have neglected this script for some time (I really seldom use it), so feel free to modify it to make it more usable for your data. I think the options look good; you can start by implementing just some of them if you want.

fjorba commented 2 years ago

Ok, I'll do it. For the time being, I'm trying to understand what happens when records are imported and the implications for existing and new authority records. For example, it seems to me that the --dry-run option is difficult to implement, as much of the logic lives far away from the import script.