mnylc / islandora_multi_importer

This is a flexible, twig based, all cmodel, tabular data to islandora Object importer with optional ZeroMQ processing
GNU General Public License v3.0

Add dry run and selective row processing #15

Open DiegoPino opened 7 years ago

DiegoPino commented 7 years ago

Add a "dry run" button that, before ingesting, tests all rows of a spreadsheet, including some type of "is the referenced datastream present?" check.

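To make the idea concrete, here is a minimal sketch of what such a dry run could look like. It is written in Python purely for illustration and is not code from this module; the function name `dry_run`, the `file_column` parameter, and the injectable `exists` check are all assumptions made for the example.

```python
import csv
import io
import os


def dry_run(csv_text, file_column, exists=os.path.isfile):
    """Check every data row without ingesting anything.

    Returns a list of (row_number, problem) tuples. All names here are
    hypothetical; this is not the importer's actual API.
    """
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Row 1 is the header, so data rows are numbered from 2.
    for row_number, row in enumerate(reader, start=2):
        path = (row.get(file_column) or "").strip()
        if not path:
            problems.append((row_number, "no file referenced"))
        elif not exists(path):
            problems.append((row_number, "missing file: " + path))
    return problems
```

An empty result would mean every row references a file that the `exists` check can find; anything else is a list of rows to fix before a real ingest.
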
McFateM commented 7 years ago

An additional, similar feature might be the introduction of a "limit" parameter like the one found in the Move to Islandora Kit (MIK). It limits processing to a specified number of records from the CSV. Great for testing and learning.
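
As a rough illustration of the idea (not MIK's code and not this module's), a limit can be as simple as stopping the reader after N records. The `read_rows` name and its `limit` parameter are hypothetical.

```python
import csv
import io
from itertools import islice


def read_rows(csv_text, limit=None):
    """Return at most `limit` data rows; limit=None returns all of them."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # islice stops after `limit` rows without reading the rest of the input.
    return list(islice(reader, limit))
```

Calling `read_rows(text, limit=2)` would process only the first two records, which is handy for a quick test ingest.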

DiegoPino commented 7 years ago

Cool. Updating the label of this issue, then.

McFateM commented 7 years ago

I have partially addressed this issue by adding code in utilities.inc that skips rows of data which begin with a hashtag (#) character in the first column. Changes were committed in https://github.com/Islandora-Collaboration-Group/islandora_multi_importer/commit/e8a03836d9ff582cfe3f068a0c88de38363b3c3c.
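
For readers who want the gist without opening the commit, the behavior is roughly the following. This is a Python sketch of the idea only, not the code in utilities.inc; the `active_rows` name and `marker` parameter are made up for the example.

```python
import csv
import io


def active_rows(csv_text, marker="#"):
    """Return (header, rows), dropping rows whose first cell starts with marker."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = [
        row
        for row in reader
        # Ignore blank lines, and skip rows "commented out" in the first column.
        if row and not row[0].lstrip().startswith(marker)
    ]
    return header, rows
```

Prefixing the first cell of a row with `#` takes that row out of the import without deleting it from the sheet.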

McFateM commented 6 years ago

In the next PR I will have removed the hashtag (#) 'skip' feature mentioned above.

Instead, we will modify our workflow to leverage Google Sheets and the 'range' specification that is provided, coupled with the practice of importing from different sheets/copies of the data to easily control which records are imported at any point in time.

I hope to document this workflow soon and will try to clearly lay out how we expect this to work for us.

McFateM commented 5 years ago

Can't believe I let this issue sit for so long. 8^(

We had success with the hashtag (#) 'skip' feature and used it quite a bit before I turned it off. Now I regret having done so. The range specification option works properly, but it doesn't do all that I'd like: any range you specify must still have a valid header row at the top, and that messes up row numbers, so we have to be extra careful. I would really prefer to restore the hashtag feature so that row numbers are less likely to be altered.

I'm going to update my fork of the project, add the hashtag 'skip' code back in, and give it a test in my disposable ISLE instance.