mnylc / islandora_multi_importer

This is a flexible, twig based, all cmodel, tabular data to islandora Object importer with optional ZeroMQ processing
GNU General Public License v3.0
16 stars 15 forks source link

New "update" functionality: help wanted in defining expected behaviour #37

Open DiegoPino opened 7 years ago

DiegoPino commented 7 years ago

Trying to get some feedback on a few implementation details on the new "update existing objects" functionality. Thanks to this branch I did :https://github.com/mnylc/islandora_batch/tree/7.x-ISLANDORA-2046 We can reuse the same batch queue, processing logic, etc for doing other cool things than just ingesting.

So, right now my approach is, if "update existing objects" is selected

GLOSSARY: SOURCE OBJECT === IslandoraBatchObject derived object (not in repository) TARGET OBJECT ==== Existing Fedora object with same PID as SOURCE OBJECT

Open questions are:

  1. What happens if SOURCE OBJECT'S CMODEL is not the same as TARGET OBJECT's CMODEL?
  2. Is updating the data streams content enough? Should maybe dropping the data stream completely and recreating to avoid many versions be possible (dangerous)?
  3. What should be the default action for empty/null valued SOURCE attributes and contents? Should those be ignored or copied over? When /how to decide?
  4. What RELS-EXT properties should be copied/updated? What happens if Source CSV assumes a different hierarchy than the existing one? Other PARENT OBJECT, do we keep existing and add the new ones or do we replace with new?)
  5. Should basically replacing TARGET OBJECT with SOURCE OBJECT be another option? Drop TARGET, REINGEST using SOURCE which will just maintain the PID.
  6. Any other options I should be assuming?

@Natkeeran @kimpham54 @McFateM

kimpham54 commented 7 years ago

What happens if SOURCE OBJECT'S CMODEL is not the same as TARGET OBJECT's CMODEL?

Throw an error - I'd avoid updating altogether. If this scenario arises it is likely due to human error, with the user specifying the wrong cmodel for the source in the spreadsheet. I don't think there would ever be a situation where someone would want to change CModels for an object, nor should we allow that to happen.

Is updating the data streams content enough? Should maybe dropping the data stream completely and recreating to avoid many versions be possible (dangerous)?

If Updating means replacing the entire contents of a datastream but retaining the versions, then that is the better option.

What should be the default action for empty/null valued SOURCE attributes and contents? Should those be ignored or copied over? When /how to decide?

For contents of datastreams I'm inclined to ignore! I definitely don't think they should override datastreams that may exist, I can see a use case to keep original datastreams. The ability to override to create empty contents is not a use case that I am concerned about at the moment.

What RELS-EXT properties should be copied/updated? What happens if Source CSV assumes a different hierarchy than the existing one? Other PARENT OBJECT, do we keep existing and add the new ones or do we replace with new?)

I would be concerned about the flexibility and amount of control someone may have in editing many of the RELS-EXT properties this way. isMemberOf, isMemberOfCollection are sufficient for our use cases. I could potentially also see RELS-EXT properties for books being valuable, for people that want to add or rearrange pages.

Should basically replacing TARGET OBJECT with SOURCE OBJECT be another option? Drop TARGET, REINGEST using SOURCE which will just maintain the PID.

This is similar to the first scenario, except in this case I think it is more obvious that the user's intention is to replace an object completely and reuse the existing PID. I can see this as a possible desired feature, especially if the PID is somehow meaningful but someone wishes to rebuild the object completely.

Any other options I should be assuming?

These are great questions, I will definitely try to consider other potential issues.

Natkeeran commented 7 years ago

@DiegoPino

Some quick thoughts about this:

  1. If a new datastream is found (due to SOURCE CMODEL change), then create the datastream (if the user chooses to do so), if a datastream has been removed, don't modify the datastream (in the future, optionally delete). (I assume you are able to get a lot of the fedora properties from datastream file itself).

    Also, I am not sure how you would recognize change in SOURCE CMODEL? That info is from the fedora server, right?

  2. Default behaviour should be to update, keeping the versions. Some users may want to drop and replace (i.e large OBJs)

  3. I assume here that you are referring to attributes relating to a data stream!. If the user chooses to update a particular datastream, we have to assume that they have the authoritative version in the SOURCE.

  4. I don't fully understand how RELS-EXT gets populated right now by the multi-importer. Assuming it is based on the csv data (ex parent) and CMODEL, I think we have to assume that the csv will contain all the information needed to fully recreate the RELS-EXT. Thus, Source CSV will be the authoritative. This restricts the user somewhat, but considering RELS-EXT update possibilities can be a very difficult task.

  5. This assumes that all datastreams will be replaced, not always the use case. If replace option is provided for num.2 then, in effect this is available. Some users may prefer this so they can have one version + derivatives. As you noted, bit risky as you will loose all the audit trail etc and may not pass the preservation policies of organizations.

  6. Derivative re generation is not clear to me. If OBJ is getting replaced, would there be an option (as there is now in ingest) to generate the derivatives from SOURCE?.

(We have not discussed this locally here. So you will get different approaches/responses.)

DiegoPino commented 7 years ago

Ok, i'm giving this a try now. Thanks to both for your input. Probably the future will require more than just update and even some mixed CRUD (per data stream?) could be set. I don't want to overcomplicate this for now. @Natkeeran will answer some of your points later, @kimpham54 i agree on 95% with you. the 5% difference is just my weird need of giving people weapons of meta data mass destruction by allowing CMODEL transmutation!... 😁

McFateM commented 7 years ago

I read Kim’s response and I’m in agreement with all of it. I don’t ever anticipate wanting to change an object’s CMODEL, and update is all I need, not drop and replace (although if that’s necessary I don’t object).

For what it is worth, I anticipate two likely/popular use cases at Grinnell…

1) Somebody introduced errors in multiple MODS records during a CSV import. Easier to fix them in the CSV and re-import than it is to visit them one-at-a-time. In this case I would ideally just like to supply the new MODS data, no other datastreams, and have the IMI replace just the MODS datastream and regenerate derivatives as necessary.

2) I’d love to see IMI made useable for repository migration and the ability to reproduce an object from CSV exactly like the original, complete with the same PID, is a key component of that use case.

Thanks all. Take care.

-Mark M.

From: Diego Pino Navarro notifications@github.com<mailto:notifications@github.com> Reply-To: mnylc/islandora_multi_importer reply@reply.github.com<mailto:reply@reply.github.com> Date: Wednesday, August 23, 2017 at 1:56 PM To: mnylc/islandora_multi_importer islandora_multi_importer@noreply.github.com<mailto:islandora_multi_importer@noreply.github.com> Cc: Mark McFate mcfatem@grinnell.edu<mailto:mcfatem@grinnell.edu>, Mention mention@noreply.github.com<mailto:mention@noreply.github.com> Subject: Re: [mnylc/islandora_multi_importer] New "update" functionality: help wanted in defining expected behaviour (#37)

Ok, i'm giving this a try now. Thanks to both for your input. Probably the future will require more than just update and even some mixed CRUD (per data stream?) could be set. I don't want to overcomplicate this for now. @Natkeeranhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_natkeeran&d=DwMFaQ&c=HUrdOLg_tCr0UMeDjWLBOM9lLDRpsndbROGxEKQRFzk&r=PQglHQe-EzyZqJOuOVcmU0OZ6bg-89msSPuqyNlQr28&m=YIUwjsBnKEPBk5Uv6egKhdaICsHal9_CnHbj37izepc&s=Ryue5N8dLJCIGMkWPjpUuZiXvc2qGgIYpzPfTHXB-2s&e= will answer some of your points later, @kimpham54https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_kimpham54&d=DwMFaQ&c=HUrdOLg_tCr0UMeDjWLBOM9lLDRpsndbROGxEKQRFzk&r=PQglHQe-EzyZqJOuOVcmU0OZ6bg-89msSPuqyNlQr28&m=YIUwjsBnKEPBk5Uv6egKhdaICsHal9_CnHbj37izepc&s=axHZOO_wGSLIT-i4O2dpxmwkb5O-XZQ0DFbttrUy0tI&e= i agree on 95% with you. the 5% difference is just my weird need of giving people weapons of meta data mass destruction by allowing CMODEL transmutation!... 😁

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_mnylc_islandora-5Fmulti-5Fimporter_issues_37-23issuecomment-2D324430348&d=DwMFaQ&c=HUrdOLg_tCr0UMeDjWLBOM9lLDRpsndbROGxEKQRFzk&r=PQglHQe-EzyZqJOuOVcmU0OZ6bg-89msSPuqyNlQr28&m=YIUwjsBnKEPBk5Uv6egKhdaICsHal9_CnHbj37izepc&s=Fzd_cbwHTTktHO2Jd_Ow1tUJdRtQFhZQf44kCOLa_xc&e=, or mute the threadhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_AIFIwfZKMsQnLSrDPZ8rmnmhISI2yYViks5sbHXhgaJpZM4PAXRC&d=DwMFaQ&c=HUrdOLg_tCr0UMeDjWLBOM9lLDRpsndbROGxEKQRFzk&r=PQglHQe-EzyZqJOuOVcmU0OZ6bg-89msSPuqyNlQr28&m=YIUwjsBnKEPBk5Uv6egKhdaICsHal9_CnHbj37izepc&s=-bamIdgApZousfshOVPTiSmtjhuDHqvVZxNQqTzQbbo&e=.

DiegoPino commented 7 years ago

Thanks, folks. So, first take on this here: https://github.com/mnylc/islandora_multi_importer/tree/googleAPI-Update-WIP

The logic that allows updates to happen is in place. As this module advances, more and more user input sanitization and validation is needed, so this is taking extra time. This branch is extremely beta and still does not exposes to the UI the update option. If you want to test it as is, then a new preprocessor parameter needs to be "fixed" or passed on ingest (multi step form ingest)

See https://github.com/mnylc/islandora_multi_importer/commit/1f502f92eef918367bd8594752eefed3790033fd#diff-d34e58beef14ce9ff000b83cbec66205R538 https://github.com/mnylc/islandora_multi_importer/commit/1f502f92eef918367bd8594752eefed3790033fd#diff-d34e58beef14ce9ff000b83cbec66205R559 etc. Many places

If you already know the code base then testing without waiting for me should be easy. If not, i will add that piece later tonight. Need to add a lot of form element conditionals (like why would i display "let islandora generate a PID!" if i'm updating existing ones?

Lastly. This depends heavily on https://github.com/mnylc/islandora_batch/tree/7.x-ISLANDORA-2046 That branch won´t change anything on your repository or other batch modules, not even custom ones, but it allows to avoid ingesting the fake batch objects if the class that is extending islandoraBatchObject does not need that. (our case)

I burned a few neurons and glias here, so assume this code can have issues and will make you miserable.

kimpham54 commented 7 years ago

@DiegoPino I would love to test https://github.com/mnylc/islandora_multi_importer/tree/googleAPI-Update-WIP, but I'm not sure how to fix preprocessorParameters. Should I just hold on or is it a relatively simple fix that I could easily do?

DiegoPino commented 7 years ago

@kimpham54 it is very easy, it is just passing that data into the form submit handler into the parameters array, something like here will work https://github.com/mnylc/islandora_multi_importer/blob/googleAPI-Update-WIP/includes/import.form.inc#L1156

but since a week has passed (sorry, i´m so slow) I have the fully working implementation ready in hand 😄 I'm today fully committed to other tasks -islandora also!- so will pull tomorrow AM first hour.

kimpham54 commented 7 years ago

@DiegoPino I've booked off time to test this tomorrow 👍

DiegoPino commented 7 years ago

Gracias @kimpham54 !!

kimpham54 commented 7 years ago

no, thank you @DiegoPino. As soon as you pull please do not hesitate to assign to me.

DiegoPino commented 7 years ago

@kimpham54 will be doing at 14:00 EST time, got stuck in meetings. Will mention you in the pull =)

DiegoPino commented 7 years ago

@kimpham54 sorry. Found some issues. Just pulled some new code and enabled a few options there. I 'm seeing some behavior things/issues I need to fix via the UI. Probably worth chatting about it via IRC tomorrow morning.
See https://github.com/mnylc/islandora_multi_importer/compare/googleAPI-WIP...googleAPI-Update-WIP?expand=1 (to see all the changes) before testing.

basically, i need an option that allows the ingester to no copy a new generated datastream over to an existing object in the CMODEL mapping menu.

kimpham54 commented 7 years ago

@DiegoPino no problem, i'll be on IRC

McFateM commented 7 years ago

Good morning. Sorry, I’ve got a full schedule today on campus so I can’t do IRC effectively, but thought I would comment.

I’m not familiar with all the new development here, but in the case of the ICG CSV tool that was under development last year we simply adopted the convention that CSV cells which begin with a hashtag (#) are treated as “comments” (and taking it further, if the first cell in a record begins with a hashtag the entire record is treated as a comment). In this scheme if I had an ‘OBJ’ column to pull my OBJ content from I could put a hashtag in the data before the OBJ filename and the import, or update, process would ignore it and not ingest a new OBJ datastream for a particular object. Again, not sure if this might apply here but wanted to be sure and share just in case. Take care.

-Mark M.

From: kim pham notifications@github.com<mailto:notifications@github.com> Reply-To: mnylc/islandora_multi_importer reply@reply.github.com<mailto:reply@reply.github.com> Date: Friday, September 1, 2017 at 6:56 AM To: mnylc/islandora_multi_importer islandora_multi_importer@noreply.github.com<mailto:islandora_multi_importer@noreply.github.com> Cc: Mark McFate mcfatem@grinnell.edu<mailto:mcfatem@grinnell.edu>, Mention mention@noreply.github.com<mailto:mention@noreply.github.com> Subject: Re: [mnylc/islandora_multi_importer] New "update" functionality: help wanted in defining expected behaviour (#37)

@DiegoPinohttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_diegopino&d=DwMCaQ&c=HUrdOLg_tCr0UMeDjWLBOM9lLDRpsndbROGxEKQRFzk&r=PQglHQe-EzyZqJOuOVcmU0OZ6bg-89msSPuqyNlQr28&m=COQ6rGhxEXgqWoXfuiBIElSh8rzkDHK3OlvJgclgw-M&s=eWhyOdENEIUQS3zAn-UY5IikY4SxgWCOQoTD-AbXHwg&e= no problem, i'll be on IRC

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_mnylc_islandora-5Fmulti-5Fimporter_issues_37-23issuecomment-2D326562046&d=DwMCaQ&c=HUrdOLg_tCr0UMeDjWLBOM9lLDRpsndbROGxEKQRFzk&r=PQglHQe-EzyZqJOuOVcmU0OZ6bg-89msSPuqyNlQr28&m=COQ6rGhxEXgqWoXfuiBIElSh8rzkDHK3OlvJgclgw-M&s=0xrWQqQzhdK66q7IF-vSkjXvCo5q0UBtvQ1c5kP8BPg&e=, or mute the threadhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_AIFIwSjkkMl5bsvGUQTtqdaUSCuHmFp-5Fks5sd-5FD-2DgaJpZM4PAXRC&d=DwMCaQ&c=HUrdOLg_tCr0UMeDjWLBOM9lLDRpsndbROGxEKQRFzk&r=PQglHQe-EzyZqJOuOVcmU0OZ6bg-89msSPuqyNlQr28&m=COQ6rGhxEXgqWoXfuiBIElSh8rzkDHK3OlvJgclgw-M&s=whSVJdV0XHAr6cAPHvOWjo-eXRqHJmx2vaqjUGH-3qY&e=.

kimpham54 commented 7 years ago

morning @DiegoPino! I tried testing this last week and couldn't get past the zip upload stage, it would return me to the import spreadsheet screen. Is there something I can/should do to test it?

Natkeeran commented 7 years ago

@DiegoPino I tested this as well.

I have a slightly different issue. I go through all the steps and in the last step, the screen blanks out. I don't see any errors in the drupal logs.

PHP debug:

 /var/www/drupal/sites/all/modules/islandora_multi_importer/includes/import.form.inc, line 1195   

Apache Error log:

[Tue Sep 12 14:04:05.652124 2017] [:error] [pid 1980] [client 10.0.2.2:57816] running getIngestInfo, referer: http://localhost:8000/multi_importer
[Tue Sep 12 14:04:05.689294 2017] [:error] [pid 1980] [client 10.0.2.2:57816] PHP Fatal error:  Call to undefined function addrelationshipsforupdate() in /var/www/drupal/sites/all/modules/i$
DiegoPino commented 7 years ago

@kimpham54 I will retake this today late. Release stuff and local duties took over my life. Do you get any logs? Is this consistent with @Natkeeran experience? I will spin a new VM in a while a go from there. Promise i will have something/everything fixed for you once i know what is failing because i see it working here, but maybe it is my settings? Is Google API config working? Can you read from spreadsheets? Since the forms are so huge some times .

Natkeeran commented 7 years ago

@DiegoPino Pls see my last comment. The behavior is different for me.

DiegoPino commented 7 years ago

@Natkeeran, looking at that. looks like there is something weird there: I can see that ::addrelationshipsforupdate() method in the new branch. Are you using it already? Or am i making some reference that does not exist? I need to investigate

Natkeeran commented 7 years ago

@DiegoPino

commit 632d22b69fd214dd6be56e2a1c3f0dcb05480c6f
Author: Diego Pino Navarro <dpino@metro.org>
Date:   Fri Sep 1 13:18:22 2017 -0400

The above is the commit I am using.

DiegoPino commented 7 years ago

@Natkeeran look, should be https://github.com/mnylc/islandora_multi_importer/blob/googleAPI-Update-WIP/includes/islandora_multi_batch.inc#L877 and https://github.com/mnylc/islandora_multi_importer/blob/googleAPI-Update-WIP/includes/islandora_multi_batch.inc#L900

Not sure why it is complaining. 😢 == now i k n o w

MarcusBarnes commented 7 years ago

@DiegoPino I expect Lines 877 and 880 to work if you modify them like so:

$this->addRelationshipsforUpdate();

and

$this->addRelationshipsforIngest();.

DiegoPino commented 7 years ago

jajajaa.. oh true... i'm code blind !!!! Guess i need to look at that code again. OK, fixing. thanks. Just too much stuff.

DiegoPino commented 7 years ago

@kimpham54 @MarcusBarnes @Natkeeran. Ok, fixed. Feel so silly 😜 https://github.com/mnylc/islandora_multi_importer/commit/03f4d6a3632f7a782d6845bfcc380dcd86f88b99 Still, not sure if it will work, so you need to test. I'm stuck with fixing travis until 14:00 and then a meeting. Right now it assumes that RELS-EXT is still sued (means you are putting the same parents that you already have?) but i feel this can lead to issues. I want to add an option that simply ignores RELS-EXT for updating files, but that needs to go into the preprocessor, exactly here: https://github.com/mnylc/islandora_multi_importer/blob/googleAPI-Update-WIP/includes/islandora_multi_batch.inc#L202 Means, i need to get the actual parents that already exist (means loading parent objects, perfomance wise i don´t like it) and then cache that info and then put it back but avoid creating RELS-EXT if they are already there. So please test at discression. One way could be simply making parentship for existin PIDS in the spreadsheed happen to a new collection? which would make existing objects (only for tests!) child of two? The original one + the one in the parent column?

I have some ideas, but anything that requires thinking (like $this-> jajaja) and more than 20 minutes escapes the reality of this morning. Thanks!

Natkeeran commented 7 years ago

@DiegoPino, quick test with the latest (commit 03f4d6a3632f7a782d6845bfcc380dcd86f88b99)

[Tue Sep 12 16:59:49.696274 2017] [:error] [pid 2310] [client 10.0.2.2:36070] running getIngestInfo, referer: http://localhost:8000/multi_importer
[Tue Sep 12 16:59:49.729379 2017] [:error] [pid 2310] [client 10.0.2.2:36070] PHP Fatal error:  Call to undefined method IslandoraMultiBatchObject::inheritXacmlPoliciesForUpdate() in /var/www/drupal/sites/all/modules/islandora_multi_importer/includes/islandora_multi_batch.inc on line 905,$
 Called from /var/www/drupal/sites/all/modules/islandora_multi_importer/includes/import.form.inc, line 1195   
DiegoPino commented 7 years ago

@Natkeeran thanks. fixed https://github.com/mnylc/islandora_multi_importer/commit/f6e93f4a181bd715821366989a69dfa30a44ae03

I'm on a diff computer right now but seems like i forgot to pull some/my working branch since i could promise i have this working on vagrant!. Can you try again? sorry

Natkeeran commented 7 years ago

@DiegoPino Creates the batch sets. But, when click to process the batch sets:

[Tue Sep 12 17:28:26.377951 2017] [:error] [pid 1974] [client 10.0.2.2:36438] PHP Fatal error:  Call to undefined function batchprocessforupdate() in /var/www/drupal/sites/all/modules/islandora_multi_importer/includes/islandora_multi_batch.inc on line 403, referer: http://localhost:8000/batch?op=start&id=20
An AJAX HTTP error occurred. HTTP Result Code: 500 Debugging information follows. Path: /batch?id=21&op=do StatusText: Internal Server Error ResponseText: 

Same error for ingest.

DiegoPino commented 7 years ago

@Natkeeran same stupid UNFORGIVABLE error i made with the method calls. I simply neglected the $this. I feel this happened during a global renaming regex i did. Anyway. Fixed https://github.com/mnylc/islandora_multi_importer/commit/e068667df4efb193288eab1da6273afb1647eb58

You should be able to run the same already existing batch sets, since the class definition is the same, the only change is the implementation of the public method. Sorry sorry sorry 😿

Natkeeran commented 7 years ago

@DiegoPino

Got it to update. Very nice. :smiley: :smiley: :smiley:

No need for sorry at all. Thank you. Will do further testing and provide additional feedback.

(There seems to be an issue in setting the state after wards. Batch queue goes to error state and there is a drupal log: Failed to ingest object: islandora:20 code: 500)

DiegoPino commented 7 years ago

@Natkeeran ok, the state is not being updated.. mmm. But did the islandora:20 got updated? Looking at that now. Is there more error code there? Probably something i can do in the batch side of things

Natkeeran commented 7 years ago

@DiegoPino Yes the object got updated. No error msgs in apache error log. Do I need a particular branch of islandora_batch?

The following error might be related as well:

PDOException: SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'islandora:20' for key 'PRIMARY': INSERT INTO {islandora_batch_queue} (id, parent, data, sid) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3); Array ( [:db_insert_placeholder_0] => islandora:20 [:db_insert_placeholder_1] => islandora:sp_basic_image_collection [:db_insert_placeholder_2] => O:25:"IslandoraMultiBatchObject":15:{s:11:"*baseName";s:29:"New Object via Multi importer";s:13:"*objectInfo";a:6:{s:6:"cmodel";s:24:"islandora:sp_basic_image";s:6:"parent";s:35:"islandora:sp_basic_image_collection";s:4:"data";a:29:{i:0;s:12:"islandora:20";i:1;s:35:"islandora:sp_basic_image_collection";i:2;s:24:"islandora:sp_basic_image";i:3;s:11:"image2.jpeg";i:4;d:1;i:5;s:9:"local-id2";i:6;s:4:"bbb2";i:7;s:5:"aa1bb";i:8;N;i:9;N;i:10;N;i:11;s:20:"topic2bbdddddddddddd";i:12;s:26:"Sommerville, Frederick J. ";i:13;N;i:14;s:6:"place1";i:15;s:5:"time2";i:16;N;i:17;s:15:"circa 1860-1930";i:18;s:36:"The Trustees of Sailors' Snug Harbor";i:19;N;i:20;s:159:"Photograph of Sailors' Snug Harbor resident: Frederick J. Sommerville. Inmate number 3036. Photo number 2. Photographs are albumen prints mounted on cardboard.";i:21;N;i:22;s:11:"still image";i:23;N;i:24;N;i:25;s:7:"English";i:26;s:40:"In Copyright _ Educational Use Permitted";i:27;s:10:"image/jpeg";i:28;s:19:"reformatted digital";}s:3:"pid";s:12:"islandora:20";s:13:"parent_cmodel";a:2:{i:0;s:26:"islandora:collectionCModel";i:1;s:30:"fedora-system:FedoraObject-3.0";}s:9:"namespace";s:9:"islandora";}s:15:"*currentState";i:1;s:25:"*preprocessorParameters";a:10:{s:4:"type";s:3:"ZIP";s:11:"source_data";s:2:"75";s:25:"cmodel_source_field_index";s:1:"2";s:14:"cmodel_mapping";a:1:{s:24:"islandora:sp_basic_image";a:1:{s:4:"dsid";a:1:{s:4:"rows";a:5:{s:2:"DC";a:1:{s:6:"method";a:1:{s:4:"data";s:6:"xlst|0";}}s:4:"MODS";a:1:{s:6:"method";a:1:{s:4:"data";s:24:"template|My New Template";}}s:3:"OBJ";a:1:{s:6:"method";a:1:{s:4:"data";s:0:"";}}s:2:"TN";a:1:{s:6:"method";a:1:{s:4:"data";s:0:"";}}s:11:"MEDIUM_SIZE";a:1:{s:6:"method";a:1:{s:4:"data";s:0:"";}}}}}}s:13:"object_maping";a:5:{s:10:"pidmap_row";a:2:{s:6:"pidmap";s:1:"0";s:7:"pidtype";i:0;}s:13:"parentmap_row";a:2:{s:9:"parentmap";s:1:"1";s:10:"parenttype";i:0;}s:12:"labelmap_row";a:1:{s:8:"labelmap";s:1:"3";}s:15:"sequencemap_row";a:1:{s:11:"sequencemap";s:0:"";}s:9:"dsmap_row";a:1:{s:8:"dsremote";s:3:"ZIP";}}s:16:"computed_cmodels";a:1:{s:24:"islandora:sp_basic_image";a:8:{s:2:"DC";a:3:{s:2:"id";s:2:"DC";s:4:"mime";a:1:{i:0;s:15:"application/xml";}s:8:"optional";b:0;}s:8:"RELS-EXT";a:3:{s:2:"id";s:8:"RELS-EXT";s:4:"mime";a:1:{i:0;s:19:"application/rdf+xml";}s:8:"optional";b:1;}s:8:"RELS-INT";a:3:{s:2:"id";s:8:"RELS-INT";s:4:"mime";a:1:{i:0;s:19:"application/rdf+xml";}s:8:"optional";b:1;}s:4:"MODS";a:3:{s:2:"id";s:4:"MODS";s:4:"mime";a:1:{i:0;s:15:"application/xml";}s:8:"optional";b:1;}s:6:"TECHMD";a:3:{s:2:"id";s:6:"TECHMD";s:4:"mime";a:1:{i:0;s:15:"application/xml";}s:8:"optional";b:1;}s:3:"OBJ";a:3:{s:2:"id";s:3:"OBJ";s:4:"mime";a:3:{i:0;s:10:"image/jpeg";i:1;s:9:"image/png";i:2;s:9:"image/gif";}s:8:"optional";b:1;}s:2:"TN";a:3:{s:2:"id";s:2:"TN";s:4:"mime";a:3:{i:0;s:10:"image/jpeg";i:1;s:9:"image/png";i:2;s:9:"image/gif";}s:8:"optional";b:1;}s:11:"MEDIUM_SIZE";a:3:{s:2:"id";s:11:"MEDIUM_SIZE";s:4:"mime";a:1:{i:0;s:9:"image/jpg";}s:8:"optional";b:1;}}}s:15:"source_binaries";s:2:"76";s:6:"action";s:6:"update";s:4:"data";a:1:{s:7:"headers";a:29:{i:0;s:3:"pid";i:1;s:6:"parent";i:2;s:6:"cmodel";i:3;s:8:"obj_file";i:4;s:8:"sequence";i:5;s:10:"identifier";i:6;s:5:"title";i:7;s:21:"personal_name_creator";i:8;s:22:"corporate_name_creator";i:9;s:25:"personal_name_contributor";i:10;s:26:"corporate_name_contributor";i:11;s:13:"subject_topic";i:12;s:21:"subject_personal_name";i:13;s:22:"subject_corporate_name";i:14;s:18:"subject_geographic";i:15;s:16:"subject_temporal";i:16;s:12:"date_created";i:17;s:11:"date_issued";i:18;s:9:"publisher";i:19;s:20:"place_of_publication";i:20;s:11:"description";i:21;s:4:"note";i:22;s:4:"type";i:23;s:5:"genre";i:24;s:6:"extent";i:25;s:8:"language";i:26;s:6:"rights";i:27;s:14:"digital_format";i:28;s:14:"digital_origin";}}s:7:"zipfile";s:47:"/var/www/drupal/sites/default/files/test3_6.zip";}s:17:"modsToDcTransform";s:59:"sites/all/modules/islandora_batch/transforms/mods_to_dc.xsl";s:8:"deriveDC";b:1;s:27:"*newFedoraDatastreamClass";s:28:"IslandoraNewFedoraDatastream";s:24:"*fedoraDatastreamClass";s:25:"IslandoraFedoraDatastream";s:21:"*fedoraRelsExtClass";s:22:"IslandoraFedoraRelsExt";s:14:"*datastreams";a:1:{s:8:"RELS-EXT";O:28:"IslandoraNewFedoraDatastream":8:{s:21:"*fedoraRelsIntClass";s:22:"IslandoraFedoraRelsInt";s:31:"*fedoraDatastreamVersionClass";s:32:"IslandoraFedoraDatastreamVersion";s:9:"*copied";b:0;s:10:"repository";O:25:"IslandoraFedoraRepository":6:{s:13:"*queryClass";s:24:"IslandoraRepositoryQuery";s:17:"*newObjectClass";s:24:"IslandoraNewFedoraObject";s:14:"*objectClass";s:21:"IslandoraFedoraObject";s:8:"*cache";O:20:"IslandoraSimpleCache":0:{}s:2:"ri";O:24:"IslandoraRepositoryQuery":1:{s:10:"connection";O:29:"IslandoraRepositoryConnection":11:{s:3:"url";s:28:"http://localhost:8080/fedora";s:7:"cookies";b:1;s:8:"username";s:5:"admin";s:8:"password";s:55:"$S$DbwOulVhSmsBP0cu8hj5ECLc8mZVI09VA1ywYO2spVe71PUr54UI";s:10:"verifyHost";b:1;s:10:"verifyPeer";b:1;s:7:"timeout";N;s:14:"connectTimeout";i:5;s:9:"userAgent";N;s:15:"reuseConnection";b:1;s:10:"sslVersion";N;}}s:3:"api";O:18:"IslandoraFedoraApi":3:{s:1:"a";O:10:"FedoraApiA":2:{s:13:"*connection";r:174;s:13:"*serializer";O:19:"FedoraApiSerializer":0:{}}s:1:"m";O:19:"IslandoraFedoraApiM":2:{s:10:"connection";r:174;s:10:"serializer";r:189;}s:10:"connection";r:174;}}s:6:"parent";r:1;s:13:"relationships";O:22:"IslandoraFedoraRelsInt":6:{s:10:"*aboutDs";r:164;s:6:"*new";b:0;s:11:"*domCache";N;s:10:"datastream";N;s:13:"*namespaces";a:2:{s:3:"rdf";s:43:"http://www.w3.org/1999/02/22-rdf-syntax-ns#";s:9:"islandora";s:37:"http://islandora.ca/ontology/relsint#";}s:18:"nonMagicAutoCommit";b:1;}s:15:"*datastreamId";s:8:"RELS-EXT";s:17:"*datastreamInfo";a:10:{s:14:"dsControlGroup";s:1:"X";s:7:"dsState";s:1:"A";s:7:"dsLabel";s:46:"Fedora Object to Object Relationship Metadata.";s:13:"dsVersionable";b:1;s:6:"dsMIME";s:19:"application/rdf+xml";s:11:"dsFormatURI";s:43:"info:fedora/fedora-system:FedoraRELSExt-1.0";s:14:"dsChecksumType";s:8:"DISABLED";s:10:"dsChecksum";s:4:"none";s:12:"dsLogMessage";s:0:"";s:7:"content";a:2:{s:4:"type";s:6:"string";s:7:"content";s:471:"<?xml version="1.0" encoding="UTF-8"?> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:fedora="info:fedora/fedora-system:def/relations-external#" xmlns:fedora-model="info:fedora/fedora-system:def/model#" xmlns:islandora="http://islandora.ca/ontology/relsext#"> <rdf:Description rdf:about="info:fedora/islandora:20"> <fedora:isMemberOfCollection rdf:resource="info:fedora/islandora:sp_basic_image_collection"/> </rdf:Description> </rdf:RDF> ";}}}}s:13:"relationships";O:13:"FedoraRelsExt":6:{s:6:"*new";b:0;s:11:"*domCache";N;s:10:"datastream";r:164;s:13:"*namespaces";a:4:{s:3:"rdf";s:43:"http://www.w3.org/1999/02/22-rdf-syntax-ns#";s:6:"fedora";s:49:"info:fedora/fedora-system:def/relations-external#";s:12:"fedora-model";s:36:"info:fedora/fedora-system:def/model#";s:9:"islandora";s:37:"http://islandora.ca/ontology/relsext#";}s:6:"object";r:1;s:18:"nonMagicAutoCommit";b:1;}s:10:"repository";r:168;s:11:"*objectId";s:12:"islandora:20";s:16:"*objectProfile";a:4:{s:8:"objState";s:1:"A";s:10:"objOwnerId";s:5:"admin";s:8:"objLabel";s:0:"";s:13:"objLogMessage";s:0:"";}s:9:"resources";a:0:{}} [:db_insert_placeholder_3] => 14 ) in IslandoraBatchPreprocessor->addToDatabase() (line 199 of /var/www/drupal/sites/all/modules/islandora_batch/includes/preprocessor_base.inc).
DiegoPino commented 7 years ago

Ahá. Yes. I will update that tomorrow inside Batch. There is an issue with how islandora_batch_queue handles PIDs and its unique primary KEY. Silly, but what we need really is that Primary key is in fact composite of PID+set and not only the PID. The very idea that the queue can handle just a single PID at a time is wrong and the fact that after batch we are actually saving back the object that was recently ingested/updated (this is batch by the way not multi importer) again which makes "reingesting" in case something went wrong impossible, is 100%x2 wrong. So, yeah. It does not hurt right now but you will get that update tomorrow early in the morning, its just an update hook! thanks for testing, this is great!

kimpham54 commented 7 years ago

@DiegoPino Update is working for me as well. Tried with csv upload and with Google Sheets. Really awesome stuff thank you!

I'm getting the theme_tableselect() error msg in this branch as well, but apparently @Natkeeran isn't encountering that error.