omeka-s-modules / Omeka2Importer

Import items from an Omeka 2 site into Omeka S
GNU General Public License v3.0

Testing Collections Fixes et al. #96

Closed patrickmj closed 7 years ago

patrickmj commented 7 years ago

This testing round involves efforts to address #89 #92 #95 and other quirks we've discovered.

If it's possible to confirm that those issues are fixed, that is best in life.

It adds an 'Update previous import' option. If checked, new data will overwrite the old data. If not, previously imported items will be left untouched. New data from the original site will be added.
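
A minimal sketch of the behaviour described above, written in Python rather than the importer's own PHP. The names `apply_import`, `existing`, and `incoming` are hypothetical, and it assumes records are keyed by their ID on the origin site; it is an illustration, not the importer's actual code.

```python
def apply_import(existing, incoming, update_previous):
    """Return the item store after one import run.

    existing: dict of origin-site ID -> item data already in Omeka S
    incoming: dict of origin-site ID -> item data fetched from the origin site
    update_previous: mirrors the 'Update previous import' checkbox
    """
    result = dict(existing)  # leave the caller's data untouched
    for origin_id, data in incoming.items():
        if origin_id not in result:
            # New on the origin site: always added.
            result[origin_id] = data
        elif update_previous:
            # Previously imported: overwritten only when the option is checked.
            result[origin_id] = data
    return result
```

With the option unchecked, an item that was already imported keeps its Omeka S data, while items new to the origin site still come in.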

So, there are a lot of variables going on, including, but not limited to

There used to be an effort to apply dcterms:isPartOf relations. I've removed that since it adds a level of complication that gets too deep into the weeds (and I'm pretty sure never worked right anyway).

The list of use cases gets pretty long, so I don't want to try to recreate everything I can imagine. Just go with the various sites you have, and see if it roughly works as expected as you try out the permutations of import options:

  1. Item Set assignment settings
  2. Import Collections setting
  3. Update setting

Omeka2Import branch: develop
Omeka S tag: v1.0.0-beta2
Origin Omeka Classic: anything with the API open

patrickmj commented 7 years ago

There's a regression somewhere that breaks develop, so for now this should wait.

Now fixed

mebrett commented 7 years ago

Is it essential that core be on master? When I switch to master, all the modules display Error: invalid config/module.ini file (see http://dev.omeka.org/omeka-s/admin/module)

zerocrates commented 7 years ago

Master shouldn't be required but 1.0.0-beta2 would be.

patrickmj commented 7 years ago

I've updated the original issue to reflect using tag v1.0.0-beta2

mebrett commented 7 years ago

Confirming that 'Update previous import' overwrites changes on S with any changes from the Classic install

mebrett commented 7 years ago

Tried importing from http://cornishmemory.com/api with Import Collections checked, and the collections were added as item sets but no items were imported. (Fair warning, that's a rather large site)

Edited to add: the import worked on a Classic site, although it took a few hours (see above comment about size)

mebrett commented 7 years ago

Changes to the metadata for a collection are not imported. I've tried checking both "Update" and "Import Collections" and just "Update" and the changes don't come through.

mebrett commented 7 years ago

For the record, the problem where GD thumbnailer exceeds memory persists (see logs for Jobs 367 and 368 for example). This relates to server setup, not the importer specifically, but it would be useful to note in the documentation (maybe in a troubleshooting section) that file manager configuration and server configuration can cause errors.

mebrett commented 7 years ago

Question about undoing imports and the Update feature. If I want to remove all the items imported from a site, the easiest way is to undo the import; do I have to find all instances of importing from that site, or (if I have consistently checked the "update" box) do I only have to delete one job in order to undo the entire import?

patrickmj commented 7 years ago

Good question. I think that it should work as follows: if you undo an import, it will delete any items that came in from that import. If, after that import, you added more items to the origin site and re-imported, those will only be deleted upon undoing that later import.
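
To make that expectation concrete, here is a small Python sketch. It assumes each imported item records the job that created it in a hypothetical `job` field; `undo_import` is likewise an illustrative name, not a function in the importer.

```python
def undo_import(items, job_id):
    """Keep every item except those created by the undone import job."""
    return [item for item in items if item["job"] != job_id]


items = [
    {"title": "A", "job": 1},
    {"title": "B", "job": 1},
    {"title": "C", "job": 2},  # added to the origin site later, brought in by job 2
]

# Undoing job 1 removes A and B; C stays until job 2 is undone as well.
remaining = undo_import(items, 1)
```

Under this model, removing everything imported from a site means undoing every job that brought items in from it.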

mebrett commented 7 years ago

Thanks - I'll test that out.

mebrett commented 7 years ago

Undoing an import does delete all the items, but not the item sets, created by that import

patrickmj commented 7 years ago

Interesting side-note on Cornish Memory. They have their API set to return 1000 records per page, much more than the default. Kinda surprised that this hasn't caused more errors on our end, actually. Also reflects @jajm's request in #56

patrickmj commented 7 years ago

@mebrett in your import from Cornish Memory that didn't import items, what was the status of the import on the past imports page? And, what's in the log, if anything?

mebrett commented 7 years ago

No log: http://dev.omeka.org/omeka-s/admin/job/349

patrickmj commented 7 years ago

Did you subsequently delete the item sets that were created on the import from Cornish Memory?

mebrett commented 7 years ago

Yes, manually. Do you want me to run it again?

patrickmj commented 7 years ago

blergh....sounds like a lot of manual deletions! Might be premature to run it again. I got errors in the log when I ran it, so I'd like to see if I can sort that out and/or reproduce it first.

patrickmj commented 7 years ago

@mebrett The develop branch of Omeka2Import now has a "Per Page" option, so you can tell it to only try to grab a certain number of records for each request. This way, instead of waiting for 1000 records from Cornish Memory, you can force it to only request 10 or whatever.

This should at least make it clear whether it is actually progressing, and hopefully speed up getting data into the logs.
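
As a rough illustration of what a per-page setting does on the client side, here is a Python sketch. It assumes the origin API accepts `page` and `per_page` query parameters; `page_url`, `iter_records`, and the injected `fetch_page` callable are hypothetical names, and the importer's own PHP code may be organised quite differently.

```python
from urllib.parse import urlencode


def page_url(api_base, resource, page, per_page):
    """Build the URL for one page of results."""
    query = urlencode({"page": page, "per_page": per_page})
    return f"{api_base.rstrip('/')}/{resource}?{query}"


def iter_records(fetch_page, api_base, resource, per_page=10):
    """Yield records one small page at a time until an empty page comes back.

    fetch_page is any callable that takes a URL and returns a list of records.
    """
    page = 1
    while True:
        records = fetch_page(page_url(api_base, resource, page, per_page))
        if not records:
            break
        yield from records
        page += 1
```

Because each request is small, progress shows up after every page instead of only after one very large response.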

patrickmj commented 7 years ago

So far, I'm only getting things like network timeouts and GD or ImageMagick complications. Those are relevant for us to deal with somehow, but not huge bugs. On the Omeka2Importer side, it looks like figuring out how to handle network timeouts is the direction for now.
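
One common way to handle timeouts is to retry the request a few times with a growing delay. The Python sketch below shows that idea only; `fetch_with_retry` and its parameters are hypothetical, and nothing here is taken from the importer itself.

```python
import time


def fetch_with_retry(fetch, url, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying on TimeoutError with exponential backoff.

    fetch is any callable that returns the response or raises TimeoutError.
    The last failure is re-raised so the caller can log it and stop the job.
    """
    for attempt in range(attempts):
        try:
            return fetch(url)
        except TimeoutError:
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, then 4x, ... before the next try.
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` in as a parameter keeps the function easy to exercise without real waiting.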

mebrett commented 7 years ago

Per page seems to be resolving the issue with the Cornish Memory API. Thousands of items slowly being added.

patrickmj commented 7 years ago

Hooray! I'm guessing a network timeout or running out of resources was the culprit.

mebrett commented 7 years ago

For the most part, everything works.

What doesn't, quite:

  1. Changes to the metadata for a collection are not imported when updating a previous import, even with "import collections" also checked.
  2. Undoing an import does not remove the item sets created from its collections.

patrickmj commented 7 years ago

Excellent!

The thing about removing item sets created from collections in 2 is tricky. There's always the possibility of importing and creating item sets from collections, then adding other items to the item set natively in Omeka S, then undoing an import. That would destroy lots of structure, so I want to avoid that. Instead, that should be something people will have to manage on their sites -- hopefully with the help of a batch delete!

Changes to collection data not being imported is a bug, now #97. Any details about how to reproduce it would be much appreciated.

mebrett commented 7 years ago

Pulled to the latest version of the importer and tried to import from my demo site: with key & import collections (Job 392); with key and not importing collections (393); and without key or collections (394). All returned the following error:

2017-03-17T19:05:53+00:00 INFO (6): Importing item page 1
Fatal error: Argument 4 passed to Omeka\Api\Manager::batchCreate() must be of the type array, boolean given, called in /websites/omekadev/home/www/omeka-s/modules/Omeka2Importer/src/Job/Import.php on line 226 and defined in /websites/omekadev/home/www/omeka-s/application/src/Api/Manager.php on line 93

Nothing was imported.

Edited to add: apparently related to a change in the develop branch of master. Will roll back a bit and keep testing on that.

mebrett commented 7 years ago

Just checked importer against v1.0.0-beta3 (importer on develop) and everything worked fine, including updating changes to collection metadata.