scientist-softserv / atla-hyku

Bulkrax Export & Roundtrip issues & questions #130

Open ckarpinski opened 11 months ago

ckarpinski commented 11 months ago

I re-exported and removed the page works. You can see the duplicates: all were created as GenericWorks. (I didn't have a work type column; nothing said this was required and I did not intend to change it. I think the issue here is that it created new works instead of updating the existing ones. I found in the documentation that for NEW works, if you don't include a work type, it will use generic.) https://docs.google.com/spreadsheets/d/1MCxl6neVOfzTBr3Kqifm6-1YRJ_V9eNvqS5N473fBcg/edit?usp=sharing

Here is what I used to reimport: https://docs.google.com/spreadsheets/d/1yqsTtamYUb8CL5iJFjl5471bhCO70ZQHGKO6S9MB1rg/edit?usp=sharing

TESTING on dev and staging: I created 3 new imports with errors, and it still created the works that did not have errors. It used to, and is supposed to, fail the entire importer if there are any errors.

Questions

orangewolf commented 10 months ago
ckarpinski commented 10 months ago

Previously, I am pretty sure (I tested it), all of our required fields had to be present for the import to be successful AND our controlled vocab fields had to use the controlled vocab. Meaning:

Required fields must be there or it fails; the controlled vocab must be used or it fails, for both required and optional fields.

Source id: I think if someone creates a work and gives it a source id they created, it needs to keep that source id. I think it would be confusing to have it change. OR would it be that they don't have to create one, and the importer assigns one that they can then see when it's exported? I would prefer not to have to make source identifiers in the work form. For a random student adding a work, it seems like an odd thing for them to create.

LAST ONE: yes, Crystal specifically asked me about this and I swear I tested it and it worked. We want the import to fail so that they do it correctly. If only the wrong work fails, they may not realize there was an error. So the plan was: fail the import, then they go see what the error is and fix it. When this was first set up I tested this and it worked. I used the same importer test recently (mentioned above) and it no longer worked that way.
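
To make the expected behaviour concrete, here is a minimal sketch of the "fail the whole import if any row is invalid" rule described above. It is plain Ruby, not Bulkrax code: the method names, the `required:` and `vocabularies:` parameters, and the field names in the usage example are all hypothetical, and rows are assumed to be hashes already parsed from the CSV.

```ruby
# Hypothetical sketch, not Bulkrax's API. Rows are hashes parsed from a CSV.
# Returns a hash of CSV line number => list of error messages.
def validate_rows(rows, required:, vocabularies:)
  errors = {}
  rows.each_with_index do |row, index|
    row_errors = []
    # Every required field must be present and non-blank.
    required.each do |field|
      row_errors << "missing required field #{field}" if row[field].to_s.strip.empty?
    end
    # Any controlled field that has a value must use an allowed term,
    # whether the field is required or optional.
    vocabularies.each do |field, allowed|
      value = row[field].to_s.strip
      next if value.empty?
      row_errors << "#{field} value #{value.inspect} is not in the controlled vocabulary" unless allowed.include?(value)
    end
    # +2 assumes a header on line 1 and 1-based line numbers.
    errors[index + 2] = row_errors unless row_errors.empty?
  end
  errors
end

# All-or-nothing: no row is imported unless every row is valid.
def import_all_or_nothing(rows, required:, vocabularies:)
  errors = validate_rows(rows, required: required, vocabularies: vocabularies)
  return { imported: [], errors: errors } unless errors.empty?
  { imported: rows, errors: {} }
end
```

For example, with `required: ["source_identifier", "title"]` and `vocabularies: { "language" => ["English", "Spanish"] }`, a file in which one row has a blank title would import nothing and report that row's line number, so the person importing has to fix the file and rerun.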

ckarpinski commented 10 months ago

NOTES from slack

orangewolf commented 10 months ago
orangewolf commented 10 months ago

run rake cleanup:source_identifier when this code goes to production

crisr15 commented 10 months ago

Passes internal QA:
Works that fail validation are now failing on import. Items with the same source identifier/ID are not duplicated when reimported.
https://crystal.atla-hyku.notch8.cloud/importers?locale=en
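
One way to spot-check the no-duplicates result on an exported CSV is to group the rows by source identifier and list any identifier that appears more than once. This is only a sketch: the method name is hypothetical, and the `source_identifier` and `model` column names are assumed from this thread and may differ in a given export.

```ruby
# Hypothetical helper. Rows are hashes parsed from an exported CSV.
# Returns source identifier => list of models for every identifier that
# appears on more than one row.
def duplicate_source_identifiers(rows)
  rows.group_by { |row| row["source_identifier"] }
      .select { |_id, group| group.length > 1 }
      .transform_values { |group| group.map { |row| row["model"] } }
end
```

An empty result means no identifier is repeated. A result that lists an identifier with two different models would match the duplicate-created-as-GenericWork pattern reported earlier in this issue.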

ckarpinski commented 10 months ago

Questions

ckarpinski commented 10 months ago

When updating existing works by CSV:

This seems problematic - Is this a known issue?

Follow-up: I exported all the works created, and reimporting created a duplicate work with the exact same source id, because the update did not include the model field. You can see an example highlighted here: https://docs.google.com/spreadsheets/d/18occJ3dr0VQiTzq3ifwQDm6kzP0E-HCcXbNUpFDlTqc/edit?usp=sharing

The green-highlighted row was supposed to update but instead created a new work with the same source id, because the update did not include the model field.

This was on staging here: https://demo.atla-hyku.notch8.cloud/catalog?utf8=%E2%9C%93&search_field=all_fields&q=
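
The behaviour expected in this comment can be sketched as an update-or-create keyed only on the source identifier, where a missing model column keeps the existing work's model and the GenericWork default applies only to new works. This is an illustration in plain Ruby, not how Bulkrax is implemented: `upsert_work` and the in-memory `store` hash are hypothetical stand-ins for the importer and the repository.

```ruby
# Hypothetical sketch. `store` maps source identifier => work attributes.
def upsert_work(store, row, default_model: "GenericWork")
  id = row.fetch("source_identifier")
  model = row["model"].to_s.strip
  existing = store[id]

  if existing
    # Update in place. A missing or blank model column leaves the
    # existing work's model untouched instead of creating a new work.
    attrs = row.reject { |key, _value| key == "model" }
    store[id] = existing.merge(attrs)
    store[id]["model"] = model unless model.empty?
    :updated
  else
    # Only brand-new works fall back to the default work type.
    store[id] = row.merge("model" => (model.empty? ? default_model : model))
    :created
  end
end
```

Under this rule, reimporting an exported row without a model column would update the matching work and leave exactly one work per source id.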