eprints / orcid_support_advance

ORCID Support Advance plugin

WP2: Exporting to orcid.org #12

Open wfyson opened 6 years ago

wfyson commented 6 years ago

Pushing to orcid.org is to be reimplemented using an EPrint commit trigger. For each creator listed in the EPrint, check whether they have given permission to write to their orcid.org profile. If the eprint is in the live archive and a field corresponding to one stored on orcid.org has been updated, push the record to orcid.org.
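A minimal sketch of the "should this commit be pushed?" decision described above. It is plain Perl with no EPrints calls; `should_push` and the `%ORCID_FIELDS` list are hypothetical names chosen for illustration, and the per-creator permission check is left out.

```perl
use strict;
use warnings;

# Hypothetical list of eprint fields that map onto an ORCID work record.
my %ORCID_FIELDS = map { $_ => 1 } qw( title creators date type abstract );

# Return 1 when a committed eprint should be pushed to orcid.org:
# it must be in the live archive and at least one changed field must be relevant.
sub should_push {
    my ($eprint_status, $changed) = @_;    # $changed: hashref keyed by changed field name
    return 0 unless defined $eprint_status && $eprint_status eq 'archive';
    return scalar( grep { $ORCID_FIELDS{$_} } keys %$changed ) ? 1 : 0;
}
```

Inside a commit trigger this would be called with the eprint's `eprint_status` value and the set of changed fields, before looping over the creators.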

Items with an appropriate orcid.org put-code should be updated on orcid.org (i.e. PUT request). New items should be added (POST request). Newly created items on orcid.org should return their put-code, which is then stored on the EPrints record for later updates.

Automatic use of the trigger should be configurable at a user account level (or completely optional for repositories not wanting to use it).
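The PUT/POST choice can be sketched as below. `orcid_work_request` is a hypothetical helper, and the URL layout (`<base>/<orcid>/work` for new works, `<base>/<orcid>/work/<put-code>` for updates) is assumed to match the ORCID API version in use.

```perl
use strict;
use warnings;

# Hypothetical helper: pick the HTTP method and URL for exporting one work.
# A stored put-code means the work already exists on orcid.org, so update it.
sub orcid_work_request {
    my ($api_base, $orcid, $put_code) = @_;
    my $url = "$api_base/$orcid/work";
    return ( 'PUT', "$url/$put_code" ) if defined $put_code && length $put_code;
    return ( 'POST', $url );
}
```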

dennmuel commented 6 years ago

We're working on the POST/PUT part at the moment and would be interested in some info about this. Does the order of the response tags from ORCID (error:error or work:work) to a bulk request correspond to the order of the committed works? (That way we could reliably assign the put-code to the exported item and either display a different message or save it for a later PUT request.) If not, do you think it would be a practical solution to move away from bulk requests, exporting each item separately and checking its return code? I'm sure this would work in general, but I fear it would make the export rather slow. Do you have any thoughts on this?

wfyson commented 6 years ago

The plan at present is to change this from a bulk export to processing each item separately. This will help us get around the POST/PUT issues you describe above and will let these exports happen automatically if we implement them as an EPrint commit trigger. Hopefully this shouldn't make the export too slow, as only those records that need exporting to orcid.org will be exported.

dennmuel commented 6 years ago

Thanks for the info, @wfyson :) I switched from bulk requests to separate calls. Looking for the put-code of a successfully POSTed item, I discovered that the XML response from the ORCID sandbox API was empty. That does not happen with PUT or unsuccessful POST attempts. Luckily, we're still able to get the put-code via the Location field in the HTTP header, but I was wondering if you have come across this as well. It would be nicer to parse an XML response directly instead of using a Perl regex to retrieve the put-code from the Location URL.

EDIT: ORCID told me that this is expected behaviour and that one would need to issue a separate GET request to get the XML response containing the put-code. However, one would already need to have the put-code to issue a valid GET request in the first place. I'll just stick with the regex, then...

For the case of items that have already been posted to ORCID but whose put-code was not saved: if we try to post them again, a 409 response is sent. This, however, does not contain the put-code in any form. So there is currently no way to update this kind of record without having to delete it in ORCID. The ORCID support team was so kind as to add the request to return the put-code in a 409 response to their short-term development road map. They hope to be able to add that feature in the next month or two.
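A sketch of the regex approach, assuming the Location header ends in `/work/<put-code>` with a numeric put-code. `put_code_from_location` is a hypothetical name, not the code actually used in the plugin.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the numeric put-code off the end of the
# Location header returned by a successful POST.
sub put_code_from_location {
    my ($location) = @_;
    return undef unless defined $location;
    return $location =~ m{/work/(\d+)/?$} ? $1 : undef;
}
```

Returning undef for anything that does not match lets the caller treat a missing put-code as a failed export rather than storing a bogus value.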

dennmuel commented 6 years ago

Hey @wfyson, do you know a way to link each put-code of an item to the ORCID or userid of the user it belongs to? We would need that to enable updating records in ORCID via PUT requests for more than one user. Ideally we would just add an orcid/userid column to the eprint_orcid_put_codes table, but I wouldn't know how to do that without directly manipulating the database with SQL. Is it possible to somehow enhance the respective multiple field? Our current solution is to save the put-code in eprint_orcid_put_codes as well as in a new multiple field / table user_orcid_work_put_codes, compare the contents of the two, and use the put-code that appears in both tables. That is neither elegant nor efficient, so any help would be appreciated. :)
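For reference, the two-table workaround boils down to an intersection of two lists. A sketch under that assumption (`shared_put_code` is a hypothetical name; the arguments stand in for the values read from eprint_orcid_put_codes and user_orcid_work_put_codes):

```perl
use strict;
use warnings;

# Hypothetical helper: return the first put-code stored on the eprint that
# also appears in the user's list, or undef if the two lists share nothing.
sub shared_put_code {
    my ($eprint_codes, $user_codes) = @_;
    my %owned = map { $_ => 1 } @$user_codes;
    my @both = grep { $owned{$_} } @$eprint_codes;
    return $both[0];
}
```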

wfyson commented 6 years ago

Hi @dennmuel ,

We might be able to store the put codes in the creators table. We'd need to modify wherever the 'creators' field is defined and add a new put_code field so that it looks like the following:

{
        name => 'creators',
        type => 'compound',
        multiple => 1,
        fields => [
                {
                        sub_name => 'name',
                        type => 'name',
                        hide_honourific => 1,
                        hide_lineage => 1,
                        family_first => 1,
                        render_single_value => 'render_initialised_name',
                },
                {
                        sub_name => 'id',
                        type => 'text',
                        input_cols => 20,
                        allow_null => 1,
                },
                {
                        sub_name => 'orcid',
                        type => 'orcid',
                        input_cols => 19,
                        allow_null => 1,
                },
                {
                        sub_name => 'put_code',
                        type => 'text',
                        allow_null => 1,
                        show_in_html => 0, #we don't need this field to appear in the workflow
                        export_as_xml => 0, #nor do we want it appearing in exports
                },
        ],
        input_boxes => 4,
},

With the above we can have a put code stored next to each creator with an ORCID at the eprint level, and we can look up users using either their ORCID or the value in the 'ID' column. However, I am a bit wary of adding more fields to the creators table as it can get a little complicated at times, especially when adding a field via a bazaar plugin - it'll make the installation a bit more complicated, but then we already face this challenge with the regular ORCID Support plugin!
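To illustrate the lookup this would allow, here is a sketch that works on a plain array of hashes shaped like the compound field's value. It assumes the new subfield is keyed as `put_code`, and `put_code_for_orcid` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: find the put-code stored alongside a given creator.
# $creators is an arrayref of hashrefs, one per creator row.
sub put_code_for_orcid {
    my ($creators, $orcid) = @_;
    for my $creator (@$creators) {
        next unless defined $creator->{orcid} && $creator->{orcid} eq $orcid;
        return $creator->{put_code};
    }
    return undef;
}
```

Because each creator row carries its own put-code, the same eprint can be updated on several users' orcid.org records without a second table.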

dennmuel commented 6 years ago

Hey @wfyson thanks for the quick reply, I'll play around with that approach and get back to you (hopefully with some pull requests) soon.

dennmuel commented 6 years ago

Hi @wfyson sorry for double posting, but I think I need some clarification on this.

Is it correct that this approach would render the existing eprint field orcid_put_code obsolete? I guess we could save the put-code twice, but I don't think there is an operation with the existing field that we couldn't do with the new subfield.

As you pointed out, adding subfields to the creators table might be tricky. Couldn't we instead turn the existing putcode field into a compound field having subfields for the putcode and the ORCID id?

Either way, how would we handle the case of instances that have already imported and exported using the current version? Might there be cases where switching to the subfield would cause problems in identifying duplicates or something similar?

wfyson commented 5 years ago

Updating of existing works via put-code is now available in v1.5, but updating other creators' orcid.org records still needs looking into.