Be humble in the language and feedback you give, ask don't tell.
Consider using positive language as opposed to neutral when offering feedback. This is to avoid the negative bias that can occur with neutral language appearing negative.
Offer suggestions on how to improve code e.g. simplification or expanding clarity.
Ensure you give reasons for the changes you are proposing.
Purpose
Preserve non-URL affiliation identifiers, for example:
affiliation_identifier = "05bp8ka05"
closes: https://github.com/datacite/bolognese/issues/152
Approach
Root Cause: The problem arises during the normalization of affiliation identifiers. The normalization process checks whether an identifier is a URL. If it is not, the process returns nil, so affiliation identifiers that are not URLs are removed. https://github.com/datacite/bolognese/blob/b0a7df3c9dd6a45eaf56fd0e06d304e4db9b837d/lib/bolognese/utils.rb#L649
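To illustrate the root cause, here is a minimal sketch of a URL-only normalizer. The method name simplified_normalize_id is hypothetical and the body is a simplified stand-in, not the actual normalize_id from bolognese; it only reproduces the behaviour described above, where anything that is not a URL comes back as nil.

```ruby
require "uri"

# Hypothetical, simplified stand-in for normalize_id (assumption, not the
# library's implementation): keeps http(s) URLs, returns nil for the rest.
def simplified_normalize_id(id)
  value = id.to_s.strip
  return nil if value.empty?

  uri = URI.parse(value)
  # Only absolute http(s) URLs with a host are treated as valid identifiers.
  return nil unless uri.is_a?(URI::HTTP) && uri.host

  uri.to_s
rescue URI::InvalidURIError
  nil
end

url_result  = simplified_normalize_id("https://ror.org/05bp8ka05")
bare_result = simplified_normalize_id("05bp8ka05")

puts url_result.inspect   # the URL form is kept
puts bare_result.inspect  # the bare identifier is dropped (nil)
```

With this kind of check, the bare identifier "05bp8ka05" is lost, which matches the behaviour reported in the issue.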
Solution: Changing the normalize_id method is not practical because it is used in multiple places. Since this issue is specific to affiliation_identifier, the fix is made within the get_affiliations method instead. Inside get_affiliations, we added an if statement that checks whether the identifier is present and is not a URL; in that case, a value for the identifier is created explicitly.
How to reproduce locally:
I have added a detailed explanation here: https://github.com/datacite/bolognese/issues/152#issuecomment-1729422712
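The check described in the Solution can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: normalize_url_only and affiliation_identifier_for are hypothetical names, and the real get_affiliations method handles more fields than the identifier.

```ruby
require "uri"

# Hypothetical URL-only normalizer (assumption): nil for anything that is
# not an absolute http(s) URL.
def normalize_url_only(value)
  uri = URI.parse(value)
  uri.is_a?(URI::HTTP) && uri.host ? uri.to_s : nil
rescue URI::InvalidURIError
  nil
end

# Hypothetical, simplified version of the added check: if the identifier is
# present but not a URL, keep its value instead of dropping it.
def affiliation_identifier_for(raw_identifier)
  identifier = raw_identifier.to_s.strip
  return nil if identifier.empty?

  normalized = normalize_url_only(identifier)
  # Fallback for non-URL identifiers such as "05bp8ka05".
  normalized.nil? ? identifier : normalized
end

puts affiliation_identifier_for("05bp8ka05").inspect
puts affiliation_identifier_for("https://ror.org/05bp8ka05").inspect
puts affiliation_identifier_for(nil).inspect
```

Keeping the fallback local to the affiliation code leaves the shared normalizer untouched, so other callers that rely on nil for non-URL input are not affected.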
Another minor fix added while fixing the original issue:
scheme_uri = a["SchemeURI"]
Here the character S was capitalized, whereas the XML metadata file uses a lowercase s. This affected how scheme_uri was returned in the response for some properties.
Open Questions and Pre-Merge TODOs
Learning
Types of changes
[x] Bug fix (non-breaking change which fixes an issue)
[ ] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to change)
Reviewer, please remember our guidelines: