Purpose
This PR adds support for indexing and querying publisher object values in Elasticsearch, as well as support for Instrument and StudyRegistration resourceTypeGeneral values in Elasticsearch queries. It also adds support for querying organizations with matching ROR publisherIdentifiers via the GraphQL API, and adds a test for 4.5 XML at the /dois/validate endpoint.
Approach
Adds a new publisher_obj attribute to the ES mapping in the Doi model. The existing publisher attribute in the mapping remains unchanged. If query strings for publisher subproperties are detected (e.g. publisher.publisherIdentifier), a regex gsub replaces publisher with publisher_obj so that the correct attribute is queried. No changes to the query_fields are necessary because the publisher property will continue to exist and be queryable.
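The rewrite step described above can be sketched in plain Ruby. This is only an illustration under stated assumptions: the method name `rewrite_publisher_subfields` and the exact pattern are hypothetical and are not taken from this PR's code. The sketch shows how a gsub can redirect subproperty queries to publisher_obj while leaving bare publisher queries untouched.

```ruby
# Hypothetical helper, not the PR's actual implementation.
# Rewrites "publisher.<subfield>" to "publisher_obj.<subfield>" so that
# subproperty queries target the object mapping, while a bare "publisher:"
# term keeps targeting the existing publisher field.
def rewrite_publisher_subfields(query)
  return query if query.nil?

  # (?<![\w.]) skips matches inside longer names such as "my_publisher.name"
  # or "foo.publisher.name"; (?=\.\w) requires a subproperty to follow.
  query.gsub(/(?<![\w.])publisher(?=\.\w)/, "publisher_obj")
end

puts rewrite_publisher_subfields("publisher.publisherIdentifier:*ror.org*")
puts rewrite_publisher_subfields("publisher:DataCite")
```

A query that already uses publisher_obj passes through unchanged, because the lookahead only matches when a dot immediately follows the word publisher.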
The serializer is modified to check for the existence of publisher_obj, which will be present when querying ES but not when retrieving single Doi objects from the database.
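A minimal sketch of that existence check, assuming the serializer receives either an ES-backed result that exposes publisher_obj or a database-backed record that does not. The `EsDoc` and `DbDoi` structs and the `publisher_for` method are hypothetical stand-ins for illustration, not the serializer code in this PR.

```ruby
# Hypothetical stand-ins for an ES-backed result and a database-backed record.
EsDoc = Struct.new(:publisher, :publisher_obj)
DbDoi = Struct.new(:publisher)

# Prefer publisher_obj when the object exposes it and it is populated;
# otherwise fall back to the existing publisher attribute.
def publisher_for(object)
  if object.respond_to?(:publisher_obj) && !object.publisher_obj.nil?
    object.publisher_obj
  else
    object.publisher
  end
end

# The identifier below is a placeholder, not a real ROR ID.
es_doc = EsDoc.new("Example Publisher",
                   { "name" => "Example Publisher",
                     "publisherIdentifier" => "https://ror.org/0example00" })
db_doi = DbDoi.new("Example Publisher")

p publisher_for(es_doc)
p publisher_for(db_doi)
```

The nil check also covers ES documents indexed before the new mapping existed, which would expose the attribute without a value in this sketch.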
Modifies the organization_id attribute to include ROR publisherIdentifiers if available. Organization queries in GraphQL already search this field, so no other modifications are necessary to include DOIs with matching publisherIdentifiers in an organization's list of works. As a consequence, DOIs that are related to DMPs with a matching publisherIdentifier will also appear in an organization's list of works.
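One way to picture the organization_id change is the sketch below. It assumes the publisher is stored as a hash with publisherIdentifier and publisherIdentifierScheme keys and that existing organization identifiers arrive as an array; the `organization_ids` method name is hypothetical, and any identifier normalization the real code performs is left out.

```ruby
# Hypothetical sketch, not the PR's actual implementation.
# Appends the publisher's identifier to the organization ids when the
# publisher is an object whose identifier scheme is ROR.
def organization_ids(existing_ids, publisher)
  ids = Array(existing_ids).dup

  if publisher.is_a?(Hash) && publisher["publisherIdentifierScheme"] == "ROR"
    identifier = publisher["publisherIdentifier"].to_s
    ids << identifier unless identifier.empty?
  end

  ids.uniq
end

# The identifiers below are placeholders, not real ROR IDs.
publisher = {
  "name" => "Example Publisher",
  "publisherIdentifier" => "https://ror.org/0example00",
  "publisherIdentifierScheme" => "ROR",
}
p organization_ids(["https://ror.org/0existing0"], publisher)
```

Because the organization queries already search this field, adding the identifier at index time is enough for matching DOIs to surface in the organization's list of works.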
Adds Instrument and StudyRegistration to resourceTypeGeneral lists for query faceting and the /resource-types endpoint.
Adds tests for the resource-type-id filter covering the new Instrument and StudyRegistration resourceTypes.
Open Questions and Pre-Merge TODOs
[ ] Stage, test, and production will need to be re-indexed.
[ ] From my analysis and local testing, this should be deployable without re-indexing in advance of release. After re-indexing on master and switching over to schema-4.5-r3 locally, I was able to create and update ES documents without updating the mapping in the active dois index, and the serializer did not crash when processing documents created/updated before and after the branch switchover. Of course, once released, we would still need to update the mapping and re-index so the new publisher/publisher_obj values are queryable. Does this seem like the right assessment?
Learning
Types of changes
[ ] Bug fix (non-breaking change which fixes an issue)
[x] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to change)
Reviewer, please remember our guidelines:
Be humble in the language and feedback you give, ask don't tell.
Consider using positive language as opposed to neutral when offering feedback. This is to avoid the negative bias that can occur with neutral language appearing negative.
Offer suggestions on how to improve code e.g. simplification or expanding clarity.
Ensure you give reasons for the changes you are proposing.
closes: #1012 #1020 datacite/akita#311 datacite/akita#298