inveniosoftware / invenio-rdm-records

DataCite-based data model for InvenioRDM flavour.
https://invenio-rdm-records.readthedocs.io
MIT License
15 stars 87 forks

Preferences Locale Problem #1849

Open utnapischtim opened 2 weeks ago

utnapischtim commented 2 weeks ago

Package version (if known): current

Describe the bug

If a user sets the Preferences locale to a value other than en, the worker cannot create valid JSON and the DOI registration on DataCite fails.

Steps to Reproduce

  1. go to the user menu page
  2. set the Preferences locale to a value other than en, e.g. de
  3. create a new record
  4. add a custom award
  5. add a custom title to the award
  6. try to publish (publishing works, but DOI creation fails)

Expected behavior

The DOI will be registered.

Problem

The Preferences locale setting changes the default language value that is used to store the award title. It should look like {'title': {'en': 'fasr'}, 'number': '2345'}, but it looks like {'title': {'de': 'fasr'}, 'number': '2345'}. This then becomes a problem here: the award title cannot be found because there is no en key in title.

To be fair, this could still produce a valid object, but it doesn't: the nested missing value is not removed, and therefore the json.dumps here fails with TypeError: Object of type _Missing is not JSON serializable.
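The failure described above can be reproduced in isolation. The sketch below is a simplification and not the actual serializer code: `_Missing` is a local stand-in for marshmallow's `missing` sentinel (so the snippet needs only the standard library), and `serialize_award_en_only` is a hypothetical function that looks up only the `en` title.

```python
import json


class _Missing:
    """Local stand-in for marshmallow's `missing` sentinel (assumption)."""

    def __repr__(self):
        return "<missing>"


missing = _Missing()


def serialize_award_en_only(award):
    """Hypothetical serializer that only knows about the 'en' title."""
    funding_ref = {"awardNumber": award.get("number", missing)}
    # With a 'de' title there is no 'en' key, so the sentinel is stored.
    funding_ref["awardTitle"] = award.get("title", {}).get("en", missing)
    return funding_ref


# Award as stored when the Preferences locale is set to de.
award = {"title": {"de": "fasr"}, "number": "2345"}
funding_ref = serialize_award_en_only(award)

try:
    json.dumps(funding_ref)
    error = None
except TypeError as exc:
    # The nested sentinel was never stripped, so json.dumps rejects it.
    error = str(exc)

print(error)
# Object of type _Missing is not JSON serializable
```

The point of the sketch is that the sentinel survives inside the nested dict; unless something removes it before `json.dumps` is called, serialization fails even though publishing itself succeeded.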

Solutions

1.) Using current_i18n.locale, which is used for the default locale here, is not possible because the record dump that registers the DOI is done in the worker, which is not aware of the user settings.

2.) Remove the feature that makes the award title language-aware and change the serialize function accordingly.

3.) Change the serializer to funding_ref["awardTitle"] = [award_title for _, award_title in award.get("title", {}).items()][0]
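A minimal sketch of option 3, assuming the award is a plain dict shaped like the examples in this issue. The helper names `first_award_title` and `serialize_award` are hypothetical and not part of invenio-rdm-records. Note that the one-liner in option 3 raises IndexError when the title dict is empty, so this version falls back to omitting the key instead.

```python
def first_award_title(award):
    """Return the first stored title regardless of its language key.

    Returns None when the award has no title at all.
    """
    titles = award.get("title") or {}
    return next(iter(titles.values()), None)


def serialize_award(award):
    """Build a funding reference dict, skipping absent values (sketch)."""
    funding_ref = {}
    if "number" in award:
        funding_ref["awardNumber"] = award["number"]
    title = first_award_title(award)
    if title is not None:
        # Only set the key when a title exists, so no sentinel is stored.
        funding_ref["awardTitle"] = title
    return funding_ref


german = serialize_award({"title": {"de": "fasr"}, "number": "2345"})
english = serialize_award({"title": {"en": "fasr"}, "number": "2345"})
untitled = serialize_award({"title": {}, "number": "2345"})
```

Because absent values are never written into the dict, the result stays JSON serializable whichever locale the title was stored under.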

Side bugs

fenekku commented 2 weeks ago

Yes, approach 3) makes sense to me. To also account for the missing side bug and the possible absence of any title and/or lang, and to split up some responsibilities in that file a bit more, the following pattern can be extended:


from marshmallow import Schema, fields, missing

class FundingReferenceSchema(Schema):
    # fill with other fields ...
    awardTitle = fields.Method("get_title", data_key="title")

    def get_title(self, obj):
        """Serialize title

        :param obj: A metadata.funding dict
        :type obj: dict
        :return: serialized funding title
        :rtype: dict or missing
        """
        titles_by_lang = iter(obj.get("title", {}).items())
        title = next(titles_by_lang, missing)
        # Return a {lang: title} dict, or missing so the key is omitted.
        return {title[0]: title[1]} if title is not missing else title

class DataCite43Schema(Schema):
    fundingReferences = fields.Method("get_funding")

    def get_funding(self, obj):
        funding_list = obj["metadata"].get("funding", [])
        return FundingReferenceSchema().dump(funding_list, many=True)

obj = {
    "metadata": {
        "funding": [
            {
                "title": {
                    "en": "My title"
                }
            }
        ]
    }
}
result = DataCite43Schema().dump(obj)
print(f"result: {result}")
# result: {'fundingReferences': [{'title': {'en': 'My title'}}]}

obj = {
    "metadata": {
        "funding": [
            {
                "title": {}
            }
        ]
    }
}
result = DataCite43Schema().dump(obj)
print(f"result: {result}")
# result: {'fundingReferences': [{}]}

obj = {
    "metadata": {
        "funding": [
            {
                "title": {
                    "de": "Mein titel"
                }
            }
        ]
    }
}
result = DataCite43Schema().dump(obj)
print(f"result: {result}")
# result: {'fundingReferences': [{'title': {'de': 'Mein titel'}}]}

ntarocco commented 2 weeks ago

The locale part should NOT be changeable by the user at this time. We implemented it to prepare the data model for the future (and avoid running data transformation/db changes in future upgrades, given that the user profile is a JSON blob) but, in the UI, it should be read-only. I doubt there was any testing done related to locale.

The locale could/should be used for display purposes, but not for registering a DOI: the metadata sent should be complete and should use the exact user input, not the current locale.

utnapischtim commented 2 weeks ago

I think I have found the error, then.

The serializer and deserializer in the FundingField use the defaultLocale set here,

and this is the commit that introduced that change. The link points to react-invenio-deposit because the code was moved to invenio-rdm-records a year ago.