Open utnapischtim opened 1 month ago
Yes, approach 3) makes sense to me. To also account for the `missing` side bug and the possible absence of any title and/or lang, and to split up some responsibilities in that file a bit more, the following pattern can be extended:
from marshmallow import Schema, fields, missing


class FundingReferenceSchema(Schema):
    # fill with other fields ...
    awardTitle = fields.Method("get_title", data_key="title")

    def get_title(self, obj):
        """Serialize title

        :param obj: A metadata.funding dict
        :type obj: dict
        :return: serialized funding title
        :rtype: dict or missing
        """
        titles_by_lang = (e for e in obj.get("title", {}).items())
        # Take the first (lang, title) pair, whatever the language is
        title = next(titles_by_lang, missing)
        # Build a dict, not a set; return `missing` so the field is skipped
        return {title[0]: title[1]} if title is not missing else title


class DataCite43Schema(Schema):
    fundingReferences = fields.Method("get_funding")

    def get_funding(self, obj):
        funding_list = obj["metadata"].get("funding", [])
        return FundingReferenceSchema().dump(funding_list, many=True)


obj = {
    "metadata": {
        "funding": [
            {
                "title": {
                    "en": "My title"
                }
            }
        ]
    }
}
result = DataCite43Schema().dump(obj)
print(f"result: {result}")
# result: {'fundingReferences': [{'title': {'en': 'My title'}}]}

obj = {
    "metadata": {
        "funding": [
            {
                "title": {}
            }
        ]
    }
}
result = DataCite43Schema().dump(obj)
print(f"result: {result}")
# result: {'fundingReferences': [{}]}

obj = {
    "metadata": {
        "funding": [
            {
                "title": {
                    "de": "Mein titel"
                }
            }
        ]
    }
}
result = DataCite43Schema().dump(obj)
print(f"result: {result}")
# result: {'fundingReferences': [{'title': {'de': 'Mein titel'}}]}
The `locale` part should NOT be changeable by the user at this time.
We implemented it to prepare the data model for the future (and avoid running data transformation/db changes in future upgrades, given that the user profile is a JSON blob) but, in the UI, it should be read-only.
I doubt there was any testing done related to locale.
The locale could/should be used for display purposes, but not for registering a DOI: the metadata sent should be complete and should use the exact user input, not the current locale.
I think I found the error, then.
The serializer and deserializer in the `FundingField` use the `defaultLocale` set here, and this is the commit that introduced that change. The link points to react-invenio-deposit because the code was moved to invenio-rdm-records a year ago.
Package version (if known): current
Describe the bug
If a user sets `Preferences Locale` to a non-`en` value, the worker can't create valid JSON and the DOI registration on DataCite fails.
Steps to Reproduce
Set `Preferences locale` to a non-`en` value, e.g. `de`.
Expected behavior
The DOI will be registered.
Problem
The `Preferences locale` setting changes the default language value that is used to store the award title. It should look like `{'title': {'en': 'fasr'}, 'number': '2345'}` but it looks like `{'title': {'de': 'fasr'}, 'number': '2345'}`. This is then a problem here: the award title cannot be found because there is no `en` in `title`.
To be fair, this could still create a valid object, but it doesn't: the nested `missing` is not removed, and therefore the `json.dumps` here fails with `TypeError: Object of type _Missing is not JSON serializable`.
solutions
1.) Using `current_i18n.locale`, which is used for the default locale here, is not possible because the record dump that registers the DOI is done in the worker, which is not aware of the user settings.
2.) Remove the feature that makes the award title language-aware and change the serialize function accordingly.
3.) Change the serializer to
funding_ref["awardTitle"] = [award_title for _, award_title in award.get("title", {}).items()][0]
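One caveat with option 3 as written: the `[0]` index raises `IndexError` when an award has no title at all. A minimal sketch of a variant that tolerates this, assuming it is acceptable to take whichever translation is stored first (the helper name `first_award_title` is hypothetical and not part of invenio-rdm-records):

```python
def first_award_title(award):
    """Return the first stored award title regardless of its language key.

    Hypothetical helper for illustration. Returns None when the award
    has no title, so the caller can skip the field entirely.
    """
    titles = award.get("title") or {}
    return next(iter(titles.values()), None)


award = {"title": {"de": "fasr"}, "number": "2345"}
funding_ref = {}

title = first_award_title(award)
if title is not None:
    # Only set the key when a title exists, so no placeholder leaks out
    funding_ref["awardTitle"] = title

print(funding_ref)  # {'awardTitle': 'fasr'}
```

Skipping the key when there is no title also avoids leaving a placeholder value in the payload.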
side bugs
On local development, if the developer stops the instance and starts it again, the error is gone because the instance loses the `preference locale` setting, which seems to be a side bug.
The nested `_missing` should be removed so that `json.dumps` does not fail.
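To illustrate the second side bug: `json.dumps` raises `TypeError` for any object it cannot encode, which is what happens with a leftover marshmallow `missing`. The sketch below is only an illustration under assumptions. It uses a locally defined stand-in class `_Missing` so that it runs without marshmallow, and a hypothetical `strip_missing` helper that drops such values recursively before dumping. The payload keys are illustrative.

```python
import json


class _Missing:
    """Local stand-in for marshmallow's `missing` sentinel (assumption)."""


missing = _Missing()


def strip_missing(value):
    """Recursively drop dict entries and list items that are `missing`.

    Hypothetical helper, shown only to illustrate the idea.
    """
    if isinstance(value, dict):
        return {k: strip_missing(v) for k, v in value.items() if v is not missing}
    if isinstance(value, list):
        return [strip_missing(v) for v in value if v is not missing]
    return value


payload = {"fundingReferences": [{"awardNumber": "2345", "awardTitle": missing}]}

try:
    json.dumps(payload)
except TypeError as err:
    print(err)  # Object of type _Missing is not JSON serializable

print(json.dumps(strip_missing(payload)))
# {"fundingReferences": [{"awardNumber": "2345"}]}
```

Returning `missing` from a marshmallow `fields.Method` (as in the schema pattern proposed in the comments) avoids the problem at the source, since marshmallow skips such fields during `dump`.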