I'm not sure whether this counts as a bug or is rather a feature request. I'll file it as a bug for now.
Describe the bug
With the default configuration of DSpace-CRIS 2022.03.01 for the Scopus import, the subject keywords are imported as a single string in which the individual keywords are separated by a pipe symbol (|).
While trying to solve that, we noticed problems with the SplitMetadataContributor in the Scopus import scenario (we discussed this on Slack recently). The goal is to split the pipe-separated subject keywords from Scopus into separate fields. This should be possible with the SplitMetadataContributor in scopus-integration.xml. Unfortunately, it does not work with the SimpleXpathMetadatumContributor as the innerContributor.
bibtex-integration.xml contains a working example of the SplitMetadataContributor (https://github.com/4Science/DSpace/blob/dspace-cris-7/dspace/config/spring/api/bibtex-integration.xml#L43-L51), but that one uses a plain SimpleMetadataContributor as the innerContributor. It addresses the target field in a different way and, in contrast to the SimpleXpathMetadatumContributor, does not require an additional bean for the mapping.
To Reproduce
Steps to reproduce the behavior:
1. Use the configuration snippet below in scopus-integration.xml
2. Import a record with multiple subject keywords (for example Scopus ID 2-s2.0-18644372692)
3. Nothing happens: the keywords from Scopus are not imported
My best guess is that this cannot work because the field mapping is done within a separate bean, which is not visible to the innerContributor of scopusAuthkeywordsContrib.
Expected behavior
I would expect each subject keyword to end up in its own dc.subject field.
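To make the expected result concrete, here is a minimal standalone sketch of the split itself. It does not use any DSpace classes: `KeywordSplitSketch`, `splitKeywords` and the sample keyword string are hypothetical names and data chosen for illustration, and the regex is only an assumption about what a suitable split pattern could look like. Note that the pipe has to be escaped, since an unescaped `|` is the alternation operator in a Java regex.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: hypothetical helper, not part of DSpace or DSpace-CRIS.
public class KeywordSplitSketch {

    // Splits one pipe-delimited keyword string into separate, trimmed values.
    static List<String> splitKeywords(String raw) {
        List<String> values = new ArrayList<>();
        // "\\s*\\|\\s*" matches a literal pipe plus any surrounding whitespace.
        for (String part : raw.trim().split("\\s*\\|\\s*")) {
            if (!part.isEmpty()) {
                values.add(part);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Made-up sample value in the shape the Scopus import currently produces.
        String imported = "Open access | Repositories | Metadata";
        for (String keyword : splitKeywords(imported)) {
            // Each keyword would ideally become its own dc.subject value.
            System.out.println("dc.subject = " + keyword);
        }
    }
}
```

With the sample string above, the sketch yields three separate values instead of one combined string, which is the outcome I would expect from the import.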
Related work
I am not aware of any related PR about this.