4Science / DSpace

This repository contains the 4Science optimized DSpace & DSpace-CRIS distribution.
https://wiki.lyrasis.org/display/DSPACECRIS/
BSD 3-Clause "New" or "Revised" License

Getting page ranges from Scopus does not work properly #316

Closed olli-gold closed 1 year ago

olli-gold commented 1 year ago

Describe the bug: In DSpace-CRIS 2022.03.00 it is not possible to import page ranges from Scopus into the appropriate fields on the DSpace side. There is a contributor that is supposed to do this, but it is not clear how to use it, and nothing I have tried so far has worked.

To Reproduce: You may use Scopus ID 2-s2.0-85144600378 for testing this.

This is the configuration I have been using in config/spring/api/scopus-integration.xml:

<entry key-ref="scopusPageRangeContrib" value-ref="scopusPageRangeContrib"/>
[...]
    <bean id="scopusPageRangeContrib" class="org.dspace.importer.external.metadatamapping.contributor.PageRangeXPathMetadataContributor">
        <property name="startPageMetadata" ref="scopus.startPage"/>
        <property name="endPageMetadata" ref="scopus.endPage"/>
        <property name="query" value="prism:pageRange"/>
        <property name="prefixToNamespaceMapping" ref="scopusPrism"/>
    </bean>
    <bean id="scopus.startPage" class="org.dspace.importer.external.metadatamapping.MetadataFieldConfig">
        <constructor-arg value="oaire.citation.startPage"/>
    </bean>

    <bean id="scopus.endPage" class="org.dspace.importer.external.metadatamapping.MetadataFieldConfig">
        <constructor-arg value="oaire.citation.endPage"/>
    </bean>

Doing that results in an error 500 once the backend is restarted. The error is caused by the missing field definition for scopusPageRangeContrib (it is not a MetadataFieldConfig, but a PageRangeXPathMetadataContributor).

When I change that to something like this:

        <entry key-ref="scopus.pages" value-ref="scopusPageRangeContrib"/>
[...]
    <bean id="scopusPageRangeContrib" class="org.dspace.importer.external.metadatamapping.contributor.PageRangeXPathMetadataContributor">
        <property name="field" ref="scopus.pages"/>
        <property name="startPageMetadata" ref="scopus.startPage"/>
        <property name="endPageMetadata" ref="scopus.endPage"/>
        <property name="query" value="prism:pageRange"/>
        <property name="prefixToNamespaceMapping" ref="scopusPrism"/>
    </bean>
    <bean id="scopus.startPage" class="org.dspace.importer.external.metadatamapping.MetadataFieldConfig">
        <constructor-arg value="oaire.citation.startPage"/>
    </bean>

    <bean id="scopus.endPage" class="org.dspace.importer.external.metadatamapping.MetadataFieldConfig">
        <constructor-arg value="oaire.citation.endPage"/>
    </bean>

    <bean id="scopus.pages" class="org.dspace.importer.external.metadatamapping.MetadataFieldConfig">
        <constructor-arg value="oaire.citation.pages"/>
    </bean>

This no longer causes a 500 error, but it still does not work properly: the page range is not written to the appropriate fields and, as far as I can tell, the complete range is not imported into oaire.citation.pages either.

Expected behavior: I would expect the contributor to write the values into the appropriate fields, oaire.citation.startPage and oaire.citation.endPage.

Related work: I am not aware of any related tickets or work.

corrad82-4s commented 1 year ago

Hello @olli-gold, I used this configuration:


    <bean id="scopusPageRangeContrib" class="org.dspace.importer.external.metadatamapping.contributor.PageRangeXPathMetadataContributor">
        <property name="field" ref="scopus.startPage"/>
        <property name="startPageMetadata" ref="scopus.startPage"/>
        <property name="endPageMetadata" ref="scopus.endPage"/>
        <property name="query" value="pageRange"/>
        <property name="prefixToNamespaceMapping" ref="scopusPrism"/>
    </bean>

    <bean id="scopus.startPage" class="org.dspace.importer.external.metadatamapping.MetadataFieldConfig">
        <constructor-arg value="oaire.citation.startPage"/>
    </bean>

    <bean id="scopus.endPage" class="org.dspace.importer.external.metadatamapping.MetadataFieldConfig">
        <constructor-arg value="oaire.citation.endPage"/>
    </bean>

It is quite similar to yours, and with it I was able to get the pagination metadata populated. I tried with the same ID you reported in this issue; this was the query performed on Scopus:
`https://api.elsevier.com/content/search/scopus?query=EID(2-s2.0-85144600378)&start=0&count=25` 

Could you test the same query with your Scopus API key and check whether the <pageRange> node is part of the response?

Thank you

olli-gold commented 1 year ago

Thank you, @corrad82-4s - I compared your configuration to mine and got it working now. My mistake was in the configuration setting <property name="field" ref="scopus.startPage"/>. Using a different name there (for instance <property name="field" ref="scopus.pages"/>) does not work, even if it is referenced correctly. So I am closing this ticket, as it was evidently just a matter of configuration. However, I would propose adding this configuration snippet to the default configuration file of DSpace-CRIS, as it is probably not self-explanatory...
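
For later readers, the setup that ended up working in this thread can be collected in one place. This is only a sketch assembled from the snippets above, not a verified default configuration. In particular, the `<entry>` line and its `key-ref="scopus.startPage"` are an assumption inferred from the discussion (the thread only confirms that `field` has to reference `scopus.startPage`), and `scopusPrism` is assumed to be the namespace mapping already defined in config/spring/api/scopus-integration.xml.

```xml
<!-- ASSUMPTION: this entry goes inside the existing Scopus metadata field map.
     The key must be a MetadataFieldConfig bean, the value is the contributor;
     using the contributor itself as the key caused the error 500 reported above. -->
<entry key-ref="scopus.startPage" value-ref="scopusPageRangeContrib"/>

<!-- Contributor configuration reported as working by corrad82-4s. -->
<bean id="scopusPageRangeContrib" class="org.dspace.importer.external.metadatamapping.contributor.PageRangeXPathMetadataContributor">
    <!-- "field" references the same bean as startPageMetadata.
         Referencing a separate bean (scopus.pages) did not work in this thread. -->
    <property name="field" ref="scopus.startPage"/>
    <property name="startPageMetadata" ref="scopus.startPage"/>
    <property name="endPageMetadata" ref="scopus.endPage"/>
    <!-- Query exactly as used in the working configuration. -->
    <property name="query" value="pageRange"/>
    <property name="prefixToNamespaceMapping" ref="scopusPrism"/>
</bean>

<!-- Target metadata fields for the first and last page. -->
<bean id="scopus.startPage" class="org.dspace.importer.external.metadatamapping.MetadataFieldConfig">
    <constructor-arg value="oaire.citation.startPage"/>
</bean>

<bean id="scopus.endPage" class="org.dspace.importer.external.metadatamapping.MetadataFieldConfig">
    <constructor-arg value="oaire.citation.endPage"/>
</bean>
```

After restarting the backend, importing Scopus ID 2-s2.0-85144600378 again should show whether oaire.citation.startPage and oaire.citation.endPage are populated.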