gbif / portal-feedback

User feedback for the GBIF API, website and published data. You can ask questions here. 🗨❓

The information about the author is imported from Naturgucker incorrectly. #5588

Open gbif-portal opened 1 week ago

gbif-portal commented 1 week ago

The information about the author is imported from Naturgucker incorrectly. E.g., in the Darwin Core Archive data (DOI 10.15468/dl.gufn9h), the "recordedBy" field for this occurrence contains a numeric value, 2047627890, and the "rightsHolder" field is empty. However, the name of the photo's author can easily be found on NaturGucker itself.

Note that the corresponding links (https://nabu-naturgucker.de/?sprache=en&bild=1960016743 , https://nabu-naturgucker.de/?sprache=en&bild=-1517925867 ) are broken. However, the photos can be found manually, and their authorship is stated there. E.g., https://naturgucker.de/?bild=1960016743 shows the name "Brigit Kurth".

This problem also affects other occurrences imported from NaturGucker.
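To see how widespread the pattern is in a download, something like the following could be used. It is only a minimal sketch: it assumes a tab-separated, unquoted occurrence table with `gbifID`, `recordedBy` and `rightsHolder` columns, and the helper name `find_numeric_observers` and the second sample row are made up for illustration.

```python
import csv
import io
import re

# A value made up only of digits, optionally with a leading minus sign.
NUMERIC_ID = re.compile(r"-?\d+")


def find_numeric_observers(tsv_text):
    """Return gbifIDs of rows with a purely numeric recordedBy and an empty rightsHolder."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    flagged = []
    for row in reader:
        recorded_by = (row.get("recordedBy") or "").strip()
        rights_holder = (row.get("rightsHolder") or "").strip()
        if NUMERIC_ID.fullmatch(recorded_by) and not rights_holder:
            flagged.append(row["gbifID"])
    return flagged


# Sample data: the first row uses the values quoted in this issue,
# the second row is invented purely for contrast.
SAMPLE = (
    "gbifID\trecordedBy\trightsHolder\n"
    "4175446464\t2047627890\t\n"
    "1\tJane Doe\tJane Doe\n"
)

print(find_numeric_observers(SAMPLE))
```

For a real download, the text of the occurrence file would be passed in instead of `SAMPLE`.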


System: Chrome 131.0.0 / Linux 0.0.0
Referer: https://www.gbif.org/occurrence/4175446464
Window size: width 1920 - height 919
System health at time of feedback: OPERATIONAL
datasetKey: 6ac3f774-d9fb-4796-b3e9-92bf6c81c084
publishingOrgKey: bb646dff-a905-4403-a49b-6d378c2cf0d9

Node handles: @jholetschek

MortenHofft commented 1 week ago

I'm not part of the ingestion pipeline, but the fragment looks like the one below. I believe that is supposed to be the raw data from the published archive. It contains only a numeric agent and no rights holder, in case that helps with debugging the issue.

I also downloaded the archive listed on the dataset. Without knowing ABCD well, it looks to me like all agents are listed as numeric values only there as well. Could the issue be with the archive? Or is it just me not knowing how to read it?

<Unit>
  <SourceInstitutionID>naturgucker</SourceInstitutionID>
  <SourceID>naturgucker</SourceID>
  <UnitID>-2091724221</UnitID>
  <UnitIDNumeric>-2091724221</UnitIDNumeric>
  <Identifications>
    <Identification>
      <Result>
        <TaxonIdentified>
          <HigherTaxa>
            <HigherTaxon>
              <HigherTaxonName>Insecta</HigherTaxonName>
              <HigherTaxonRank>classis</HigherTaxonRank>
            </HigherTaxon>
            <HigherTaxon>
              <HigherTaxonName>Tettigoniidae</HigherTaxonName>
              <HigherTaxonRank>familia</HigherTaxonRank>
            </HigherTaxon>
            <HigherTaxon>
              <HigherTaxonName>Animalia</HigherTaxonName>
              <HigherTaxonRank>regnum</HigherTaxonRank>
            </HigherTaxon>
            <HigherTaxon>
              <HigherTaxonName>Orthoptera</HigherTaxonName>
              <HigherTaxonRank>ordo</HigherTaxonRank>
            </HigherTaxon>
          </HigherTaxa>
          <ScientificName>
            <FullScientificNameString>Decticus verrucivorus</FullScientificNameString>
          </ScientificName>
        </TaxonIdentified>
      </Result>
    </Identification>
  </Identifications>
  <RecordBasis>HumanObservation</RecordBasis>
  <MultiMediaObjects>
    <MultiMediaObject>
      <FileURI>https://live.staticflickr.com/65535/50388654141_0ff1852064_b.jpg</FileURI>
      <ProductURI>https://nabu-naturgucker.de/?sprache=en&amp;bild=1960016743</ProductURI>
      <Format>image/jpeg</Format>
    </MultiMediaObject>
    <MultiMediaObject>
      <FileURI>https://live.staticflickr.com/65535/50387961423_27ba76a6a1_b.jpg</FileURI>
      <ProductURI>https://nabu-naturgucker.de/?sprache=en&amp;bild=-1517925867</ProductURI>
      <Format>image/jpeg</Format>
    </MultiMediaObject>
  </MultiMediaObjects>
  <Gathering>
    <DateTime>
      <ISODateTimeBegin>2020-09-06T00:00:00</ISODateTimeBegin>
    </DateTime>
    <Agents>
      <GatheringAgent>
        <AgentText>2047627890</AgentText>
      </GatheringAgent>
    </Agents>
    <LocalityText>Mont Aigoual (Cev30)</LocalityText>
    <Country>
      <ISO3166Code>FR</ISO3166Code>
    </Country>
    <WMSURL>http://www.enjoynature.net/?gebiet=605024048</WMSURL>
    <SiteCoordinateSets>
      <SiteCoordinates>
        <CoordinatesLatLong>
          <LongitudeDecimal>3.57469797134</LongitudeDecimal>
          <LatitudeDecimal>44.1220054626</LatitudeDecimal>
          <CoordinateErrorDistanceInMeters>250</CoordinateErrorDistanceInMeters>
        </CoordinatesLatLong>
      </SiteCoordinates>
    </SiteCoordinateSets>
  </Gathering>
</Unit>
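
For anyone who wants to check other units the same way, here is a rough sketch of pulling the agent out of such a fragment. It assumes the fragment is a standalone `<Unit>` element like the one above; the helper names are made up, and tags are matched by local name so that a namespaced ABCD document should work too.

```python
import re
import xml.etree.ElementTree as ET


def local_name(tag):
    """Reduce a namespaced tag such as '{http://...}AgentText' to 'AgentText'."""
    return tag.rsplit("}", 1)[-1]


def gathering_agents(unit_xml):
    """Return the text of every AgentText element found in an ABCD Unit fragment."""
    root = ET.fromstring(unit_xml)
    return [
        (el.text or "").strip()
        for el in root.iter()
        if local_name(el.tag) == "AgentText"
    ]


def looks_like_identifier(value):
    """True if the value is only digits (optionally signed), i.e. not a person's name."""
    return re.fullmatch(r"-?\d+", value) is not None


# Trimmed copy of the Unit shown above, keeping only the agent-related elements.
FRAGMENT = """
<Unit>
  <UnitID>-2091724221</UnitID>
  <Gathering>
    <Agents>
      <GatheringAgent>
        <AgentText>2047627890</AgentText>
      </GatheringAgent>
    </Agents>
  </Gathering>
</Unit>
"""

for agent in gathering_agents(FRAGMENT):
    print(agent, looks_like_identifier(agent))
```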
jholetschek commented 1 week ago

It is intentional that the name of the observer is not provided to GBIF. The name is shown on Naturgucker, but the curators chose not to publish it to GBIF for privacy reasons.

We do plan to add a copyright statement for the images. I'm currently waiting for the maintainer of BioCASe to add the respective ABCD field.