AtlasOfLivingAustralia / biocache-service

Occurrence & mapping webservices
https://biocache-ws.ala.org.au/ws/

Raw values in occurrences have information from processed values #841

Closed by charvolant 3 months ago

charvolant commented 9 months ago

See for example https://api.ala.org.au/occurrences/occurrences/da76bbe0-0539-4051-bf08-9080a9f12775

This record has an invalid name match caused by misprocessing and difficulty parsing the supplied name. However, it also shows a second error: the derived subspecies is inserted into the raw data. This seems to be coming from the service rather than from the SOLR index.

The originally supplied data is:

{
  "id": "NSW316781",
  "coreRowType": "http://rs.tdwg.org/dwc/terms/Occurrence",
  "coreTerms": {
    "http://rs.tdwg.org/dwc/terms/disposition": "in collection",
    "http://rs.tdwg.org/dwc/terms/preparations": "sheet",
    "http://rs.tdwg.org/dwc/terms/country": "AUSTRALIA",
    "http://rs.tdwg.org/dwc/terms/habitat": "In a patch of Eucalyptus nitens regrowth.",
    "http://rs.tdwg.org/dwc/terms/collectionCode": "NSW",
    "http://rs.tdwg.org/dwc/terms/taxonRank": "species",
    "http://rs.tdwg.org/dwc/terms/verbatimCoordinateSystem": "Degrees Minutes",
    "http://rs.tdwg.org/dwc/terms/recordNumber": "461",
    "http://rs.tdwg.org/dwc/terms/locality": "headwaters of Bonang River, 2.5 km W of Gunmark [Goonmirk ?] Road & Errinundra Rd junction along Errinundra Rd, N of road (low side)",
    "http://rs.tdwg.org/dwc/terms/verbatimLatitude": "37 18 S",
    "http://rs.tdwg.org/dwc/terms/basisOfRecord": "PreservedSpecimen",
    "http://rs.tdwg.org/dwc/terms/family": "Myrtaceae",
    "http://purl.org/dc/terms/modified": "2022-01-17T20:41:58",
    "http://rs.tdwg.org/dwc/terms/decimalLatitude": "-37.30",
    "http://rs.tdwg.org/dwc/terms/scientificName": "Eucalyptus `hammonds rd'",
    "http://rs.tdwg.org/dwc/terms/recordedBy": "Chesterfield, E.A.",
    "http://rs.tdwg.org/dwc/terms/stateProvince": "Victoria",
    "http://rs.tdwg.org/dwc/terms/genus": "Eucalyptus",
    "http://rs.tdwg.org/dwc/terms/coordinateUncertaintyInMeters": "10000",
    "http://rs.tdwg.org/dwc/terms/specificEpithet": "`hammonds rd'",
    "http://rs.tdwg.org/dwc/terms/occurrenceID": "NSW:NSW:NSW316781",
    "http://rs.tdwg.org/dwc/terms/eventDate": "1984-05-22",
    "http://rs.tdwg.org/dwc/terms/verbatimTaxonRank": "species",
    "http://rs.tdwg.org/dwc/terms/verbatimLongitude": "148 48 E",
    "http://rs.tdwg.org/dwc/terms/nomenclaturalCode": "ICN",
    "http://rs.tdwg.org/dwc/terms/catalogNumber": "NSW316781",
    "http://rs.tdwg.org/dwc/terms/establishmentMeans": "native",
    "http://rs.tdwg.org/dwc/terms/occurrenceRemarks": "Initially mistaken for Eucalyptus nitens with basal bark very similar to that species, bark on bole with greenish tinge of E. viminalis. This species is reputed (F. Morris - Overseer Orbost district) to cover 30- 40 ha on a flat in the Delegate River, compartment 501, block 3, where it grows with E. radiata.",
    "http://rs.tdwg.org/dwc/terms/reproductiveCondition": "buds|fruits",
    "http://rs.tdwg.org/dwc/terms/decimalLongitude": "148.80",
    "http://rs.tdwg.org/dwc/terms/institutionCode": "NSW",
    "http://rs.tdwg.org/dwc/terms/verbatimCoordinates": "37 18 S, 148 48 E",
    "http://rs.tdwg.org/dwc/terms/occurrenceStatus": "present"
  },
  "extensions": {
    "http://data.ggbn.org/schemas/ggbn/terms/Loan": [],
    "http://rs.gbif.org/terms/1.0/Multimedia": [],
    "http://rs.tdwg.org/dwc/terms/ResourceRelationship": []
  }
}

The information in the SOLR index is:

{
  "responseHeader": {
    "zkConnected": true,
    "status": 0,
    "QTime": 14,
    "params": {
      "q": "id:\"da76bbe0-0539-4051-bf08-9080a9f12775\"",
      "q.op": "OR"
    }
  },
  "response": {
    "numFound": 1,
    "start": 0,
    "maxScore": 7.3651896,
    "numFoundExact": true,
    "docs": [
      {
        "id": "da76bbe0-0539-4051-bf08-9080a9f12775",
        "country": "Australia",
        "raw_eventDate": "1984-05-22",
        "raw_locality": "headwaters of Bonang River, 2.5 km W of Gunmark [Goonmirk ?] Road & Errinundra Rd junction along Errinundra Rd, N of road (low side)",
        "habitat": "In a patch of Eucalyptus nitens regrowth.",
        "point-0.02": "-37.3,148.8",
        "point-0.01": "-37.3,148.8",
        "scientificName": "Eucalyptus pauciflora subsp. debeuzevillei",
        "matchType": "canonicalMatch",
        "lat_long": "-37.3,148.8",
        "geohash": "-37.3,148.8",
        "location": "-37.3,148.8",
        "quad": "-37.3,148.8",
        "packedQuad": "-37.3,148.8",
        "establishmentMeans": "native",
        "raw_stateConservation": "Critically Endangered",
        "type": "PhysicalObject",
        "raw_family": "Myrtaceae",
        "phylumID": "https://id.biodiversity.org.au/taxon/apni/51414458",
        "familyID": "https://id.biodiversity.org.au/taxon/apni/51376810",
        "occurrenceStatus": "PRESENT",
        "catalogNumber": "NSW316781",
        "basisOfRecord": "PRESERVED_SPECIMEN",
        "raw_scientificName": "Eucalyptus `hammonds rd'",
        "taxonConceptID": "https://id.biodiversity.org.au/node/apni/2896227",
        "point-0.1": "-37.3,148.8",
        "modified": "2022-01-17T20:41:58",
        "raw_modified": "2022-01-17T20:41:58",
        "raw_establishmentMeans": "native",
        "reproductiveCondition": "buds|fruits",
        "order": "Myrtales",
        "dataResourceName": "NSW AVH feed",
        "recordNumber": "461",
        "raw_basisOfRecord": "PreservedSpecimen",
        "locality": "headwaters of Bonang River, 2.5 km W of Gunmark [Goonmirk ?] Road & Errinundra Rd junction along Errinundra Rd, N of road (low side)",
        "raw_taxonRank": "species",
        "stateProvince": "Victoria",
        "speciesID": "https://id.biodiversity.org.au/node/apni/2897845",
        "collectionCode": "NSW",
        "point-1": "-37,149",
        "occurrenceID": "NSW:NSW:NSW316781",
        "point-0.0001": "-37.3,148.8",
        "raw_recordedBy": "Chesterfield, E.A.",
        "verbatimLatitude": "37 18 S",
        "license": "CC-BY 4.0 (Int)",
        "dataResourceUid": "dr15861",
        "genus": "Eucalyptus",
        "biome": "TERRESTRIAL",
        "subspecies": "Eucalyptus pauciflora subsp. debeuzevillei",
        "common_name_and_lsid": "Jounama Snow Gum|Eucalyptus pauciflora subsp. debeuzevillei|https://id.biodiversity.org.au/node/apni/2896227|Jounama Snow Gum|Plantae|Myrtaceae",
        "scientificNameAuthorship": "(Maiden) L.A.S.Johnson & Blaxell",
        "taxonRank": "subspecies",
        "raw_coordinateUncertaintyInMeters": "10000",
        "genusID": "https://id.biodiversity.org.au/taxon/apni/51360942",
        "collectionName": "National Herbarium of New South Wales",
        "raw_preparations": "sheet",
        "nameType": "SCIENTIFIC",
        "vernacularName": "Jounama Snow Gum",
        "provenance": "Published dataset",
        "raw_decimalLatitude": "-37.30",
        "institutionCode": "NSW",
        "countryCode": "AU",
        "verbatimLongitude": "148 48 E",
        "class": "Equisetopsida",
        "raw_country": "AUSTRALIA",
        "collectionUid": "co54",
        "raw_genus": "Eucalyptus",
        "nomenclaturalCode": "ICN",
        "raw_decimalLongitude": "148.80",
        "orderID": "https://id.biodiversity.org.au/taxon/apni/51376809",
        "names_and_lsid": "Eucalyptus pauciflora subsp. debeuzevillei|https://id.biodiversity.org.au/node/apni/2896227|Jounama Snow Gum|Plantae|Myrtaceae",
        "point-0.001": "-37.3,148.8",
        "verbatimCoordinateSystem": "Degrees Minutes",
        "geodeticDatum": "EPSG:4326",
        "kingdom": "Plantae",
        "specificEpithet": "`hammonds rd'",
        "raw_occurrenceStatus": "present",
        "classID": "https://id.biodiversity.org.au/taxon/apni/51414457",
        "dataProviderUid": "dp36",
        "disposition": "in collection",
        "phylum": "Charophyta",
        "datePrecision": "DAY",
        "raw_stateProvince": "Victoria",
        "species": "Eucalyptus pauciflora",
        "institutionUid": "in50",
        "dataProviderName": "Australia's Virtual Herbarium",
        "verbatimCoordinates": "37 18 S, 148 48 E",
        "institutionName": "The Royal Botanic Gardens & Domain Trust",
        "subspeciesID": "https://id.biodiversity.org.au/node/apni/2896227",
        "occurrenceRemarks": "Initially mistaken for Eucalyptus nitens with basal bark very similar to that species, bark on bole with greenish tinge of E. viminalis. This species is reputed (F. Morris - Overseer Orbost district) to cover 30- 40 ha on a flat in the Delegate River, compartment 501, block 3, where it grows with E. radiata.",
        "stateConservation": "Critically Endangered",
        "family": "Myrtaceae",
        "kingdomID": "https://id.biodiversity.org.au/taxon/apni/51414459",
        "verbatimTaxonRank": "species",
        "cl10936": "Outer Regional Australia",
        "cl110944": "Remote and Natural Area - Schedule 6, National Parks Act",
        "cl10933": "GIPPSLAND",
        "cl410927": "88020",
        "cl10935": "VICTORIA EXC. MELBOURNE",
        "cl10934": "EAST GIPPSLAND",
        "cl927": "Victoria (including Coastal Waters)",
        "cl310927": "2.4899539112931600000000",
        "cl1058": "South East Coast (Victoria)",
        "cl10930": "East Gippsland",
        "cl1059": "SNOWY RIVER",
        "cl111033": "National Park",
        "cl990": "Atlas of Life in the Coastal Wilderness",
        "cl210927": "79080",
        "cl10903": "Nature Conservation Reserve",
        "cl10900": "Non-Indigenous, Native forest",
        "cl10944": "Brodribb",
        "cl10943": "LATROBE - GIPPSLAND",
        "cl10902": "Eucalypt Tall Open",
        "cl10946": "East Gippsland",
        "cl2013": "Victoria",
        "cl916": "East Gippsland",
        "cl959": "East Gippsland (S)",
        "cl10942": "GIPPSLAND - EAST",
        "cl10941": "ORBOST",
        "cl1048": "South Eastern Highlands",
        "cl1049": "Kybeyan-Gourock",
        "cl11033": "Errinundra",
        "cl510927": "2.7714433898839600000000",
        "cl1918": "Primarily Vegetated Natural & Semi-Natural Terrestrial Vegetation Woody Trees Closed",
        "cl110928": "0.0006297303771610000000",
        "cl23": "E. Gippsland - Orbost",
        "cl22": "Victoria",
        "cl110923": "EAST GIPPSLAND",
        "cl110922": "Legislative Council",
        "cl20": "South East Corner",
        "cl110927": "0.0607374948771430000000",
        "cl110925": "VIC",
        "cl620": "Eucalyptus tall open forest",
        "cl2125": "Eucalypt Tall Open Forests",
        "cl2124": "Eucalyptus (+/- tall) open forest with a dense broad-leaved and/or tree-fern understorey (wet sclerophyll)",
        "cl2049": "GER Great Eastern Ranges Initiative",
        "cl10929": "REST OF VIC.",
        "cl10925": "VICTORIA",
        "cl932": "Australia",
        "cl10928": "20",
        "cl10927": "1929",
        "cl10922": "EASTERN VICTORIA",
        "cl10921": "GIPPSLAND",
        "cl10923": "EAST GIPPSLAND SHIRE",
        "cl1068": "GER National Corridor",
        "cl10000": "Eucalypt Tall Open",
        "cl617": "Eucalypt tall open forests",
        "decimalLongitude": 148.8,
        "decimalLatitude": -37.3,
        "distanceFromExpertDistribution": -1,
        "coordinateUncertaintyInMeters": 10000,
        "el790": 48,
        "el891": 0.89,
        "el890": 15.5,
        "el893": 1196,
        "el870": 9.6,
        "el892": 1.42,
        "el674": 731,
        "el894": 21.9,
        "el872": 15,
        "el875": 15.5,
        "el874": 10.3,
        "el876": 5.2,
        "el879": 22.8,
        "el878": 243,
        "el882": 17,
        "el881": 14.8,
        "el862": 22.3,
        "el883": 0.46,
        "el886": 351,
        "el863": 326,
        "el888": 10.2,
        "el866": 30,
        "el865": 1,
        "el887": 42,
        "el867": 0.5,
        "el889": 243,
        "el10978": 10.55,
        "outlierLayerCount": 0,
        "taxonRankID": 8000,
        "decade": 1980,
        "month": 5,
        "year": 1984,
        "lft": 567705,
        "day": 22,
        "rgt": 567705,
        "firstLoadedDate": "2021-06-26T06:00:59.158Z",
        "lastLoadDate": "2023-10-03T23:08:57.626Z",
        "lastProcessedDate": "2023-10-04T01:43:47.685Z",
        "occurrenceYear": [
          "1984-01-01T00:00:00Z"
        ],
        "occurrence_year": [
          "1984-01-01T00:00:00Z"
        ],
        "eventDate": "1984-05-22T00:00:00Z",
        "isInCluster": false,
        "spatiallyValid": true,
        "defaultValuesUsed": true,
        "preparations": [
          "sheet"
        ],
        "recordedBy": [
          "Chesterfield, E.A."
        ],
        "geospatialIssues": [
          "GEODETIC_DATUM_ASSUMED_WGS84",
          "MISSING_GEODETICDATUM",
          "MISSING_GEOREFERENCE_DATE",
          "MISSING_GEOREFERENCEDBY",
          "MISSING_GEOREFERENCEPROTOCOL",
          "MISSING_GEOREFERENCESOURCES",
          "MISSING_GEOREFERENCEVERIFICATIONSTATUS"
        ],
        "speciesSubgroup": [
          "Dicots",
          "Flowering plants"
        ],
        "speciesListUid": [
          "dr655"
        ],
        "speciesGroup": [
          "Plants",
          "Angiosperms",
          "Dicots"
        ],
        "dataHubUid": [
          "dh9"
        ],
        "assertions": [
          "GEODETIC_DATUM_ASSUMED_WGS84",
          "MISSING_GEODETICDATUM",
          "MISSING_GEOREFERENCE_DATE",
          "MISSING_GEOREFERENCEDBY",
          "MISSING_GEOREFERENCEPROTOCOL",
          "MISSING_GEOREFERENCESOURCES",
          "MISSING_GEOREFERENCEVERIFICATIONSTATUS"
        ],
        "contentTypes": [
          "point occurrence data"
        ],
        "_root_": "da76bbe0-0539-4051-bf08-9080a9f12775"
      }
    ]
  }
}

There is no raw_subspecies in the SOLR document.
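
The absence can be checked mechanically. The following is a minimal Python sketch, assuming the SOLR document has been loaded into a dict. The helper name raw_field_names is hypothetical, and the sample is an abbreviated copy of the document above.

```python
def raw_field_names(solr_doc):
    """Return the names of all raw_-prefixed fields, with the prefix stripped."""
    prefix = "raw_"
    return sorted(key[len(prefix):] for key in solr_doc if key.startswith(prefix))


# Abbreviated copy of the SOLR document shown above.
solr_doc = {
    "raw_scientificName": "Eucalyptus `hammonds rd'",
    "raw_genus": "Eucalyptus",
    "raw_taxonRank": "species",
    "subspecies": "Eucalyptus pauciflora subsp. debeuzevillei",
    "subspeciesID": "https://id.biodiversity.org.au/node/apni/2896227",
}

raw_names = raw_field_names(solr_doc)
# subspecies exists only as a processed field, so it is not among the raw names.
print("subspecies" in raw_names)
```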

The data returned by the API call, with assertions removed for brevity, is:

{
  "raw": {
    "rowKey": "da76bbe0-0539-4051-bf08-9080a9f12775",
    "uuid": "da76bbe0-0539-4051-bf08-9080a9f12775",
    "occurrence": {
      "establishmentMeans": "native",
      "catalogNumber": "NSW316781",
      "occurrenceStatus": "present",
      "basisOfRecord": "PreservedSpecimen",
      "modified": "2022-01-17T20:41:58",
      "reproductiveCondition": "buds|fruits",
      "recordNumber": "461",
      "collectionCode": "NSW",
      "occurrenceID": "NSW:NSW:NSW316781",
      "preparations": "sheet",
      "institutionCode": "NSW",
      "disposition": "in collection",
      "recordedBy": "Chesterfield, E.A.",
      "occurrenceRemarks": "Initially mistaken for Eucalyptus nitens with basal bark very similar to that species, bark on bole with greenish tinge of E. viminalis. This species is reputed (F. Morris - Overseer Orbost district) to cover 30- 40 ha on a flat in the Delegate River, compartment 501, block 3, where it grows with E. radiata.",
      "stateConservation": "Critically Endangered"
    },
    "classification": {
      "scientificName": "Eucalyptus `hammonds rd'",
      "genus": "Eucalyptus",
      "subspecies": "Eucalyptus pauciflora subsp. debeuzevillei",
      "taxonRank": "species",
      "nomenclaturalCode": "ICN",
      "specificEpithet": "`hammonds rd'",
      "subspeciesID": "https://id.biodiversity.org.au/node/apni/2896227",
      "family": "Myrtaceae",
      "verbatimTaxonRank": "species"
    },
    "location": {
      "country": "AUSTRALIA",
      "habitat": "In a patch of Eucalyptus nitens regrowth.",
      "decimalLatitude": "-37.30",
      "terrestrial": true,
      "locality": "headwaters of Bonang River, 2.5 km W of Gunmark [Goonmirk ?] Road & Errinundra Rd junction along Errinundra Rd, N of road (low side)",
      "decimalLongitude": "148.80",
      "stateProvince": "Victoria",
      "verbatimLatitude": "37 18 S",
      "coordinateUncertaintyInMeters": "10000",
      "marine": false,
      "verbatimLongitude": "148 48 E",
      "verbatimCoordinateSystem": "Degrees Minutes"
    },
    "event": {
      "eventDate": "1984-05-22"
    },
    "attribution": {
      "dataResourceUid": "dr15861",
      "dataHubUid": [
        "dh9"
      ]
    },
    "identification": {},
    "measurement": {},
    "assertions": [
      "GEODETIC_DATUM_ASSUMED_WGS84",
      "MISSING_GEODETICDATUM",
      "MISSING_GEOREFERENCE_DATE",
      "MISSING_GEOREFERENCEDBY",
      "MISSING_GEOREFERENCEPROTOCOL",
      "MISSING_GEOREFERENCESOURCES",
      "MISSING_GEOREFERENCEVERIFICATIONSTATUS"
    ],
    "miscProperties": {},
    "queryAssertions": {},
    "defaultValuesUsed": true,
    "spatiallyValid": true,
    "geospatiallyKosher": true,
    "taxonomicallyKosher": "",
    "deleted": false,
    "firstLoaded": "2021-06-26T06:00:59.158Z",
    "dateDeleted": "",
    "lastModifiedTime": "2023-10-03T23:08:57.626Z"
  },
  "processed": {
    "rowKey": "da76bbe0-0539-4051-bf08-9080a9f12775",
    "uuid": "da76bbe0-0539-4051-bf08-9080a9f12775",
    "occurrence": {
      "establishmentMeans": "native",
      "occurrenceStatus": "PRESENT",
      "basisOfRecord": "PRESERVED_SPECIMEN",
      "modified": "2022-01-17T20:41:58",
      "recordedBy": [
        "Chesterfield, E.A."
      ],
      "stateConservation": "Critically Endangered"
    },
    "classification": {
      "scientificName": "Eucalyptus pauciflora subsp. debeuzevillei",
      "matchType": "canonicalMatch",
      "phylumID": "https://id.biodiversity.org.au/taxon/apni/51414458",
      "familyID": "https://id.biodiversity.org.au/taxon/apni/51376810",
      "taxonConceptID": "https://id.biodiversity.org.au/node/apni/2896227",
      "order": "Myrtales",
      "taxonRankID": 8000,
      "speciesID": "https://id.biodiversity.org.au/node/apni/2897845",
      "genus": "Eucalyptus",
      "left": 567705,
      "scientificNameAuthorship": "(Maiden) L.A.S.Johnson & Blaxell",
      "taxonRank": "subspecies",
      "genusID": "https://id.biodiversity.org.au/taxon/apni/51360942",
      "nameType": "SCIENTIFIC",
      "vernacularName": "Jounama Snow Gum",
      "orderID": "https://id.biodiversity.org.au/taxon/apni/51376809",
      "right": 567705,
      "kingdom": "Plantae",
      "classID": "https://id.biodiversity.org.au/taxon/apni/51414457",
      "phylum": "Charophyta",
      "classs": "Equisetopsida",
      "species": "Eucalyptus pauciflora",
      "family": "Myrtaceae",
      "kingdomID": "https://id.biodiversity.org.au/taxon/apni/51414459"
    },
    "location": {
      "country": "Australia",
      "decimalLatitude": -37.3,
      "terrestrial": true,
      "locality": "headwaters of Bonang River, 2.5 km W of Gunmark [Goonmirk ?] Road & Errinundra Rd junction along Errinundra Rd, N of road (low side)",
      "decimalLongitude": 148.8,
      "stateProvince": "Victoria",
      "biome": "TERRESTRIAL",
      "coordinateUncertaintyInMeters": 10000,
      "marine": false,
      "countryCode": "AU",
      "geodeticDatum": "EPSG:4326",
      "verbatimCoordinates": "37 18 S, 148 48 E"
    },
    "event": {
      "year": 1984,
      "month": 5,
      "datePrecision": "DAY",
      "day": 22,
      "eventDate": "1984-05-22"
    },
    "attribution": {
      "dataResourceName": "NSW AVH feed",
      "collectionName": "National Herbarium of New South Wales",
      "license": "CC-BY 4.0 (Int)",
      "dataProviderUid": "dp36",
      "provenance": "Published dataset",
      "dataResourceUid": "dr15861",
      "institutionUid": "in50",
      "dataProviderName": "Australia's Virtual Herbarium",
      "institutionName": "The Royal Botanic Gardens & Domain Trust",
      "collectionUid": "co54"
    },
    "identification": {},
    "measurement": {},
    "miscProperties": {},
    "queryAssertions": {},
    "geospatiallyKosher": true,
    "taxonomicallyKosher": "",
    "deleted": false,
    "dateDeleted": "",
    "lastModifiedTime": "2023-10-04T01:43:47.685Z",
    "el": {
      "el790": 48,
      "el891": 0.89,
      "el890": 15.5,
      "el893": 1196,
      "el870": 9.6,
      "el892": 1.42,
      "el674": 731,
      "el894": 21.9,
      "el872": 15,
      "el875": 15.5,
      "el874": 10.3,
      "el876": 5.2,
      "el879": 22.8,
      "el878": 243,
      "el882": 17,
      "el881": 14.8,
      "el862": 22.3,
      "el883": 0.46,
      "el886": 351,
      "el863": 326,
      "el888": 10.2,
      "el866": 30,
      "el865": 1,
      "el887": 42,
      "el867": 0.5,
      "el889": 243,
      "el10978": 10.55
    },
    "cl": {
      "cl10936": "Outer Regional Australia",
      "cl110944": "Remote and Natural Area - Schedule 6, National Parks Act",
      "cl10933": "GIPPSLAND",
      "cl410927": "88020",
      "cl10935": "VICTORIA EXC. MELBOURNE",
      "cl10934": "EAST GIPPSLAND",
      "cl927": "Victoria (including Coastal Waters)",
      "cl310927": "2.4899539112931600000000",
      "cl1058": "South East Coast (Victoria)",
      "cl10930": "East Gippsland",
      "cl1059": "SNOWY RIVER",
      "cl111033": "National Park",
      "cl990": "Atlas of Life in the Coastal Wilderness",
      "cl210927": "79080",
      "cl10903": "Nature Conservation Reserve",
      "cl10900": "Non-Indigenous, Native forest",
      "cl10944": "Brodribb",
      "cl10943": "LATROBE - GIPPSLAND",
      "cl10902": "Eucalypt Tall Open",
      "cl10946": "East Gippsland",
      "cl2013": "Victoria",
      "cl916": "East Gippsland",
      "cl959": "East Gippsland (S)",
      "cl10942": "GIPPSLAND - EAST",
      "cl10941": "ORBOST",
      "cl1048": "South Eastern Highlands",
      "cl1049": "Kybeyan-Gourock",
      "cl11033": "Errinundra",
      "cl510927": "2.7714433898839600000000",
      "cl1918": "Primarily Vegetated Natural & Semi-Natural Terrestrial Vegetation Woody Trees Closed",
      "cl110928": "0.0006297303771610000000",
      "cl23": "E. Gippsland - Orbost",
      "cl22": "Victoria",
      "cl110923": "EAST GIPPSLAND",
      "cl110922": "Legislative Council",
      "cl20": "South East Corner",
      "cl110927": "0.0607374948771430000000",
      "cl110925": "VIC",
      "cl620": "Eucalyptus tall open forest",
      "cl2125": "Eucalypt Tall Open Forests",
      "cl2124": "Eucalyptus (+/- tall) open forest with a dense broad-leaved and/or tree-fern understorey (wet sclerophyll)",
      "cl2049": "GER Great Eastern Ranges Initiative",
      "cl10929": "REST OF VIC.",
      "cl10925": "VICTORIA",
      "cl932": "Australia",
      "cl10928": "20",
      "cl10927": "1929",
      "cl10922": "EASTERN VICTORIA",
      "cl10921": "GIPPSLAND",
      "cl10923": "EAST GIPPSLAND SHIRE",
      "cl1068": "GER National Corridor",
      "cl10000": "Eucalypt Tall Open",
      "cl617": "Eucalypt tall open forests"
    }
  },
...

raw.classification.subspecies and raw.classification.subspeciesID contain values not in the original data.
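
One way to confirm this is to compare the field names in a raw group against the supplied Darwin Core terms. The following is a minimal Python sketch using abbreviated copies of the data above. The helper names are hypothetical and are not part of biocache-service.

```python
def local_name(term_uri):
    """Reduce a term URI such as http://rs.tdwg.org/dwc/terms/genus to 'genus'."""
    return term_uri.rsplit("/", 1)[-1]


def unsupplied_raw_fields(core_terms, raw_group):
    """List fields present in a raw group that were never supplied."""
    supplied = {local_name(term) for term in core_terms}
    return sorted(field for field in raw_group if field not in supplied)


# Abbreviated copy of the supplied coreTerms (classification terms only).
core_terms = {
    "http://rs.tdwg.org/dwc/terms/scientificName": "Eucalyptus `hammonds rd'",
    "http://rs.tdwg.org/dwc/terms/genus": "Eucalyptus",
    "http://rs.tdwg.org/dwc/terms/taxonRank": "species",
    "http://rs.tdwg.org/dwc/terms/nomenclaturalCode": "ICN",
    "http://rs.tdwg.org/dwc/terms/specificEpithet": "`hammonds rd'",
    "http://rs.tdwg.org/dwc/terms/family": "Myrtaceae",
    "http://rs.tdwg.org/dwc/terms/verbatimTaxonRank": "species",
}

# raw.classification as returned by the API call above.
raw_classification = {
    "scientificName": "Eucalyptus `hammonds rd'",
    "genus": "Eucalyptus",
    "subspecies": "Eucalyptus pauciflora subsp. debeuzevillei",
    "taxonRank": "species",
    "nomenclaturalCode": "ICN",
    "specificEpithet": "`hammonds rd'",
    "subspeciesID": "https://id.biodiversity.org.au/node/apni/2896227",
    "family": "Myrtaceae",
    "verbatimTaxonRank": "species",
}

extra = unsupplied_raw_fields(core_terms, raw_classification)
print(extra)
```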

adam-collins commented 8 months ago

It has been fine like this for a long time and I do worry such a change will break something in biocache-hubs.

Not to mention that if a raw field is absent the non-raw field is used intentionally in the RAW section.
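
To illustrate how such a fallback can put processed values into the raw section, here is a minimal Python sketch. It is an assumption about the behaviour described, not the biocache-service code. The function build_raw_group and the field list are hypothetical.

```python
def build_raw_group(solr_doc, field_names):
    """Prefer raw_<name>; fall back to the non-raw field when raw_<name> is absent."""
    group = {}
    for name in field_names:
        raw_key = "raw_" + name
        if raw_key in solr_doc:
            group[name] = solr_doc[raw_key]
        elif name in solr_doc:
            # Fallback: a processed value ends up in the raw group.
            group[name] = solr_doc[name]
    return group


# Abbreviated copy of the SOLR document from the issue description.
solr_doc = {
    "raw_scientificName": "Eucalyptus `hammonds rd'",
    "scientificName": "Eucalyptus pauciflora subsp. debeuzevillei",
    "subspecies": "Eucalyptus pauciflora subsp. debeuzevillei",
}

raw_classification = build_raw_group(
    solr_doc, ["scientificName", "subspecies", "infraspecificEpithet"]
)
print(raw_classification)
```

With this fallback, scientificName keeps its supplied value because raw_scientificName exists, while subspecies picks up the derived value because there is no raw_subspecies to prefer.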

It is more useful to deprecate this output format and version a format consistent with the download format, i.e. a format capable of listing all fields in a flat structure that can reference index/fields for further information.
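
As a rough illustration of what a flat listing could look like, the following Python sketch collapses the nested raw/processed structure into one row per field. The row layout is hypothetical and is not the download format itself.

```python
def flatten_record(record):
    """Collapse {"raw": {group: {...}}, "processed": {group: {...}}} into flat rows."""
    rows = {}
    for section in ("raw", "processed"):
        for group, fields in record.get(section, {}).items():
            if not isinstance(fields, dict):
                continue  # skip scalars and lists such as uuid or assertions
            for name, value in fields.items():
                row = rows.setdefault(name, {"raw": None, "processed": None})
                row[section] = value
    return rows


# Abbreviated copy of the API response from the issue description.
record = {
    "raw": {
        "uuid": "da76bbe0-0539-4051-bf08-9080a9f12775",
        "classification": {
            "scientificName": "Eucalyptus `hammonds rd'",
            "taxonRank": "species",
        },
    },
    "processed": {
        "uuid": "da76bbe0-0539-4051-bf08-9080a9f12775",
        "classification": {
            "scientificName": "Eucalyptus pauciflora subsp. debeuzevillei",
            "taxonRank": "subspecies",
            "vernacularName": "Jounama Snow Gum",
        },
    },
}

rows = flatten_record(record)
for name, values in sorted(rows.items()):
    print(name, values["raw"], values["processed"], sep=" | ")
```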

Of course we could ignore everything else that is inconsistent and just fix these 2 fields, and do all of this again the next time someone raises an issue about any of the other inconsistencies.

adam-collins commented 7 months ago

Only moving subspecies and subspeciesID. Pull request: https://github.com/AtlasOfLivingAustralia/biocache-service/pull/864

peggynewman commented 3 months ago

Nefarious processed subspecies content is no longer appearing in raw: https://biocache-ws-test.ala.org.au/ws/occurrence/da76bbe0-0539-4051-bf08-9080a9f12775

It would be nice to review all of the raw/processed data fields, and we are likely to hit on this at some stage soon. Happy to leave it at this. @nielsklazenga, with the Darwin Core Compliance work, we should add that we want to review what comprises raw/processed values.