Closed kylie-m closed 7 months ago
Cannot find it. Where? What for?
@peggynewman can you comment further?
dcterms:modified and dcterms:type to appear? I assume it is both fields. Just at the end of the dataset section? No particular preference? https://biocache.ala.org.au/occurrence/798c62b1-9196-49f2-8fc4-c69a59a1d95f
/occurrences/search response unless there is a specific need. Can you please explain this need? https://biocache.ala.org.au/ws/occurrences/search?q=id:798c62b1-9196-49f2-8fc4-c69a59a1d95f
dcterms:modified being in the processed section and the raw section seem valid to me.
lastModifiedTime in the raw section, this is the last load date into SOLR, I think. Also a lastModifiedTime in the processed section, this is the last processed date, from pipelines I think. https://biocache.ala.org.au/ws/occurrence/798c62b1-9196-49f2-8fc4-c69a59a1d95f
cc @nielsklazenga - please comment
dataset section is fine
dcterms:modified and dcterms:type are standard Dublin Core fields and part of the DwC spec. All DwC fields should be delivered if they are provided via the UI and the JSON. Why would they not be?
dcterms:modified is processed by pipelines - looks like it might be, but I'd have to do a big query to work it out
lastModifiedTime - at least the processed occurs after the raw in the below - perhaps the raw is the pipelines processed date, and the processed is the written to solr date. Could you check this? We need to understand what these fields mean.
melb herbarium record
raw: "modified": "2003-12-09 00:00:00"
raw: "firstLoaded": "2019-01-10T14:38:10Z"
raw: "lastModifiedTime": "2023-11-06T13:21:01.170Z"
processed: "modified": "2003-12-09T00:00"
processed: "lastModifiedTime": "2023-11-06T13:26:14.105Z"
inat record
raw: "modified": "2023-10-31T05:40:11Z"
raw: "firstLoaded": "2023-11-05T23:16:57.254Z"
raw: "lastModifiedTime": "2023-11-07T16:08:28.626Z"
processed: "modified": "2023-10-31T05:40:11Z"
processed: "lastModifiedTime": "2023-11-07T16:08:29.159Z"
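To put a number on "the processed occurs after the raw", here is a minimal Java sketch that parses the lastModifiedTime pairs quoted above and reports the gap. The class and method names are made up for illustration and are not part of biocache-service.

```java
import java.time.Duration;
import java.time.Instant;

class LastModifiedGap {

    // Milliseconds from the raw lastModifiedTime to the processed lastModifiedTime.
    // A positive result means the processed timestamp is the later of the two.
    public static long gapMillis(String rawLastModified, String processedLastModified) {
        Instant raw = Instant.parse(rawLastModified);
        Instant processed = Instant.parse(processedLastModified);
        return Duration.between(raw, processed).toMillis();
    }

    public static void main(String[] args) {
        // Timestamps copied from the two records quoted in the thread.
        long herbarium = gapMillis("2023-11-06T13:21:01.170Z", "2023-11-06T13:26:14.105Z");
        long inat = gapMillis("2023-11-07T16:08:28.626Z", "2023-11-07T16:08:29.159Z");

        System.out.println("melb herbarium gap (ms): " + herbarium); // 312935
        System.out.println("inat gap (ms): " + inat);                // 533
    }
}
```

For these two records the processed timestamp trails the raw one by roughly five minutes and by about half a second respectively. That ordering matches what was observed, but it does not by itself say which pipeline step each timestamp belongs to.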
occurrences/search has never been meant to return all information about all records. dcterms:modified and dcterms:type are returned by /occurrence/{uuid} and I see no reason this is not sufficient. I am referring to the API, and therefore the JSON response. If the UI displayed all DwC fields for the occurrences/search result it would look very crowded e.g. https://biocache.ala.org.au/occurrences/search?q=modified%3A*&qualityProfile=ALA&qc=-_nest_parent_%3A*&fq=family%3A%22Aberrapecidae%22#tab_recordsView
dcterms:modified is processed. This is a fast query to check if there are raw_modified values that failed to be processed into modified: https://biocache.ala.org.au/ws/occurrences/search?q=-modified:*&fq=raw_modified:*. We even have an assertion for it, MODIFIED_DATE_INVALID.
lastModifiedTime is as I described https://github.com/AtlasOfLivingAustralia/biocache-service/blob/develop/src/main/java/au/org/ala/biocache/web/OccurrenceController.java#L1640
I agree with @adam-collins. You can query on dcterms:modified and you can also get the data through the offline download, which is all people need to do with it, I think. When I first reported (unofficially and untraceably) this issue the term was indexed but not populated.
I would like dcterms:modified to be displayed on the record detail pages, but that should probably be reported/requested somewhere else.
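For anyone who wants to rerun the check for raw_modified values that never made it into modified, here is a small Java sketch that assembles the same search URL with the parameters percent-encoded. Only the base URL and the q/fq values come from the thread. The class and method names are illustrative, and no request is sent.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class UnprocessedModifiedQuery {

    // Search endpoint as quoted in the thread.
    static final String SEARCH_URL = "https://biocache.ala.org.au/ws/occurrences/search";

    // Builds a search URL from a main query (q) and one filter query (fq),
    // percent-encoding both values.
    public static String buildUrl(String q, String fq) {
        return SEARCH_URL
                + "?q=" + URLEncoder.encode(q, StandardCharsets.UTF_8)
                + "&fq=" + URLEncoder.encode(fq, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Records with no processed modified value but with a raw_modified value.
        System.out.println(buildUrl("-modified:*", "raw_modified:*"));
    }
}
```

The encoded colon (%3A) is the same form that appears in the UI search URL quoted earlier in the thread.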
Ok, we're all good then. I do think it would be useful to have the modified date displayed on the record detail page. I think folks care about when the record was last modified by the data provider.
Re the times:
addInstant(sd, raw, "lastModifiedTime", "lastLoadDate");
addInstant(sd, processed, "lastModifiedTime", "lastProcessedDate");
After some fun afternoon digging it seems that:
lastLoadDate is when the verbatim AVRO was written
lastProcessedDate is when the interpreted AVRO was written
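To make that mapping concrete, here is a hypothetical Java sketch of what a helper like addInstant could do. It is not the biocache-service implementation: the use of plain maps, the parameter names, and the assumption that the index stores these dates as java.util.Date are all guesses for illustration. Only the argument order and the four field names come from the two calls quoted above.

```java
import java.time.Instant;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

class AddInstantSketch {

    // Hypothetical helper: copies the date stored under sourceField in the index
    // document into target under targetKey, formatted as an ISO-8601 instant.
    // Missing or non-date values are skipped.
    public static void addInstant(Map<String, Object> indexDoc, Map<String, Object> target,
                                  String targetKey, String sourceField) {
        Object value = indexDoc.get(sourceField);
        if (value instanceof Date) {
            target.put(targetKey, ((Date) value).toInstant().toString());
        }
    }

    public static void main(String[] args) {
        // Stand-in for the index document (assumed shape), using the herbarium timestamps.
        Map<String, Object> indexDoc = new HashMap<>();
        indexDoc.put("lastLoadDate", Date.from(Instant.parse("2023-11-06T13:21:01.170Z")));
        indexDoc.put("lastProcessedDate", Date.from(Instant.parse("2023-11-06T13:26:14.105Z")));

        Map<String, Object> raw = new HashMap<>();
        Map<String, Object> processed = new HashMap<>();

        // Same key mapping as the two calls in the thread.
        addInstant(indexDoc, raw, "lastModifiedTime", "lastLoadDate");
        addInstant(indexDoc, processed, "lastModifiedTime", "lastProcessedDate");

        System.out.println("raw: " + raw);
        System.out.println("processed: " + processed);
    }
}
```

Under these assumptions, both sections end up with a field named lastModifiedTime even though the values come from two different index fields, which would explain why the JSON shows the same key with different timestamps.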
I'm curious about the difference between the lastModifiedTime values in the JSON response, and how they relate to these terms: https://biocache.ala.org.au/fields?filter=last
Added type and modified to the occurrence page, e.g. https://biocache-test.ala.org.au/occurrences/1348535d-5d2d-4314-852a-f8c991f93d23
modified was excluded intentionally. I do not know the context.
pull requests https://github.com/AtlasOfLivingAustralia/ala-install/pull/724, https://github.com/AtlasOfLivingAustralia/biocache-hubs/pull/585, https://github.com/AtlasOfLivingAustralia/biocache-service/pull/857
Thanks @adam-collins, that is great. I do not know what the reason for keeping these terms off the page could have been. It does not make sense.
Field cannot be obtained through the Occurrence Search API endpoint (https://biocache.ala.org.au/ws/occurrences/search), but it is in the JSON output for the Occurrence Detail endpoint (https://biocache.ala.org.au/ws/occurrences/{recordUuid}; weirdly enough it is in the ‘processed’ object) and can be downloaded via the Offline Download (https://biocache.ala.org.au/ws/occurrences/offline/download).
It would be good if it could be displayed on the Record Detail page (https://biocache.ala.org.au/occurrence/{recordUuid}).