dagendresen closed this issue 2 years ago
I downloaded the DwCA directly from the NBIC IPT and grepped for the example record, which reproduces the same error.
grep 5aac5999-9c4c-4f80-afab-bdf6636727f6 ./occurrence.txt > occ.txt
urn:uuid:5aac5999-9c4c-4f80-afab-bdf6636727f6 2015-01-22 19:05:00.0000000 218 nof so2-birds NØF-vannfugltellinger HumanObservation urn:uuid:5aac5999-9c4c-4f80-afab-bdf6636727f6 7795618
Nordre ??yeren fuglestasjon
|Erling Hobøl 19 present NOF/SO-Birds/14517042 https://www.artsobservasjoner.no/Sighting/7795618 10.5/15.0 2015 1 7 Fine obs.forhold, lavt skydekke, mildt, is i vikene og mye av Nitelva.. 456079 NORWAY Viken Viken Rælingen Årnestangen (sørspiss med oml. grunner), Nordre Øyeren Naturreservat, Rælingen, Vi 59.87099868 11.13782493 1Cygnus cygnus Animalia Chordata Aves Anseriformes Anatidae Cygnus (Linnaeus, 1758)
I have also tried recoding the grep result, but without finding the cause of the error.
iconv -f UTF-8 -t ISO-8859-1 ./occ.txt
iconv -f ISO-8859-1 -t UTF-8 ./occ.txt
recode UTF8..ISO-8859-1 ./occ.txt
recode ISO-8859-1..UTF8 ./occ.txt
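A quick way to check whether recoding can help at all is to look at the raw bytes. The sketch below uses a hypothetical helper, hex_bytes (not part of any tool mentioned here), to compare a correctly encoded UTF-8 "Ø" with the two question marks seen in the record. If occ.txt contains the bytes 3f 3f at that position, the original character was already replaced before the file was written, and no iconv or recode round trip on the file would be able to restore it.

```shell
# Hypothetical helper: print the bytes of its argument as space-separated hex.
hex_bytes() {
  printf '%s' "$1" | od -An -tx1 | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}

# A UTF-8 encoded capital O with stroke is the two bytes c3 98
# (written here as octal escapes so the script itself stays ASCII).
hex_bytes "$(printf '\303\230')"

# Literal question marks are the single byte 3f each.
hex_bytes '??'
```

The same helper can be pointed at the relevant part of the grep result, for example hex_bytes "$(grep -o 'Nordre ..yeren' ./occ.txt)", assuming the match is found in that file.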
We also noticed that Artsobservasjoner sometimes uses an HTML entity encoding for Ø, but seemingly not in the recordedBy string in this example.
"Nordre Øyeren fuglestasjon"
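To see how widespread the HTML encoding is in the archive, one could count the lines that contain an HTML character reference. This is only a sketch: count_entity_lines is a hypothetical helper, and the pattern is a rough approximation that matches named references such as &Oslash; as well as decimal and hexadecimal numeric ones.

```shell
# Hypothetical helper: count input lines containing an HTML character reference.
count_entity_lines() {
  grep -c -E '&(#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);'
}

# Small self-contained demonstration with three sample lines,
# two of which contain a character reference.
printf '%s\n' 'Nordre &Oslash;yeren' 'Nordre &#216;yeren' 'Nordre Oyeren' | count_entity_lines

# On the real file (path assumed from the grep command earlier in this thread):
#   count_entity_lines < ./occurrence.txt
```

A plain "&" followed later by ";" in free text could produce false positives, so any hits would still need a manual look.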
recordedBy = 'Nordre ??yeren fuglestasjon' https://www.gbif.org/occurrence/2405842938
This seems to be fixed, so I guess they did something to correct it? Anyway I think we can close this.
Many, but not all, data records for Nordre Øyeren Fuglestasjon (NØF) are displayed incorrectly in the GBIF portal, with the capital "Ø" shown as "??".
recordedBy = 'Nordre ??yeren fuglestasjon' https://www.gbif.org/occurrence/2405842938
Most records display the "Ø" correctly, including this one: https://www.gbif.org/occurrence/2332832871