ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum

geography cleanup request: Japan #3435

Closed dustymc closed 2 years ago

dustymc commented 3 years ago

Our data are not consistent.

https://arctos.database.museum/place.cfm?action=detail&geog_auth_rec_id=10010183 has no island data, but seems to be entirely on Honshu.

Many - perhaps most! - references seem inappropriate, eg https://arctos.database.museum/place.cfm?action=detail&geog_auth_rec_id=10010311 is a political division with a geographic reference.

Kuro-Shima is hyphenated, most other things are not.

There are a bunch of districts - https://arctos.database.museum/place.cfm?action=detail&geog_auth_rec_id=10010182 - which I don't think are geography at all.

Etc. etc. etc.

Help!

dustymc commented 3 years ago

See also https://github.com/ArctosDB/arctos/issues/3422#issuecomment-779978191

sharpphyl commented 3 years ago

Island countries such as Japan, the Philippines, and Indonesia are exceedingly difficult because they often have many more levels. In this case: Region (Chugoku, containing multiple prefectures), Prefecture, District (eliminated in 1921 but still listed as current in Wikipedia), Archipelago, Island Group (sometimes more than one), and Island.

Can we start by deciding what is the Japanese equivalent to State/Province and County?

Japan is divided into eight (non-administrative) regions. Only one is in our higher geography - Chugoku. Do we want to start with regions or with Prefectures? If we start with regions, then all the Prefectures would move under County and we would not list any Subprefectures (currently we have six listed). If we start with Prefectures, then the Subprefectures go in the County column, which is probably more valuable information.
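To make the two options concrete, here is a minimal sketch. The function name, the `start_with` option, and the `state_prov`/`county` keys are hypothetical labels for illustration, not Arctos column names or code.

```python
from typing import Optional


def to_higher_geog(region: str, prefecture: str,
                   subprefecture: Optional[str], start_with: str) -> dict:
    """Map a Japanese place onto State/Province and County (hypothetical keys)."""
    if start_with == "region":
        # Regions fill State/Province and prefectures drop to County,
        # which leaves no slot for subprefectures.
        return {"state_prov": region, "county": prefecture}
    if start_with == "prefecture":
        # Prefectures fill State/Province; a subprefecture, if any, fills County.
        return {"state_prov": prefecture, "county": subprefecture}
    raise ValueError(f"unknown option: {start_with!r}")


# Hiroshima is one of the prefectures in the Chugoku region.
by_region = to_higher_geog("Chugoku", "Hiroshima", None, "region")
by_prefecture = to_higher_geog("Chugoku", "Hiroshima", None, "prefecture")
```

Under the first option the subprefecture has nowhere to go; under the second, the region is the level that is dropped.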

Yes, Kuro-Shima should not be hyphenated. Can we just edit it? https://en.wikipedia.org/wiki/Kuroshima_(Kagoshima)
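If we want to find other hyphenated names before editing by hand, something like the sketch below could produce a review list. It assumes the names have already been exported to a plain list; the helper name and sample list are made up for illustration and this is not an Arctos query.

```python
def hyphen_variants(names):
    """Return (original, dehyphenated) pairs for names that contain a hyphen."""
    pairs = []
    for name in names:
        if "-" in name:
            head, *rest = name.split("-")
            # Join the pieces and lowercase everything after the first one,
            # so "Kuro-Shima" becomes "Kuroshima".
            pairs.append((name, head + "".join(part.lower() for part in rest)))
    return pairs


# Sample names taken from this thread.
candidates = hyphen_variants(["Kuro-Shima", "Honshu", "Awaji Island"])
```

The output is meant as a list for a person to check, not an automatic rename, since some names (for example San'in-San'yō) are legitimately hyphenated.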

> Many - perhaps most! - references seem inappropriate, eg https://arctos.database.museum/place.cfm?action=detail&geog_auth_rec_id=10010311 is a political division with a geographic reference.

This is Awaji Island which looks accurate to me. https://en.wikipedia.org/wiki/Awaji_Island

> There are a bunch of districts - https://arctos.database.museum/place.cfm?action=detail&geog_auth_rec_id=10010182 - which I don't think are geography at all.

Your reference is Nakagami District, Okinawa which we use. You're correct that districts were eliminated as administrative units in 1921 but are still used and came up as currently valid in Wikipedia. Is there a better source? https://en.wikipedia.org/wiki/Nakagami_District,_Okinawa. There are no subprefectures on Okinawa.

Chugoku is a region, but it's in the same category with prefectures.
The Chūgoku region, also known as the San'in-San'yō, is the westernmost region of Honshū, the largest island of Japan. It consists of the prefectures of Hiroshima, Okayama, Shimane, Tottori, and Yamaguchi. In 2010, it had a population of 7,563,428.

It looks like there are over 7,000 records from Japan in Arctos and over 1,000 of those are in our marine collection. I'm willing to work with someone who has Japan records to clean this up.

tucotuco commented 3 years ago

Terse response, but for non-terse reasons I can't back up just now. Follow GADM.

dustymc commented 3 years ago

> Follow GADM.

One radical way of accomplishing this: consider everything in geog_auth_rec "curatorial assertions", which will always be inconsistent and therefore simply cannot be useful for things like searching. Don't worry about arbitrarily classified subdivisions or inconsistent transliterations (ain't like one more layer of arbitrary makes much difference from there...); instead, do more with https://github.com/ArctosDB/arctos/issues/3272. I've got "standardized" data for most everything in Arctos at this point. I could - well, whatever, share it with GBIF, make it more accessible in Arctos, WHATEVER.

sharpphyl commented 3 years ago

GADM and marineregions.org EEZs?

So would this be what I would use for Okinawa higher geography for marine specimens?

https://marineregions.org/gazetteer.php?p=details&id=9041

dustymc commented 2 years ago

Tabling - see also https://github.com/ArctosDB/arctos/issues/3272 , https://github.com/ArctosDB/arctos/issues/2374