whosonfirst-data / whosonfirst-data

Who's On First is a gazetteer of places.
http://www.whosonfirst.org/

Gazetteer from Japanese Government #90

Open nyampire opened 9 years ago

nyampire commented 9 years ago

I found a gazetteer with coordinates that was originally published by the Japanese government.

I have already inquired with the Japanese government and received clarification on the licensing situation (very similar to Public Domain, CC0). It contains a "Kanji writing name", "Hiragana writing name", and "Roman writing name" for each place, so it may be useful.

https://github.com/nyampire/Gazetteer_JP_2007

thisisaaronland commented 9 years ago

Yay! Thanks. I am just doing a bit of research to make sure that we can import all this data with the correct language script subtags (to distinguish Kanji and Hiragana) and then will focus on the data itself.

Whoosh!
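
A rough sketch of how the script subtags mentioned above could be derived, assuming the standard ISO 15924 codes (Hani, Hira, Kana, Latn, Jpan) are used as BCP 47 script subtags after `ja`. The function names are hypothetical, and this is not necessarily how WOF tags its names:

```python
import unicodedata

# First word of the Unicode character name -> ISO 15924 script code.
SCRIPT_BY_NAME_PREFIX = {
    "CJK": "Hani",       # Han ideographs (Kanji)
    "HIRAGANA": "Hira",
    "KATAKANA": "Kana",
    "LATIN": "Latn",
}

def scripts_in(text):
    """Return the set of script codes found among the letters of `text`."""
    found = set()
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, spaces, hyphens, punctuation
        prefix = unicodedata.name(ch, "").split(" ")[0]
        script = SCRIPT_BY_NAME_PREFIX.get(prefix)
        if script:
            found.add(script)
    return found

def language_tag(text):
    """Guess a BCP 47 tag for a Japanese place name (hypothetical helper)."""
    scripts = scripts_in(text)
    if len(scripts) == 1:
        return "ja-" + next(iter(scripts))
    if scripts and scripts <= {"Hani", "Hira", "Kana"}:
        return "ja-Jpan"  # Jpan covers Han + Hiragana + Katakana
    return "ja"           # unknown or unexpected mix: no script subtag

print(language_tag("東京都"))        # ja-Hani
print(language_tag("とうきょうと"))  # ja-Hira
print(language_tag("Tokyo-to"))      # ja-Latn
```

Names that mix Kanji and kana fall back to `ja-Jpan` here; whether that is the right choice for an import is an open question.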

stepps00 commented 6 years ago

WOF is lacking in localized name coverage in Japan, specifically at the county placetype. The names in this gazetteer can be added fairly easily and would greatly help name translation coverage.

Here are updates made from 2007-2018: http://www.gsi.go.jp/common/000201879.pdf

stepps00 commented 2 years ago

Original source: https://www.gsi.go.jp/ENGLISH/pape_e300284.html

Terms of Use from the original source: https://www.gsi.go.jp/ENGLISH/page_e30286.html

@nyampire - I'm trying to find administrative boundaries to match to these gazetteer names on the GSI homepage, but I'm only seeing references to the Global Map project. Do you know if administrative boundaries (shapefile, GeoJSON, etc.) are also available for download? Ideally for prefectural divisions, subprefectural divisions, and municipal divisions. It would be great to update both names and geometries in WOF.

nvkelso commented 2 years ago

Statistics of Japan seems to have GIS files for download as open data:

Example: [image]

nvkelso commented 2 years ago

Good reminder of Japanese administrative geography:

[image]

nvkelso commented 5 months ago

@justinelliotmeyers Thoughts?

justinelliotmeyers commented 5 months ago

@nvkelso I've used the stats site several times for building out locations. Typically, cities can be missing from Asian datasets because either the parent admin 1 or the children are built, but not the main urban area for the city itself. Always something to watch for in Japan and South Korea. It's an easy check for large cities. For mid-sized and less populated ones, you have to do a few checks to make sure you didn't skip any.

Japan is a tough build for me simply because nothing is ready to plug and play out of the box in polygon format. Points, yes, probably easier. The stat data needs to be dissolved up and then split out. There are lots of unique things to pay attention to.
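
To illustrate the "dissolve up" step, here is a minimal sketch. It assumes each small-area record carries a code whose first two digits are the prefecture code and first five digits the municipality code, as in the standard Japanese local government codes. The `code` and `polygon` field names and the sample records are hypothetical. Note that this only groups member polygons into a GeoJSON-style MultiPolygon; a true dissolve would also merge shared edges with a geometry library.

```python
from collections import defaultdict

def group_by_prefix(features, prefix_len):
    """Group small-area features by the first `prefix_len` digits of their code."""
    groups = defaultdict(list)
    for feat in features:
        groups[feat["code"][:prefix_len]].append(feat)
    return dict(groups)

def to_multipolygon(features):
    """Collect member polygons (lists of rings) into one MultiPolygon, unmerged."""
    return {"type": "MultiPolygon",
            "coordinates": [f["polygon"] for f in features]}

def dissolve_up(features, prefix_len):
    """Build one grouped geometry per code prefix (2 = prefecture, 5 = municipality)."""
    grouped = group_by_prefix(features, prefix_len)
    return {code: to_multipolygon(members) for code, members in grouped.items()}

def square(x, y):
    """Unit-square polygon with a single closed outer ring (illustrative geometry)."""
    return [[(x, y), (x + 1, y), (x + 1, y + 1), (x, y + 1), (x, y)]]

# Illustrative records only; the geometries are not real boundaries.
small_areas = [
    {"code": "13101001001", "polygon": square(0, 0)},
    {"code": "13101001002", "polygon": square(1, 0)},
    {"code": "13102001001", "polygon": square(2, 0)},
    {"code": "14101001001", "polygon": square(5, 5)},
]

municipal = dissolve_up(small_areas, 5)
prefectural = dissolve_up(small_areas, 2)
print(sorted(municipal))    # ['13101', '13102', '14101']
print(sorted(prefectural))  # ['13', '14']
```

The "split out" part (for example, separating a city proper from its wards) depends on rules specific to each placetype and is not covered here.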

These are just my thoughts about the build. I have spoken to @nyampire about Japan data in the past. Very knowledgeable about data, license, location types, and best way to handle things. Also very smart when it comes to address data. Currently, I don't think @openaddresses or @overturemaps are using the right level of address data.

Rant over. Japan would take me a few weeks to create. I can do it, but it would be a build. I would say, easily a 7/10 in difficulty. Just my thoughts.

nyampire commented 5 months ago

Thank you for the mention, and I'm humbled by your kind words. There has been significant progress in Japanese address datasets over the past few years.

Here are my thoughts and opinions on this matter:

  1. eStat Boundary Polygon Data

I take a cautious stance regarding the use of eStat polygon data. eStat's area polygons are primarily intended for statistical purposes. While they maintain accuracy roughly equivalent to the "chome (neighbourhood)" level in urban areas, rural areas often undergo area consolidation and omissions, making them inaccurate as true address polygons. However, due to the lack of other comprehensive open datasets covering all of Japan, startups often use them as substitutes for administrative boundary data. Therefore, my position is that "they can be registered, but must be used with a clear understanding of their characteristics."

  2. Japanese Address Dataset

In recent years, the Japanese government has been creating and publishing address data under the Base Registry initiative. The data is published at https://www.digital.go.jp/policies/base_registry_address and is available as point data covering all of Japan. While this dataset is excellent, it faces several challenges:

2-1. Licensing

The latest version of the Address Base Registry incorporates rural area data. The license for this rural data includes a public order clause (prohibiting data use for criminal purposes), making it non-compliant with open data standards.

This applies to the 地番マスター (Cadastral master) and 地番マスター位置参照拡張 (Cadastral master with ISJ extension) tables; if these are removed, the license becomes an Open Government License.
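
As a small sketch of that filtering step, assuming the tables are identified by name: the two restricted names come from the comment above, while the function name and the other table names are placeholders, not real Address Base Registry tables.

```python
# Tables whose license carries the public order clause (named above).
RESTRICTED_TABLES = {
    "地番マスター",              # Cadastral master
    "地番マスター位置参照拡張",  # Cadastral master with ISJ extension
}

def openly_licensed(table_names):
    """Keep only tables that are not under the restricted license."""
    return [name for name in table_names if name not in RESTRICTED_TABLES]

# Placeholder table names for illustration only.
available = ["example_table_a", "地番マスター", "example_table_b",
             "地番マスター位置参照拡張"]
print(openly_licensed(available))  # ['example_table_a', 'example_table_b']
```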

2-2. Data Accuracy

The remaining urban area data is based on frontage information published by the Geospatial Information Authority of Japan (GSI). While it's relatively high-quality data, there are known instances of incorrect latitude and longitude in the frontage information.

Additionally, since frontage information serves as supplementary data for determining actual house numbers, the house numbers might differ slightly from those actually assigned to buildings.
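
One cheap way to catch the grossest coordinate errors is a bounding-box check. The sketch below assumes an approximate box around Japan (the bounds are my own rough values, not from the dataset documentation) and a hypothetical `check_point` helper. It only flags points far outside Japan or with latitude and longitude swapped; a point placed in the wrong town would still pass.

```python
# Approximate bounding box for Japan, including remote islands.
# These bounds are an assumption for illustration only.
MIN_LAT, MAX_LAT = 20.0, 46.0
MIN_LON, MAX_LON = 122.0, 154.0

def in_japan_bbox(lat, lon):
    """True if the point falls inside the approximate box."""
    return MIN_LAT <= lat <= MAX_LAT and MIN_LON <= lon <= MAX_LON

def check_point(lat, lon):
    """Classify a coordinate pair as 'ok', 'swapped', or 'out_of_bounds'."""
    if in_japan_bbox(lat, lon):
        return "ok"
    if in_japan_bbox(lon, lat):
        return "swapped"  # valid only when lat and lon are exchanged
    return "out_of_bounds"

print(check_point(35.68, 139.76))   # ok
print(check_point(139.76, 35.68))   # swapped
print(check_point(0.0, 0.0))        # out_of_bounds
```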

nvkelso commented 5 months ago

Thanks for the detailed response @nyampire!

With FOSS4G in Japan this year it'd be awesome to upgrade WOF there, but it sounds like it's not straightforward and would need a custom build.