jlacko / RCzechia

A package providing Czech shapefiles - LAU & NUTS regions, municipalities, rivers etc. - in R friendly format for analysis & visualization
https://rczechia.jla-data.net

Add download links to readme #77

Closed do-me closed 11 months ago

do-me commented 1 year ago

Thanks a lot for your work! Projects like this are a lifesaver for many :)

It would be great if you could add the hyperlinks for the datasets directly in the readme so folks like me who are looking for the data but don't have R installed can access them too (e.g. https://rczechia.jla-data.net/zip-high-2021-02.rds etc.).

As a side note: by June 9 2024 all EU states will be forced to open many public data sets, like administrative geometries and more!

jlacko commented 1 year ago

Thanks for your kind words! I will look into this.

I am kind of puzzled - in a good way, so don't take it as offensive, just curious: are you, without access to R, able to open RDS files? I was under the impression that these are R-specific (but you live & learn, always...). Which is why I hid them, and even the domain is vague on purpose. This can be remedied.

Regarding your side note: many admin areas have already been published in open formats. This is not the case with the ZIP codes you linked, though: they are a product not of the state administration but of the postal service (and the Czech post office is kinda sorta private in nature and thus not subject to the directive in question). As is often the case, the open data are buried deep and poorly documented, but they can be had in a non-R-specific format when you know where to look. This may fit your needs better than RDS files, but then again I know only a little about your needs. In any case, as I have invested a lot of time in searching, with rather varying success, I will be happy to share what I got.

do-me commented 1 year ago

No, I actually wasn't able to open RDS files any other way, so I ended up installing R (+ RStudio) and just exporting the data from there as gpkg like this (in case anyone has the same needs):

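install.packages("RCzechia")
library(RCzechia)
library(sf)  # st_write() is an sf function, so attach sf explicitly
zip_data <- zip_codes()  # ZIP code polygons as an sf data frame
geopackage_file <- "zip_data.gpkg"
# write the polygons to a GeoPackage in the working directory
st_write(zip_data, geopackage_file, layer = "zip_codes", driver = "GPKG")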

It's still fairly simple, but as your repo is the sort of "go to" for Czech data it'd be nice to have these files (or maybe geoparquet or gzipped geojson files) directly available as a download, if your s3 bucket permits.
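
For the gzipped geojson variant, a minimal sketch could look like the following. It assumes the sf package is installed, and it uses a tiny made-up stand-in object (two points with a hypothetical `PSC` column) instead of the real `zip_codes()` output so that it runs without downloading anything; the file names are illustrative only.

```r
library(sf)

# Hypothetical stand-in for zip_codes(): two points with a made-up PSC column.
# With the real data you would use zip_data <- RCzechia::zip_codes() instead.
zip_data <- st_sf(
  PSC = c("11000", "60200"),
  geometry = st_sfc(
    st_point(c(14.42, 50.09)),
    st_point(c(16.61, 49.19)),
    crs = 4326
  )
)

# Write plain GeoJSON first (remove any leftover file from a previous run).
geojson_file <- file.path(tempdir(), "zip_data.geojson")
unlink(geojson_file)
st_write(zip_data, geojson_file, driver = "GeoJSON", quiet = TRUE)

# Compress it with base R: copy the raw bytes through a gzip connection.
gz_file <- paste0(geojson_file, ".gz")
con <- gzfile(gz_file, "wb")
writeBin(readBin(geojson_file, "raw", n = file.size(geojson_file)), con)
close(con)
```

The resulting `.geojson.gz` file can be unpacked with any gzip tool and opened in non-R software such as QGIS.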

It's the same story with the German post: it used to be public, is now private, and does not quite openly release the zip codes everyone is relying on... but at least all the public data will be released on that date!

jlacko commented 1 year ago

The ZIP codes dataset is special, and I will provide a GPKG (which is a nicely platform-neutral format) download link in a future version of the package. The ZIP codes dataset was tricky to obtain (nothing illegal, and it is used with permission, but it took me a while to chase down), so it will be a good idea to share it.

I would rather not do so with the admin areas, as the "official" files are published by the Cadastral Office at https://vdp.cuzk.cz/vdp/ruian - unfortunately, as far as I know, the site does not have an English-language version (it has been recently renovated, and I am not quite up to date with the new version). But you can download the latest monthly snapshot at this link: https://vdp.cuzk.cz/vymenny_format/soucasna/20230831_ST_UKSG.xml.zip - these files may in theory be more current than my datasets, as they are updated monthly and mine are not. But since I use only the municipality and higher admin area levels, the data at those levels are pretty static; the main action happens at lower levels of detail.

There is also an "unofficially official" plugin for QGIS that connects to the latest RUIAN; it can be found (again, sadly only in Czech) at https://github.com/ctu-geoforall-lab/qgis-ruian-plugin - it offers a much finer level of detail than my package does (down to individual buildings). You may find this of interest as well, if you can overcome the language issue.

do-me commented 1 year ago

Thanks a lot for the background info! Then let me rephrase: maybe you could add the origin of the data to the readme and, only where feasible, the data themselves. That way, if anyone needs an updated dataset, they can go to the source directly and you don't need to worry about providing outdated info.

Like I mentioned earlier, let's hope that by next year life will become easier and all public data will be accessible in INSPIRE :)