tdwg / wgsrpd

World Geographical Scheme for Recording Plant Distributions (WGSRPD)
http://www.tdwg.org/standards/109

GeoJSON files and the status of this standard #12

Open rdmpage opened 1 year ago

rdmpage commented 1 year ago

Hi @peterdesmet @PacoPando @johnroxton @HugoGresse @Circeus @MarcRieraDominguez @JeromeMathieuEcology @mdoering @AntoineAA, apologies for the mass pinging, but I noticed there are a lot of issues with this standard, and it seems to be orphaned, so nothing is happening.

Any blame for the GeoJSON problems should be directed to me. I found Kew's GIS files online (they now seem to have disappeared) and did a quick and dirty conversion to GeoJSON using ogr2ogr (see the original pull request https://github.com/tdwg/prior-standards/pull/22). This made them manageable in size, but obviously isn't ideal. I then stuck everything in my fork of tdwg/prior-standards. I wanted to preserve the Kew GIS files (so many online resources keep vanishing), and have a simple visualisation of the regions using GitHub's then-new support for GeoJSON.

I guess it's unclear what the standard actually is - is it the book(s), the GIS files, the GeoJSON? It's also clear that the existing GeoJSON doesn't work for everyone.

Perhaps someone at Kew would step up and make this standard usable?

If Kew don't want to take on this project, there is a MySQL version of the data at https://github.com/RBGKew/powop/tree/production/powo-geodb/data (part of a repository with an AGPL-3.0 license), so that might be another place to start. I don't know what level of detail the spatial data there has, but it's being used for POWO.

Perhaps what we could do is generate fresh GeoJSON files, broken into separate files for each geographic region within Level 1, Level 2, etc. Each file gets some scrutiny (say on GitHub), people can tweak as necessary, then we bring them all together as "official" TDWG files for these areas. Perhaps put them in Zenodo with a DOI so they are persistent?
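A minimal sketch of the "one file per region" step, assuming the combined file is a standard GeoJSON FeatureCollection in which every feature carries a region code and a region name in its properties. The property keys are passed in as parameters because the real key names in the WGSRPD files are not stated here; `split_by_region`, `write_region_files` and the `CODE-Name.json` naming are hypothetical choices for illustration.

```python
import json
import re
from pathlib import Path


def split_by_region(collection, code_key, name_key):
    """Group the features of one FeatureCollection by region.

    Returns a dict mapping a file name to a FeatureCollection that holds
    only the features of that region. code_key and name_key are the
    property names holding the region code and name (assumed, not taken
    from the actual WGSRPD files).
    """
    files = {}
    for feature in collection.get("features", []):
        props = feature.get("properties") or {}
        code = str(props[code_key])
        name = str(props[name_key])
        # Replace anything that is awkward in a file name with "_".
        safe_name = re.sub(r"[^\w-]+", "_", name).strip("_")
        filename = f"{code}-{safe_name}.json"
        # A region may be stored as several features; keep them together.
        region = files.setdefault(
            filename, {"type": "FeatureCollection", "features": []}
        )
        region["features"].append(feature)
    return files


def write_region_files(source_path, out_dir, code_key, name_key):
    """Read one combined GeoJSON file and write one file per region."""
    collection = json.loads(Path(source_path).read_text(encoding="utf-8"))
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for filename, region in split_by_region(collection, code_key, name_key).items():
        text = json.dumps(region, ensure_ascii=False)
        (out / filename).write_text(text, encoding="utf-8")
```

Small per-region files like these are easier to review and diff individually, and they can be concatenated back into one FeatureCollection once reviewed.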

Of course, this leaves out who "we" are. I am not a GIS person (you may have guessed that by now), nor do I use R (one of the tools that struggled with the current GeoJSON files), so ideally those who actually need these files would be willing to contribute?

rdmpage commented 1 year ago

Examples of the GeoJSON files in the wild:

johnroxton commented 1 year ago

Dear Roderic et al.,

thanks for picking this up. I don't know exactly why I am on this list, I believe it's because of a comment I left, but I would be willing to help. However, I am not sure what needs to be achieved. I am using R and Python and also found the level of detail did not help with dynamic display, but reducing resolution seemed to be no problem. Anyway, shouldn't these files be managed by TDWG?
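For anyone who wants to try the resolution reduction without a GIS stack, here is a rough sketch using the Ramer-Douglas-Peucker algorithm in plain Python. It is not the method used for any of the files discussed here; the function names are made up for illustration, and the tolerance is in coordinate units (degrees for GeoJSON), which is an assumption about what is acceptable for display purposes.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b, in coordinate units."""
    px, py = p[0], p[1]
    ax, ay = a[0], a[1]
    bx, by = b[0], b[1]
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment (e.g. a closed ring, where first == last).
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify_line(points, tolerance):
    """Ramer-Douglas-Peucker, iterative to avoid deep recursion."""
    if len(points) < 3:
        return list(points)
    keep = [False] * len(points)
    keep[0] = keep[-1] = True
    stack = [(0, len(points) - 1)]
    while stack:
        start, end = stack.pop()
        max_dist, index = 0.0, None
        for i in range(start + 1, end):
            d = _point_segment_distance(points[i], points[start], points[end])
            if d > max_dist:
                max_dist, index = d, i
        if index is not None and max_dist > tolerance:
            keep[index] = True
            stack.append((start, index))
            stack.append((index, end))
    return [p for p, k in zip(points, keep) if k]


def simplify_ring(ring, tolerance):
    """Simplify one polygon ring, leaving it untouched if it would collapse."""
    simplified = simplify_line(ring, tolerance)
    # RFC 7946 requires a linear ring to have at least four positions.
    return simplified if len(simplified) >= 4 else list(ring)


def simplify_geometry(geometry, tolerance):
    """Return a simplified copy of a Polygon or MultiPolygon geometry."""
    if geometry["type"] == "Polygon":
        coords = [simplify_ring(r, tolerance) for r in geometry["coordinates"]]
    elif geometry["type"] == "MultiPolygon":
        coords = [
            [simplify_ring(r, tolerance) for r in polygon]
            for polygon in geometry["coordinates"]
        ]
    else:
        return geometry  # other geometry types are passed through unchanged
    return {"type": geometry["type"], "coordinates": coords}
```

One caveat: simplifying each region independently can open small gaps or overlaps along shared borders, so a topology-aware tool is the safer choice if neighbouring regions must still fit together exactly.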

Best

David (aka John Roxton)

stanblum commented 1 year ago

Hi Rod, et al., Yes, this standard was left semi-abandoned for quite a while. But it has been picked up again in a joint effort with a group interested in creating a comparable standard for marine regions. The new interest group covers both efforts and is called GeoSchemes.

rdmpage commented 1 year ago

@stanblum Good to know. Might be useful to put a big banner at the top of this repository directing people to GeoSchemes.

peterdesmet commented 1 year ago

Indeed (and good to know)! @stanblum can you provide some text, I can easily add it to the README as a banner.

AntoineAA commented 1 year ago

Hi,

Thanks for the news.

I pushed a pull request a few months ago. It doesn't improve the geo data but it fixes the GeoJSON data so that they are valid (because we actually use them in Pl@ntNet). https://github.com/tdwg/wgsrpd/pull/10

A geomatician on our team is currently finalizing a much more accurate and cleaner version of the polygons. I sent him the link to this discussion so that he could explain it more precisely than I could.

Kind regards,

Antoine

rdmpage commented 1 year ago

@AntoineAA Thanks for responding. Improved GeoJSON would be great, and hopefully members of GeoSchemes will be following this discussion.

mrjohnc commented 1 year ago

Hi @rdmpage, I would really like to add the geoshapes to the corresponding Wikidata items so I can mirror POWO maps on Wikipedia articles for all plants, like I started with https://en.wikipedia.org/wiki/Asparagus_horridus . However, this would require GeoJSON files for each geographic region, which is way beyond my technical ability. Is there any chance this will happen?

Thanks very much

rdmpage commented 1 year ago

@mrjohnc We have GeoJSON files for levels 1 to 4, see https://github.com/tdwg/wgsrpd/tree/master/geojson but these are files for all regions combined.

Are you looking for files for each separate region? As far as I know POWO uses Level 3 region codes, so I could generate one GeoJSON file per Level 3 region, if that would be useful. You would need to upload those to Wikimedia Commons, and associate them with Wikidata items, but it would enable you to have maps. If TDWG gets its act together and refines the GeoJSON then I guess it would just be a case of updating the Commons files.

Anyway, I guess I'm offering to generate GeoJSON files, one per Level 3 region, if that's what you need to add maps to plants.

mrjohnc commented 1 year ago

@rdmpage thanks very much for your reply and thank you so much for offering to generate them.

I don't know what level POWO uses. I've certainly seen level 4 regions as well, e.g. individual states in the US: https://powo.science.kew.org/taxon/urn:lsid:ipni.org:names:275898-2 It's possible it might use the level 2 regions as well, I guess, but level 1 does seem very unlikely.

I've already mapped all the regions to Wikidata in OpenRefine and can create any missing regions and yes then upload the regions to Commons and then link them to the Wikidata items.

One other issue might be the copyright stuff. Do you know if there is a license attached to these areas? Hopefully something compatible with Commons like CC-BY or even CC0.

Honestly if you could generate level 2, 3 and 4 and put them somewhere with a very clear license that would certainly allow me to generate all the maps. I'm not sure how much work this would be in practice so I don't know if what I'm asking is a reasonable ask....

Thanks again

johnroxton commented 1 year ago

Hi all,

great that you do the mapping for the species in Wikipedia! As far as I see, the license for all TDWG standards, including the regions, is CC BY 4.0 Deed (see bottom line at https://www.tdwg.org/standards/)

I just wanted to mention that you need to take care with the following typo in the TDWG files to get the maps for Argentina right:

The level 4 code for Córdoba Province, Argentina is given as "AGE-CO", but should be "AGE-CD"
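If you are patching a local copy before building maps, something along these lines could apply the correction. This is only a sketch: `fix_region_code` is a made-up helper, the property keys are parameters because the actual key names in the level 4 file are not given here, and the name check is there as a guard in case the wrong code is also legitimately used by a different region.

```python
def fix_region_code(collection, code_key, name_key, name, wrong, right):
    """Replace a mistyped region code in a GeoJSON FeatureCollection.

    Only features whose name property equals `name` AND whose code equals
    `wrong` are changed, so other regions are left alone. The collection is
    modified in place; the number of patched features is returned.
    """
    changed = 0
    for feature in collection.get("features", []):
        props = feature.get("properties") or {}
        if props.get(name_key) == name and props.get(code_key) == wrong:
            props[code_key] = right
            changed += 1
    return changed
```

A call would look like `fix_region_code(data, code_key, name_key, "Córdoba", "AGE-CO", "AGE-CD")`, with the two keys and the exact spelling of the name taken from whatever the file actually uses. A return value of 0 is a hint that one of those assumptions does not match the data.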

Good luck!

mrjohnc commented 1 year ago

Amazing, thanks so much :)

rdmpage commented 1 year ago

@mrjohnc @johnroxton OK, I've created a set of GeoJSON files for the regions, levels 1-4, with each region in a separate file. The repo is rdmpage/wgsrpd-geojson, and you can see an example file for Mali here: https://github.com/rdmpage/wgsrpd-geojson/blob/main/level3/MLI-Mali.json

I'm hoping this will work with Wikicommons and they can be added to Wikidata using P3896 (I just put this link here to remind me how to do it).

The source for the data is a MySQL dump in a Kew Gardens repository which is under an AGPL 3.0 license, which AFAIK is compatible with Wikipedia, see https://commons.wikimedia.org/wiki/Category:AGPL

Hope these will be useful.

mrjohnc commented 1 year ago

Wonderful, thanks so much :) One request: can you put an explicit license statement somewhere on your GitHub? I don't want to get into an argument on Commons about whether the licence might be different because they're published by you, not Kew, or something.

rdmpage commented 1 year ago

The repository is AGPL 3, this should be visible when you visit the repo and there is also a LICENSE file to that effect.

mrjohnc commented 1 year ago

Perfect, thanks very much