JuliaGeo / GADM.jl

A Julia package for obtaining geographical data from the GADM dataset
https://gadm.org
MIT License

Add depth parameter #46

Closed ClaroHenrique closed 2 years ago

ClaroHenrique commented 2 years ago

This PR aims to add the depth parameter to the GADM.get function. This feature is mentioned in issue 40.

juliohm commented 2 years ago

Thank you @ClaroHenrique for starting this PR ❤️ We need to replace the option children with the option level. Setting level=0 is equivalent to children=false, and level=1 is equivalent to children=true. Setting level=2,3,... is a new feature that returns all subtables merged into a single table.

Please let me know if something is not clear. I can provide a simple example with states and cities in Brazil.

ClaroHenrique commented 2 years ago

Hi @juliohm, I think I get it now. My main question is how we can join the subtables into a single one. In the original database, the data is separated by level into tables with different columns.

Schema example ![image](https://user-images.githubusercontent.com/38709777/172072608-3e536f13-38c8-417e-88c6-c411b33b7eee.png)

Is there a problem if the resulting single table holds all those columns?

juliohm commented 2 years ago

Hi @ClaroHenrique, can you provide an example of the result with Brazil data? What tables do we get when we set level=0,1,2?

We can then try to figure out a combination of columns that makes sense in general.

ClaroHenrique commented 2 years ago

@juliohm, these are the results from the database downloaded from https://data.biogeo.ucdavis.edu/data/gadm3.6/gpkg/gadm36_BRA_gpkg.zip.

Level 0 ![image](https://user-images.githubusercontent.com/38709777/172073199-79555db4-05ab-41b5-92dd-01625f28bfd9.png)
Level 1 ![image](https://user-images.githubusercontent.com/38709777/172073242-11879150-b4e1-45d6-a80a-2d91041c64e5.png)
Level 2 ![image](https://user-images.githubusercontent.com/38709777/172073285-27c52565-d5aa-481c-a1c1-3ec517fa3814.png)

juliohm commented 2 years ago

Nice, so the idea is that if a user types

get("BRA", "Alagoas", level=1)

we return all the cities of Alagoas in a single table. This consists of filtering the rows of the table of level 2 that have Alagoas as the state. More generally, we want

get(country, state, city, municipality, ..., level=n)

to return all the leaves of the node country->state->city->municipality that are n steps deeper in the tree.
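As a rough illustration of the Alagoas case, here is a minimal sketch of the row filtering on a toy table. The column names `NAME_1` and `NAME_2`, the helper `children_of`, and the sample rows are assumptions made for the example, not the package's actual schema or API.

```julia
# Toy stand-in for the level-2 table. The column names NAME_1 (state)
# and NAME_2 (city) are assumptions; the rows are illustrative only.
level2 = [
    (NAME_1 = "Alagoas", NAME_2 = "Maceió"),
    (NAME_1 = "Alagoas", NAME_2 = "Arapiraca"),
    (NAME_1 = "Bahia",   NAME_2 = "Salvador"),
]

# Hypothetical helper: keep only the rows whose parent state is `state`.
children_of(rows, state) = filter(row -> row.NAME_1 == state, rows)

alagoas = children_of(level2, "Alagoas")
```

With a real table, the same predicate would be applied to whatever column identifies the parent region at that level.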

This could be implemented with different algorithms. We could use a depth-first search starting from the municipality to find the children at depth n. We then push these tables to a list of tables to be merged. At the end, all these tables will be at the same level, so we should expect them to have the same columns. If they don't have the same columns, we should probably investigate the issue further.
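The depth-first search idea could look roughly like the sketch below. It works on a hypothetical in-memory tree of region names rather than on the GADM tables, and `descendants_at` is an illustrative name, not a function in GADM.jl. A real implementation would collect one table per node and merge them at the end instead of collecting names.

```julia
# Hypothetical tree: each region name maps to the names of its direct children.
tree = Dict(
    "BRA"     => ["Alagoas", "Bahia"],
    "Alagoas" => ["Maceió", "Arapiraca"],
    "Bahia"   => ["Salvador"],
)

# Depth-first search that collects the nodes exactly `n` steps below `node`.
# Nodes without an entry in `tree` are treated as having no children.
function descendants_at(tree, node, n, found = String[])
    if n == 0
        push!(found, node)
    else
        for child in get(tree, node, String[])
            descendants_at(tree, child, n - 1, found)
        end
    end
    return found
end

cities = descendants_at(tree, "BRA", 2)
```

Because every collected node sits at the same depth, the tables attached to those nodes would be expected to share the same columns, which is what makes the final merge straightforward.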

I can take a closer look at the dataset if it is still not clear what the end goal is.

ClaroHenrique commented 2 years ago

It's fine. I just misunderstood this message. For a moment I thought that we were going to process multiple levels in one get.