Open davemlz opened 2 years ago
Hi, @aazuspan!
I was super curious and I just created the repository (https://github.com/davemlz/ee-land-cover-datasets). The repository updates itself every day at midnight and here is the JSON file with the complete list of datasets, bands, descriptions and colors! :)
For creating a Dataset class from all of them, one of the solutions that I usually use is to inherit from a Box object. Look at these examples in spyndex: https://github.com/davemlz/spyndex/blob/main/spyndex/axioms.py. That way, we would just have to read the JSON file, loop through it creating a Dataset class for each dataset/band, and then create the datasets object using the Box properties!
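A minimal sketch of that loop, under stated assumptions: the JSON layout (collection, then band, then a list of classes) and the sample values are placeholders, not the real file's contents, and the `Dataset` constructor here is hypothetical rather than sankee's actual signature. A small dict subclass stands in for Box so the example runs with the standard library only.

```python
import json


class AttrDict(dict):
    """Stand-in for Box: a dict whose keys are also readable as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError as err:
            raise AttributeError(name) from err


class Dataset:
    """Hypothetical container for one discrete band of one collection."""

    def __init__(self, collection, band, classes):
        self.collection = collection
        self.band = band
        # Map each class value to its description and to its color.
        self.labels = {c["value"]: c["description"] for c in classes}
        self.palette = {c["value"]: c["color"] for c in classes}


# Illustrative JSON only; the real file's layout and contents may differ.
RAW = """
{
  "EXAMPLE_COLLECTION": {
    "example_band": [
      {"value": 1, "description": "Forest", "color": "00ff00"},
      {"value": 2, "description": "Water", "color": "0000ff"}
    ]
  }
}
"""


def build_datasets(raw):
    """Create one Dataset per collection/band pair, keyed for attribute access."""
    datasets = AttrDict()
    for collection, bands in json.loads(raw).items():
        for band, classes in bands.items():
            datasets[f"{collection}_{band}"] = Dataset(collection, band, classes)
    return datasets


datasets = build_datasets(RAW)
print(datasets.EXAMPLE_COLLECTION_example_band.labels)
```

With Box itself, `AttrDict` could presumably be replaced by a Box subclass, as in the spyndex module linked above.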
If you think this can fit in sankee, let me know!
Cheers,
Dave
This is incredible @davemlz! You're a STAC wizard! I'm embarrassed to admit I've just been manually entering everything, so having an automated solution to building datasets would be amazing :exploding_head:
For accessing your JSON, would that be done live (e.g. the JSON is downloaded on import or through a load_datasets function)? Or does sankee cache a local copy that can be updated? Just curious what that implementation would look like.
Anyways, you're totally welcome to work on this however you'd like! Obviously building ee-land-cover-datasets is already a huge contribution, but if you feel like making a PR to integrate that into sankee that would be great!
Thanks!
One other quick thought--we may need to come up with a good way to convert the class descriptions into shorter class names since some of them are pretty wordy, like "Mixed Broadleaf/Needleleaf Forests: co-dominated (40-60%) by broadleaf deciduous and evergreen needleleaf tree (>2m) types. Tree cover >60%." from the MODIS dataset or "Cultivated and managed vegetation / agriculture. Lands covered with temporary crops followed by harvest and a bare soil period (e.g., single and multiple cropping systems). Note that perennial woody crops will be classified as the appropriate forest or shrub land cover type." from the CGLS dataset.
Maybe we can get by with just having a list of common delimiters between the name and definition (: and . from the examples above) and splitting the descriptions with those, but ensuring they work well across all datasets might be tricky...
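A rough sketch of the delimiter idea, assuming the name always comes first and ends at the first colon or period that is followed by whitespace (or ends the string). The function name `short_name` is hypothetical, and the CGLS description is abbreviated from the example above.

```python
import re


def short_name(description: str) -> str:
    """Keep the text before the first ':' or '.' that closes a phrase."""
    # Split once at a colon or period followed by whitespace or end of string.
    return re.split(r"[:.](?:\s|$)", description, maxsplit=1)[0].strip()


modis = (
    "Mixed Broadleaf/Needleleaf Forests: co-dominated (40-60%) by broadleaf "
    "deciduous and evergreen needleleaf tree (>2m) types. Tree cover >60%."
)
cgls = (
    "Cultivated and managed vegetation / agriculture. Lands covered with "
    "temporary crops followed by harvest and a bare soil period."
)

print(short_name(modis))
print(short_name(cgls))
```

This handles the two descriptions quoted here, but a name that itself contains an abbreviation followed by a space (for example "approx. ") would be cut short, so per-dataset overrides may still be needed.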
Hi, @aazuspan!
Haha don't worry. When I started eemont I also did it manually at first! :)
For the JSON access: both are possible. I mean, we can get sankee to cache the JSON for automatic local use, and we can also add an online: bool = False arg to let people get the most up-to-date version from ee-land-cover-datasets. I do it with spectralIndices() :)
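One way the cached-plus-online behaviour could look, as a sketch only: `load_datasets`, `fetch_json`, `cache_path` and `url` are hypothetical names, the URL of the JSON file is left to the caller, and falling back to a download when no cache exists is an assumption rather than something decided in this thread.

```python
import json
import urllib.request
from pathlib import Path


def fetch_json(url: str) -> dict:
    """Download and parse a JSON document."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def load_datasets(cache_path: Path, url: str, online: bool = False, fetch=fetch_json) -> dict:
    """Return the dataset definitions, preferring the local cache.

    With online=True (or when no cache exists yet) the JSON is fetched
    and the local copy is refreshed.
    """
    if online or not cache_path.exists():
        data = fetch(url)
        cache_path.write_text(json.dumps(data))
        return data
    return json.loads(cache_path.read_text())
```

Calling `load_datasets(path, url)` would then read the local copy, while `load_datasets(path, url, online=True)` would pull the latest version and overwrite it.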
For the descriptions: Yes, you're right! I've also seen the long names and that idea of the delimiters is good! I think it can be done, it might take a while, but it is possible! :D
PS: I'm more than happy to work on it! :) I'll start working!
Cheers,
Dave
Awesome, that sounds like a good solution!
Let me know if you have any questions or if there's anything that needs to be added or changed in sankee to make this work :)
Hi, @aazuspan!
I was using sankee and it is amazing!
I was thinking that maybe it could be possible to add all datasets in the GEE Catalog that are Land Cover Classifications, or at least discrete products. Here is the idea!
If we take the Copernicus CORINE Land Cover Product as an example and get the STAC info, we can see that in the summaries property there is an eo:bands property. This property describes all the bands of the collection. If the band is a discrete band, it will have a gee:classes property explaining the color, the description and the value!
Check it now for MODIS:
So, the idea is to create a repository like this one that I use to keep all the scale and offset parameters used for the scaleAndOffset() method in ee_extra:eemont updated. But, in this case, it would be a repository where we would store the collection, the band and the values, colors and descriptions just as you need them for the Dataset class! Then, sankee can just grab the data from this repository to keep all datasets updated :)
Let me know what you think, and, if you want, I can work on creating that repository and linking it to sankee!
Cheers,
Dave
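To make the STAC structure described in this post concrete, here is a small sketch that pulls the gee:classes entries out of a collection's summaries. The `summaries`, `eo:bands` and `gee:classes` keys and the `color`, `description` and `value` fields come from the post; the fragment itself (collection id, band names, class values) is a made-up placeholder, and `discrete_bands` is a hypothetical helper name.

```python
def discrete_bands(stac: dict) -> dict:
    """Map each band name to its gee:classes list, skipping bands without one."""
    bands = stac.get("summaries", {}).get("eo:bands", [])
    return {b["name"]: b["gee:classes"] for b in bands if "gee:classes" in b}


# Illustrative STAC fragment; the id, band names and classes are placeholders.
stac = {
    "id": "EXAMPLE/COLLECTION",
    "summaries": {
        "eo:bands": [
            {
                "name": "landcover",
                "gee:classes": [
                    {"value": 1, "color": "00ff00", "description": "Forest"},
                    {"value": 2, "color": "0000ff", "description": "Water"},
                ],
            },
            {"name": "quality"},  # no gee:classes, so treated as non-discrete
        ]
    },
}

# One record per collection: the collection id plus its discrete bands.
record = {"collection": stac["id"], "bands": discrete_bands(stac)}
print(record)
```

A repository of such records, one per collection, would carry the collection, band, values, colors and descriptions that the Dataset class needs.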