California embeddings description

I’m a bit lost on #7:

What input images are used? The usual Sentinels, from what dates? Are these chips saved somewhere?
From the code I gather that the files are named worldcover..., but these are not the inputs: we don’t input land cover, we input Sentinel data. The code also uses rows and columns, but I don’t know how to geolocate those.
The only way I can see to geolocate the extent of a chip embedding is to cross-reference the random chip_id in the folder embeddings_v0.2 with the GeoParquet file california-worldcover-chips-osm-multilabels.parquet, which has chip_id, [col- row to check], and geometry, which should in most cases be a rectangle.
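To make the cross-reference concrete, here is a minimal sketch of what I have in mind. It assumes the GeoParquet file has columns literally named chip_id and geometry, and that each file in embeddings_v0.2 is named after its chip_id (file stem = chip_id). Both are my assumptions and need checking; the helper names are hypothetical, not from the repo.

```python
from pathlib import Path


def load_chip_bounds(parquet_path):
    """Return {chip_id: (minx, miny, maxx, maxy)} from the GeoParquet file.

    ASSUMPTION: columns are named `chip_id` and `geometry`.
    """
    import geopandas as gpd  # lazy import; requires geopandas and pyarrow

    gdf = gpd.read_parquet(parquet_path)
    bounds = gdf.geometry.bounds  # DataFrame with minx, miny, maxx, maxy
    return {
        str(chip_id): tuple(row)
        for chip_id, row in zip(gdf["chip_id"], bounds.itertuples(index=False))
    }


def match_embeddings_to_bounds(embedding_dir, bounds_by_id):
    """Pair each embedding file with a chip bbox, keyed by file stem.

    ASSUMPTION: the file name without its extension is the chip_id.
    Returns (matched, missing): a dict of chip_id -> bbox, and a list of
    file stems that have no entry in the GeoParquet index.
    """
    matched, missing = {}, []
    for path in sorted(Path(embedding_dir).iterdir()):
        if not path.is_file():
            continue
        chip_id = path.stem
        if chip_id in bounds_by_id:
            matched[chip_id] = bounds_by_id[chip_id]
        else:
            missing.append(chip_id)
    return matched, missing
```

Usage would be something like match_embeddings_to_bounds("embeddings_v0.2", load_chip_bounds("california-worldcover-chips-osm-multilabels.parquet")). A non-empty missing list would mean the stem-equals-chip_id assumption is wrong, or that some embeddings have no geometry at all.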
For patches within the three-dimensional patch_embeddings_v0.2/, we need to unroll the image into its n patches, but it is unclear how to do that while ensuring the right order, so that we can calculate the bbox of each patch.
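For the patch bboxes, this is the kind of calculation I would expect, as a sketch only. It assumes the patches are stored in row-major order starting at the top-left (north-west) corner of the chip, and that the chip bbox is axis-aligned in its CRS. I have not confirmed either against the model code, and the function name and grid size are hypothetical.

```python
def patch_bboxes(chip_bounds, n_rows, n_cols):
    """Split a chip bbox into n_rows * n_cols patch bboxes.

    ASSUMPTION: patch index k maps to (row, col) = divmod(k, n_cols), with
    row 0 at the northern edge (image top) and col 0 at the western edge.
    Returns a list of (minx, miny, maxx, maxy) tuples in that order.
    """
    minx, miny, maxx, maxy = chip_bounds
    dx = (maxx - minx) / n_cols  # patch width in CRS units
    dy = (maxy - miny) / n_rows  # patch height in CRS units
    boxes = []
    for row in range(n_rows):
        for col in range(n_cols):
            left = minx + col * dx
            top = maxy - row * dy
            boxes.append((left, top - dy, left + dx, top))
    return boxes


# A 2 x 2 grid over a 4 x 4 chip: the first patch is the north-west quadrant.
print(patch_bboxes((0.0, 0.0, 4.0, 4.0), 2, 2)[0])  # (0.0, 2.0, 2.0, 4.0)
```

If the embeddings are stored in a different order (for example column-major, or with a class token in front of the patch tokens), the index-to-(row, col) mapping above would need to change accordingly.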

Plotting california-worldcover-chips-osm-multilabels.parquet, I can see that these are indeed the bboxes.

I do see holes: places within California that have no coverage in california-worldcover-chips-osm-multilabels.parquet. Are these invalid inputs (due to clouds and other errors)?
I assume the embeddings are in all cases the average across all bands and band groups.

Thanks!!

cc @yellowcap

Clay-foundation / earth-text