Clay-foundation / model

The Clay Foundation Model (in development)
https://clay-foundation.github.io/model/
Apache License 2.0

Obtain patch level metadata (e.g. geospatial bounds and cloud cover), save and demo DEP use case (sim search) #172

Closed lillythomas closed 3 months ago

lillythomas commented 4 months ago

This PR will work towards a demonstration of how to obtain patch level embeddings and write them to GeoParquet files to run similarity search with.

Main tasks that need to be done:

Reference tickets: #168 #140
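For readers new to the idea, the similarity-search step can be sketched roughly as below. This is a minimal illustration, not the code in the notebook: it assumes the patch embeddings have already been read back from the GeoParquet files into a plain dict, and the patch ids, the toy 3-dimensional vectors, and the helper names `cosine_similarity` and `most_similar` are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero embedding as dissimilar to everything
    return dot / (norm_a * norm_b)


def most_similar(embeddings, query_idx, k=3):
    """Return the k (idx, score) pairs closest to the query patch."""
    query = embeddings[query_idx]
    scored = [
        (idx, cosine_similarity(query, vec))
        for idx, vec in embeddings.items()
        if idx != query_idx  # do not return the query patch itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings keyed by hypothetical patch ids.
embeddings = {
    "patch_a": [1.0, 0.0, 0.0],
    "patch_b": [0.9, 0.1, 0.0],
    "patch_c": [0.0, 1.0, 0.0],
}
results = most_similar(embeddings, "patch_a", k=2)
```

A brute-force scan like this is fine for a tutorial-sized set of patches; a larger collection would likely call for a vector index instead.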

lillythomas commented 4 months ago

The notebook docs/tutorial_digital_earth_pacific_patch_level.ipynb walks through an example of:

@weiji14 @yellowcap ready for when you have time to review.

weiji14 commented 4 months ago

Thanks @lillythomas! I haven't looked too closely yet, but would it be possible to show where the similarity search results are located? Maybe something like showing the bounding boxes of all the patches on a map, and also overlay where the original quarry points are.

lillythomas commented 3 months ago

> Thanks @lillythomas! I haven't looked too closely yet, but would it be possible to show where the similarity search results are located? Maybe something like showing the bounding boxes of all the patches on a map, and also overlay where the original quarry points are.

Yes! Great idea. Working on this tomorrow.

lillythomas commented 3 months ago

@weiji14 @yellowcap when you have a moment, this could use another review before merging.

lillythomas commented 3 months ago

Yes @weiji14 @yellowcap, I didn't intend to suggest that we would use the derived cloud percentages to filter the embedding results obtained from the cloud-free DEP composite. My point was only that the two leverage some similar things (i.e. the AOI), and since the initial ask for patch-level metadata in https://github.com/Clay-foundation/model/issues/168 was unified, I sought to consolidate them.

I agree that if we want this to be a sensible consolidation, we could demonstrate it without the cloud-free composite, or I can just re-separate the PRs, whichever is swiftest and most useful. I'll address some of the other comments from Wei Ji first, then comment on which of the two directions we go with.

lillythomas commented 3 months ago

Re-separated into different tutorials / PRs. See https://github.com/Clay-foundation/model/pull/184 for the cloud cover tutorial.

yellowcap commented 3 months ago

I was able to run the notebook, works nicely now! Only one cell complained, the one with the plot of the similar patches. I get the following error:

ValueError                                Traceback (most recent call last)
Cell In[53], line 41
     39 idx = row["idx"]
     40 # Find the corresponding window based on the idx
---> 41 window_index = idxs_windows["idx"].index(idx)
     42 window_data = idxs_windows["window"][window_index]
     43 # print(window_data.shape)

ValueError: '20180828_4_2' is not in list

I really like the last plot showing the location of the search results. I would have one request there: would it be possible to highlight which one was the input patch? For the user to know what the reference was.

I thought of this because in the plot none of the patches have a ground truth point right in the center. At least the reference chips should be right on top of a quarry ideally.
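The overlap question raised here can be checked numerically rather than by eye. The following is a small sketch under stated assumptions: patch bounds are `(minx, miny, maxx, maxy)` tuples in the same coordinate system as the ground truth points, and the patch names, coordinates, and helper names are made up for illustration.

```python
def contains(bounds, point):
    """True if the point lies inside (or on the edge of) the bounding box."""
    minx, miny, maxx, maxy = bounds
    x, y = point
    return minx <= x <= maxx and miny <= y <= maxy


def patches_containing(patches, point):
    """Return the ids of all patches whose bounds contain the point."""
    return [idx for idx, bounds in patches.items() if contains(bounds, point)]


# Hypothetical patch bounds and one hypothetical ground truth point.
patches = {
    "reference": (0.0, 0.0, 1.0, 1.0),
    "neighbour": (2.0, 2.0, 3.0, 3.0),
}
quarry_point = (0.5, 0.5)
hits = patches_containing(patches, quarry_point)
```

Running this for each ground truth point would show which patches, if any, sit directly on top of a quarry.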

lillythomas commented 3 months ago

Thanks, glad you could reproduce @yellowcap. As for that error you're encountering, we shouldn't have any dates in 2018 in this tutorial (see daterange: dict = ["2021-01-01T00:00:00Z", "2021-12-31T23:59:59Z"]) so maybe just double check data/minicubes is cleared before running the notebook. Other than that, I'll adjust the plot to support that request; I agree it would be useful.
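Independently of clearing stale files, the lookup in the failing cell could be made tolerant of ids that are not present. The sketch below is one possible approach, not what the notebook does: it assumes `idxs_windows` is a dict holding parallel `"idx"` and `"window"` lists, as the traceback suggests, and the 2021 ids, the placeholder window values, and the helper names are hypothetical.

```python
def build_window_lookup(idxs_windows):
    """Map each idx to its window, keeping the first occurrence like list.index()."""
    lookup = {}
    for idx, window in zip(idxs_windows["idx"], idxs_windows["window"]):
        lookup.setdefault(idx, window)
    return lookup


def windows_for(rows, idxs_windows):
    """Split rows into (idx, window) matches and a list of ids with no window."""
    lookup = build_window_lookup(idxs_windows)
    found, missing = [], []
    for row in rows:
        idx = row["idx"]
        if idx in lookup:
            found.append((idx, lookup[idx]))
        else:
            missing.append(idx)  # e.g. an id left over from an earlier run
    return found, missing


# Placeholder data: strings stand in for the real window arrays.
idxs_windows = {
    "idx": ["20210605_0_1", "20210605_0_2"],
    "window": ["window_01", "window_02"],
}
rows = [{"idx": "20210605_0_2"}, {"idx": "20180828_4_2"}]
found, missing = windows_for(rows, idxs_windows)
```

Reporting the `missing` ids instead of raising makes a stale `data/minicubes` directory easier to spot, and the dict lookup avoids a linear `list.index` scan per row.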

lillythomas commented 3 months ago

@yellowcap specifying the reference patch in the final plot was achieved in commit https://github.com/Clay-foundation/model/commit/1f9bdf53f3a3811f08aeb7f205f2d1cfd2df0dc1.

> At least the reference chips should be right on top of a quarry ideally.

The reference patch overlaps directly with a ground truth point. It is the box in yellow here:

[Image: Screen Shot 2024-03-21 at 5 17 41 PM]

yellowcap commented 3 months ago

Looks like the linter is not happy about the size of the notebook.

I would propose to replace the cells with images with links that will make the notebook smaller.

I did this by uploading the images from the cells to the PR or issue, then copying the link and placing it in a markdown cell. The following is an example:

![Minicube visualization](https://github.com/Clay-foundation/model/assets/901647/c6e924e5-6ba1-4924-b99a-df8b90731a5f)

See https://github.com/Clay-foundation/model/blob/7bda73110c3297a7d56b3282a885632438afd936/docs/partial-inputs.ipynb#L185
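The manual steps above could also be scripted. The following is a rough sketch, assuming the standard nbformat v4 JSON layout (a `cells` list whose code cells carry an `outputs` list with MIME-keyed `data`); the function name `replace_image_outputs` and the `links` mapping are hypothetical, the images still have to be uploaded by hand to obtain the URLs, and reading and writing the `.ipynb` file with `json.load` / `json.dump` is left out.

```python
import json


def replace_image_outputs(notebook, links):
    """Drop embedded image outputs and add markdown image links instead.

    links maps a cell position in notebook["cells"] to an (alt_text, url) pair.
    """
    new_cells = []
    for position, cell in enumerate(notebook["cells"]):
        if cell.get("cell_type") == "code":
            cell = dict(cell)  # shallow copy so the input notebook is not mutated
            cell["outputs"] = [
                out
                for out in cell.get("outputs", [])
                if not any(mime.startswith("image/") for mime in out.get("data", {}))
            ]
        new_cells.append(cell)
        if position in links:
            alt_text, url = links[position]
            new_cells.append(
                {
                    "cell_type": "markdown",
                    "metadata": {},
                    "source": [f"![{alt_text}]({url})"],
                }
            )
    slim = dict(notebook)
    slim["cells"] = new_cells
    return slim


# Tiny in-memory notebook with one fake embedded PNG payload.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["plot_minicube()"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["done\n"]},
                {
                    "output_type": "display_data",
                    "metadata": {},
                    "data": {"image/png": "iVBOR" * 1000},
                },
            ],
        }
    ],
}
links = {
    0: (
        "Minicube visualization",
        "https://github.com/Clay-foundation/model/assets/901647/c6e924e5-6ba1-4924-b99a-df8b90731a5f",
    )
}
slim = replace_image_outputs(notebook, links)
is_smaller = len(json.dumps(slim)) < len(json.dumps(notebook))
```

Text outputs are kept, so only the base64 image payloads, which usually dominate notebook size, are removed.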