Closed weiji14 closed 5 months ago
Good idea, but not sure if this is still relevant. We won't always have a way to assess cloud cover in this predict step. @srmsoumya is this code still operational?
We don't have a predict_step in v1. We should add a script instead, one that takes maybe a tile as input and creates embeddings for a chip size defined by the user.
This could scale, and we could add AWS Batch scripts for it.
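A rough sketch of what such a script's core could look like, assuming the tile is already loaded as a NumPy array of shape (bands, height, width). The names split_into_chips and embed_tile are hypothetical, and encoder stands in for whatever the v1 model exposes for producing an embedding; none of this is taken from the Clay codebase.

```python
from typing import Callable

import numpy as np


def split_into_chips(tile: np.ndarray, chip_size: int) -> np.ndarray:
    """Cut a (bands, height, width) tile into non-overlapping square chips.

    Edge pixels that do not fill a whole chip are dropped.
    Returns an array of shape (n_chips, bands, chip_size, chip_size),
    ordered row by row across the tile.
    """
    bands, height, width = tile.shape
    rows, cols = height // chip_size, width // chip_size
    if rows == 0 or cols == 0:
        raise ValueError("chip_size is larger than the tile")
    cropped = tile[:, : rows * chip_size, : cols * chip_size]
    # (bands, rows, chip, cols, chip) -> (rows, cols, bands, chip, chip)
    grid = cropped.reshape(bands, rows, chip_size, cols, chip_size)
    grid = grid.transpose(1, 3, 0, 2, 4)
    return grid.reshape(rows * cols, bands, chip_size, chip_size)


def embed_tile(
    tile: np.ndarray,
    chip_size: int,
    encoder: Callable[[np.ndarray], np.ndarray],
) -> np.ndarray:
    """Run a (hypothetical) encoder on every chip and stack the results."""
    chips = split_into_chips(tile, chip_size)
    return np.stack([encoder(chip) for chip in chips])
```

Each tile is processed independently here, which is what would make it easy to fan out as one batch job per tile.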
ok in that case closing here, let's revisit when doing prediction scripts
Idea that came up during our regular meetings, on generalizing the cloud-cover percentage patch-level info (i.e. extending #168) to other bands/channels, so that someone could apply other filters based on certain columns with some statistics (mean, count, min/max, percentage, etc.) derived from the input images. This would enable pre-filtering based on attributes when performing similarity search.
Example:
This would involve generalizing the inference part of the code somehow, specifically the predict_step function here: https://github.com/Clay-foundation/model/blob/0145e55bcf6bd3e9b19f5c07819a1398b6a22c35/src/model_clay.py#L855-L921
Some changes might also need to happen on the DataLoader side, so that these statistical measures are passed through. Parking this as an idea for now.
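
To make the idea a bit more concrete, here is a minimal sketch of per-band statistics stored as flat columns and then used to pre-filter candidates before a cosine-similarity search. It assumes chips are NumPy arrays of shape (bands, height, width) and that embeddings already exist as a 2D array. The function names, column naming scheme (e.g. red_mean) and the plain-dict row format are all hypothetical, not what predict_step currently outputs.

```python
from typing import Callable

import numpy as np


def band_statistics(chip: np.ndarray, band_names: list[str]) -> dict[str, float]:
    """Summarize one (bands, height, width) chip as flat attribute columns."""
    stats: dict[str, float] = {}
    for name, band in zip(band_names, chip):
        valid = band[np.isfinite(band)]  # skip NaN/inf pixels (e.g. nodata)
        stats[f"{name}_count"] = int(valid.size)
        stats[f"{name}_valid_pct"] = 100.0 * valid.size / band.size
        if valid.size:
            stats[f"{name}_mean"] = float(valid.mean())
            stats[f"{name}_min"] = float(valid.min())
            stats[f"{name}_max"] = float(valid.max())
        else:
            stats[f"{name}_mean"] = float("nan")
            stats[f"{name}_min"] = float("nan")
            stats[f"{name}_max"] = float("nan")
    return stats


def filtered_similarity_search(
    query: np.ndarray,
    embeddings: np.ndarray,
    rows: list[dict[str, float]],
    predicate: Callable[[dict[str, float]], bool],
    k: int = 3,
) -> list[tuple[int, float]]:
    """Keep only rows passing the predicate, then rank them by cosine similarity.

    Returns (original_index, similarity) pairs, best match first.
    """
    keep = np.array([i for i, row in enumerate(rows) if predicate(row)], dtype=int)
    if keep.size == 0:
        return []
    candidates = embeddings[keep]
    norms = np.linalg.norm(candidates, axis=1) * np.linalg.norm(query)
    sims = candidates @ query / np.maximum(norms, 1e-12)  # guard zero vectors
    order = np.argsort(-sims)[:k]
    return [(int(keep[i]), float(sims[i])) for i in order]
```

A filter would then be an ordinary function over the attribute columns, e.g. `lambda row: row["red_mean"] > 0.3`. In practice these columns would more likely live next to the embeddings in a table or vector database, so the filtering could be pushed down into the query.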