GT collection tool via SageMaker

ipoole commented 10 months ago

It is desirable to collect snow patch ground truth (GT), independent of the SCL labels, and at 10m resolution. AWS SageMaker appears to have the capability. This issue concerns creation of the tool/environment only. (Other approaches to a GT tool may well be appropriate; I suggest they have a separate issue.)

ipoole commented 10 months ago

A crude demo has been created on my AWS account. A few .png patches have been uploaded to S3, here.

A SageMaker labelling job has been created here.
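
For anyone reproducing the demo: SageMaker Ground Truth reads its input from a manifest file in JSON Lines format, where each line is a JSON object whose `source-ref` key points at one S3 object. The sketch below builds such a manifest for a set of .png patches. The function name, bucket, prefix and file names are placeholders for illustration, not the ones used in the demo above.

```python
import json


def build_manifest(bucket, prefix, filenames):
    """Return JSON Lines manifest text with one `source-ref` entry per .png patch."""
    lines = []
    for name in sorted(filenames):
        if not name.lower().endswith(".png"):
            continue  # skip anything that is not a PNG patch
        uri = f"s3://{bucket}/{prefix.strip('/')}/{name}"
        lines.append(json.dumps({"source-ref": uri}))
    # One JSON object per line; an empty input gives an empty manifest.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # Hypothetical bucket, prefix and file names.
    print(build_manifest("example-bucket", "patches",
                         ["patch_001.png", "patch_002.png"]), end="")
```

The resulting text would be saved as a `.manifest` file and uploaded to S3 alongside the patches before creating the labelling job.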

ipoole commented 10 months ago

Questions and next steps

To move forward to a real SageMaker labelling job, we need to consider:

1) What patches should we prioritise?
2) What maximum level of snow cover should we accept?
3) Should we explicitly classify cloud, as an additional label?
4) Assuming we show the RGB bands as the basis for labelling, how should we set contrast, saturation etc?
5) What is our guidance to labellers where an area is partially obscured by thin cloud? What level of certainty do we expect?
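
On the contrast question, one common option (an assumption here, not something decided in this thread) is a per-band percentile stretch: clip each band to its 2nd and 98th percentiles, then rescale to 0-255 before writing the .png. A minimal pure-Python sketch, with hypothetical function names and a flat list of pixel values standing in for one band:

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in 0..100) of an already sorted list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def stretch_band(values, low_q=2.0, high_q=98.0):
    """Clip one band to its [low_q, high_q] percentiles and rescale to 0..255."""
    ordered = sorted(values)
    lo = percentile(ordered, low_q)
    hi = percentile(ordered, high_q)
    if hi <= lo:
        return [0 for _ in values]  # flat band: nothing to stretch
    out = []
    for v in values:
        clipped = min(max(v, lo), hi)
        out.append(round(255 * (clipped - lo) / (hi - lo)))
    return out
```

Applying the same percentiles to every patch, rather than computing them per patch, would keep brightness comparable between images; which of the two is preferable for labellers is an open choice.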

SimonFisher92 commented 9 months ago

Excellent questions, Ian. @murraycutforth please chip in with your replies, but here are my thoughts:

1) I've thought about this a lot: do we take a certain patch(es) for GT, or do we sample across patches? I'd be much more comfortable sampling across patches, taking, say, 50% of one year for each patch. What do you think?

2) Excellent question. I would say 50+% snow cover is not accepted (picking an arbitrary number). Of course, having looked at this data in the past, it's certainly not uncommon to have 100% cover in May, but it melts quickly.

3) I think this doubles the work for less than double the reward, so no, from me at least.

4) Good point. I do not have an opinion on this, but I'd say the closest to what is shown on the Copernicus image browser?

5) Another excellent point. I'd say, if they can make it out and are confident, label it. Sometimes the flyovers are so precious that it makes sense to try to get the model to predict through thin cloud (we need annotation for this).
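
To make the cut-off in point 2 concrete, here is a minimal sketch of such a filter. It assumes each patch comes with some rough prior snow mask (where that mask would come from is not settled in this thread), and the function names and the 0.5 default are illustrative only.

```python
def snow_fraction(mask):
    """Fraction of pixels flagged as snow in a 2D mask of 0/1 (or False/True) values."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("empty mask")
    return sum(1 for row in mask for px in row if px) / total


def select_patches(masks, max_fraction=0.5):
    """Return ids of patches whose snow fraction is strictly below max_fraction.

    `masks` maps a patch id to its 2D snow mask. Patches at or above the
    threshold are rejected, matching the "50+% is not accepted" reading.
    """
    return [pid for pid, mask in masks.items()
            if snow_fraction(mask) < max_fraction]
```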

SimonFisher92 commented 9 months ago

Disregard my point on 1). We would need to take entire years in order to get enough data for training.

SimonFisher92 / Scottish_Snow

GT collection tool via SageMaker #35
