SimonFisher92 / Scottish_Snow


GT collection tool via SageMaker #35

Open ipoole opened 6 months ago

ipoole commented 6 months ago

It is desirable to collect snow patch GT, independent of the SCL labels, and at 10 m resolution. AWS SageMaker appears to have the capability. This issue concerns creation of the tool/environment only. (Other approaches to a GT tool may well be appropriate; I suggest they have a separate issue.)

ipoole commented 6 months ago

A crude demo has been created on my AWS account. A few .png patches have been uploaded to S3, here.

A SageMaker labelling job has been created here.
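For anyone reproducing this setup: a Ground Truth labelling job reads its inputs from a manifest file in JSON Lines format, where each line is an object whose `source-ref` key points at one S3 object. The following is only a minimal sketch of building such a manifest for the uploaded `.png` patches. The bucket name and keys are hypothetical placeholders, not the ones used in the demo, and the helper `build_manifest` is invented for illustration.

```python
import json


def build_manifest(bucket, keys):
    """Return manifest text with one JSON object per line (JSON Lines).

    `bucket` and `keys` are hypothetical inputs; in practice the keys
    would come from listing the S3 prefix that holds the patches.
    """
    lines = []
    for key in keys:
        # Only image patches are labelled; skip anything else under the prefix.
        if not key.lower().endswith(".png"):
            continue
        lines.append(json.dumps({"source-ref": f"s3://{bucket}/{key}"}))
    return "".join(line + "\n" for line in lines)


# Placeholder bucket and keys, for illustration only.
manifest = build_manifest(
    "example-snow-bucket",
    ["patches/a.png", "patches/b.png", "patches/readme.txt"],
)
print(manifest, end="")
```

The resulting text would be saved as a manifest file, uploaded to S3 alongside the patches, and selected as the input dataset when creating the labelling job.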

ipoole commented 6 months ago

Questions and next steps

To move forward to a real SageMaker labelling job, we need to consider

SimonFisher92 commented 5 months ago

Excellent questions, Ian. @murraycutforth, please chip in with your replies, but here are my thoughts:

1) I've thought about this a lot: do we take certain patch(es) for GT, or do we sample across patches? I'd be much more comfortable sampling across patches, taking, say, 50% of one year for each patch. What do you think?

2) Excellent question. I would say 50+% snow is not accepted (picking an arbitrary number). Of course, having looked at this data in the past, it's certainly not uncommon to have 100% cover in May, but it melts quickly.

3) I think this doubles the work for less than double the reward, so no, from me at least.

4) Good point. I do not have an opinion on this, but I'd say the closest to what is shown on the Copernicus image browser?

5) Another excellent point. I'd say if they can make it out and are confident, label it. Sometimes the flyovers are so precious that it makes sense to try to get the model to predict through thin cloud (we need annotation for this).

SimonFisher92 commented 5 months ago

Disregard my point on 1). We would need to take entire years in order to get enough data for training.