Open raspstephan opened 5 years ago
I see a benefit of using infrared data in the following cases:
Identification of categorically wrong labels: The infrared images give us information about the cloud top height (CTH). Since we are only interested in shallow convection, and none of our patterns should reach higher than, say, 7 km, we could exclude labels that contain a substantial amount of high cloud. To be on the safe side, I would actually exclude all images with high clouds. These IR images come at no cost for us, as they are also captured by MODIS and can be downloaded easily from Worldview with the download script. (I actually did some work on this already... I'll push the branch later.) If downloaded at the same resolution, the labels can be used in exactly the same way.
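A rough sketch of what such a filter could look like, assuming the IR data is available as a 2D array of brightness temperatures in Kelvin on the same pixel grid as the labelled image, and that a label is a bounding box `(x, y, width, height)` in pixels. The function names, the box format and the 250 K threshold are hypothetical and only for illustration; the threshold in particular would have to be tuned to whatever brightness temperature corresponds to clouds above roughly 7 km in our region.

```python
import numpy as np


def high_cloud_fraction(bt_kelvin, box, bt_threshold=250.0):
    """Fraction of pixels inside `box` that are colder than `bt_threshold`.

    bt_kelvin    : 2D array of brightness temperatures (K), assumed to be on
                   the same pixel grid as the labelled visible image.
    box          : (x, y, width, height) in pixels (hypothetical label format).
    bt_threshold : illustrative value only, not a calibrated cut-off.
    """
    x, y, w, h = box
    patch = bt_kelvin[y:y + h, x:x + w]
    if patch.size == 0:
        return 0.0
    # Cold pixels are taken as a proxy for high cloud tops.
    return float(np.mean(patch < bt_threshold))


def keep_label(bt_kelvin, box, bt_threshold=250.0, max_fraction=0.0):
    """Keep a label only if its high-cloud fraction is at most `max_fraction`.

    max_fraction=0.0 is the strict "safe side" variant: any high cloud
    inside the box rejects the label.
    """
    return high_cloud_fraction(bt_kelvin, box, bt_threshold) <= max_fraction


# Synthetic scene: warm background with a cold (high-cloud) corner.
bt = np.full((100, 100), 290.0)
bt[0:20, 0:20] = 220.0

clean_box = (50, 50, 30, 30)         # no overlap with the cold corner
contaminated_box = (10, 10, 20, 20)  # a quarter of it overlaps the cold corner

print(keep_label(bt, clean_box))         # True
print(keep_label(bt, contaminated_box))  # False
```

Raising `max_fraction` would give the softer variant that only rejects labels with a substantial amount of high cloud.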
Expanding our scope: Humans are used to identifying clouds as white patches in the sky, so labelling cloud patterns on satellite images is much easier for them in the visible channel. However, I would argue that for our purpose of detecting shallow clouds (and excluding high ones), the infrared channel is the better option, as long as the resolution is sufficient. By training the model on brightness temperature, my hope is that the trained model could be applied directly to a different dataset to expand our scope. Currently I have two in mind:
On the MODIS dataset, too, it would have the advantage that we could classify at night, when the visible channel is black, which could strengthen an argument about the annual cycle.
Note on the resolution: The resolution of the MODIS visible data is up to 250 m, while the infrared channels have only 1 km. However, since we downloaded the images at roughly 100 px per degree, we only had a resolution of about 1 km anyway, and that was sufficient to distinguish between these patterns. In addition, for the smaller features like sugar, one could imagine that the bigger surrounding clusters characterise the pattern rather than the small clouds themselves. In short: I'm optimistic that IR data should work as well.
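For reference, the "about 1 km" figure follows from simple arithmetic. The sketch below assumes the downloaded images are on a regular latitude/longitude grid and uses roughly 111.2 km per degree of latitude; the function name and the 15°N example latitude are made up for illustration.

```python
import math

KM_PER_DEG_LAT = 111.2  # approximate length of one degree of latitude


def pixel_size_km(pixels_per_degree, latitude_deg=0.0):
    """Approximate pixel size (north-south, east-west) in km.

    Assumes a regular lat/lon grid, so the east-west size shrinks with
    the cosine of the latitude.
    """
    north_south = KM_PER_DEG_LAT / pixels_per_degree
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


ns, ew = pixel_size_km(100, latitude_deg=15.0)
print(f"north-south: {ns:.2f} km, east-west: {ew:.2f} km")
```

At 100 px per degree this gives a pixel size of roughly 1.1 km, i.e. about the same as the native 1 km of the IR channels, so the downloaded visible images carry no finer detail than the IR data would.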
I just realised over the weekend that the MODIS dataset also goes back to 2000 (Terra) / 2002 (Aqua). So there is also some room for expansion in VIS and IR images.
@observingClouds You mentioned that we could also use a different infrared dataset to train the ML model on. Which dataset were you thinking of? How long are these data available for? Several decades? Could we do a climate change study with that?