ai4er-cdt / WildfireDistribution

AI4EO GTC 2021/2. Private repository for group 1: determining wildfire distribution in visible remote sensing imagery.
MIT License

Final week task list #27

Open sofstef opened 2 years ago

sofstef commented 2 years ago

Here's the list of tasks left to do by Friday (a few could be left for later). I suggest everyone writes which ones they're taking on and then I'll add a name next to the task and you can tick it when done. Use the comments to extend the list as needed and I can keep editing this issue.

Model training

Model evaluation

FAIR tasks:

Repository organising

Report writing

graceebc9 commented 2 years ago

I'll note down some next steps for the reports here as I go

Figures for the appendix / data sources: @ThomasDodd97 please could you add a legend to the landcover plot you created with the mosaic and save that image to GitHub under report / figures?

UPDATE - I've pulled a figure showing the landcover classes from the landcover classification report, so no need to create this :)

graceebc9 commented 2 years ago

@Hamish-Cam - please could you generate one of the MODIS fire map plots without the blue dotted box? I've deleted the folder with the MODIS data so I can't rerun your script.

Hamish-Cam commented 2 years ago

Model Training

  1. num_workers affects how quickly the CPU loads data for the GPU, so its value only affects the speed/efficiency of GPU training: https://chtalhaanwar.medium.com/pytorch-num-workers-a-tip-for-speedy-training-ed127d825db7. Given that the threading error we are seeing is in the rasterio library, and so is not easily fixable on our side, I suggest we continue to take the speed hit, since it means we can run bug-free.
  2. Martin agrees that for now burn_prop=1 for all cases. Since this isn't necessarily a long-term solution, I won't alter the code, so that we keep the flexibility to change it if we wish.
  3. Similarly, we are interested in the ability of our model to predict fires, not lack of fires. Therefore, I believe we should be using the same balanced sampler for val/test as for training (Martin agrees). As such I have pushed changes to remove the grid_sampler option (which was causing issues anyway) and instead use the constrained sampler for val/test when balance_samples = True.
Hamish-Cam commented 2 years ago

Martin has also suggested trying to overfit our model to one/two samples (by repeating training on these) to see if it can train to predict these fires.

He has also suggested we try a 'class balance cross entropy loss function' which would penalise the non-prediction of a fire more than the prediction of one when there is no fire. This may help with our model just predicting no fires.
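
For reference, a minimal pure-Python sketch of what a class-balanced (weighted) binary cross-entropy could look like. The function name `weighted_bce` and the `pos_weight` value of 10 are illustrative assumptions, not something from our codebase; the idea is only that errors on fire pixels are multiplied by a weight greater than 1.

```python
import math


def weighted_bce(probs, targets, pos_weight=1.0, eps=1e-7):
    """Mean binary cross-entropy where the fire (positive) term is scaled by pos_weight.

    probs:   predicted fire probabilities, one per pixel (floats in [0, 1])
    targets: ground-truth labels, 1 = fire, 0 = no fire
    """
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


# Hypothetical weight: a missed fire costs 10x more than an equally confident false alarm.
missed_fire = weighted_bce([0.1], [1], pos_weight=10.0)   # fire pixel predicted as 0.1
false_alarm = weighted_bce([0.9], [0], pos_weight=10.0)   # empty pixel predicted as 0.9
print(missed_fire > false_alarm)
```

In PyTorch the same effect is available through the `pos_weight` argument of `torch.nn.BCEWithLogitsLoss`, so we would probably not need to hand-roll this.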

sofstef commented 2 years ago

@Hamish-Cam re choice of loss: we are currently using Jaccard loss, which is suited to our type of problem where very few pixels contain fire. I have also added an option to use focal Tversky loss, which is essentially a generalisation of Jaccard: it lets us choose weight parameters to penalise false negatives more, and it has an additional parameter which can be used to force the network to focus on pixels where it's struggling to make predictions. Will try the Tversky loss out now and push the code so you can use it in Colab!
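
To make the relationship to Jaccard concrete, here is a rough pure-Python sketch of a focal Tversky loss. This is not the code being pushed; the names `fn_weight`, `fp_weight` and `gamma` are assumptions for illustration, and libraries differ on which symbol weights false negatives and on how the focal exponent is defined.

```python
def focal_tversky_loss(probs, targets, fn_weight=0.7, fp_weight=0.3, gamma=1.0, eps=1e-7):
    """Focal Tversky loss over flat lists of per-pixel probabilities and 0/1 labels.

    Tversky index = TP / (TP + fn_weight * FN + fp_weight * FP).
    With fn_weight = fp_weight = 1 this is the Jaccard index (IoU).
    The loss is (1 - index) ** gamma; this exponent convention is an assumption.
    """
    tp = sum(p * t for p, t in zip(probs, targets))        # soft true positives
    fn = sum((1 - p) * t for p, t in zip(probs, targets))  # soft false negatives
    fp = sum(p * (1 - t) for p, t in zip(probs, targets))  # soft false positives
    tversky = (tp + eps) / (tp + fn_weight * fn + fp_weight * fp + eps)
    return (1.0 - tversky) ** gamma


targets = [1, 1, 0, 0]
missed_one_fire = [1, 0, 0, 0]  # one of the two fire pixels is missed

# Equal weights of 1 reproduce the Jaccard loss.
jaccard_loss = focal_tversky_loss([1, 0, 1, 0], targets, fn_weight=1.0, fp_weight=1.0)

# The same miss costs more when false negatives carry the larger weight.
fn_heavy = focal_tversky_loss(missed_one_fire, targets, fn_weight=0.7, fp_weight=0.3)
fp_heavy = focal_tversky_loss(missed_one_fire, targets, fn_weight=0.3, fp_weight=0.7)
print(jaccard_loss, fn_heavy, fp_heavy)
```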

Hamish-Cam commented 2 years ago

Brill, sounds like this is pretty well covered then. Trying to overfit might still be a good test to run. Thanks!

graceebc9 commented 2 years ago

Proposed for who writes what:

Intro / lit review - Thomas
Datasets - Grace
Methodology - Hamish or Sofija
Results - Hamish or Sofija
Conclusion & further - Grace

graceebc9 commented 2 years ago

Had an issue with training on Sentinel. Note the advice I was given: "This seems related to the following bug reports. Basically, the UNet that comes with SMP requires images with patch_size divisible by 32. Can you try switching from 250 to 256 and see if that solves your issue?" Switching to 256 solved the issue.
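
A small helper along these lines could catch this before training starts. It is only a sketch: the function name `next_valid_patch_size` is made up here, and the factor of 32 is taken from the advice above (presumably because the default U-Net encoder halves the resolution five times, 2^5 = 32).

```python
def next_valid_patch_size(size, multiple=32):
    """Round size up to the nearest multiple (32 per the advice above)."""
    if size <= 0:
        raise ValueError("patch size must be positive")
    return ((size + multiple - 1) // multiple) * multiple


requested = 250
patch_size = next_valid_patch_size(requested)
if patch_size != requested:
    print(f"patch_size {requested} is not divisible by 32, using {patch_size} instead")
```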