NASA-IMPACT / pixel-detector

Pixel detector that uses shapefiles to generate the truth set.

Story: Improve model by improving data collection. #6

Open muthukumaranR opened 5 years ago

muthukumaranR commented 5 years ago

To improve the model, we need to make data collection faster. The bottleneck in the training data generation pipeline is the rasterization of files. The current approach consists of the following steps:

  1. Rasterize the full-disk image using QGIS (15-20 mins).
  2. Load the shapefiles from the HMS database into QGIS.
  3. Locate where the plumes are coming from.
  4. Draw a new shapefile or modify the existing shapefile.
  5. Export the shapefile.

Using QGIS for this is not optimal because it cannot process a full-disk raw file in memory. As a result, every scroll through the data may cause the software to reload the data from disk, which can slow down the labelling process.

To overcome this:

  1. Use the information in the NOAA shapefiles to narrow down the approximate extent of a smoke plume.
  2. Use those coordinates to crop and warp (using GDAL calls) a smaller section of the raw file into a GeoTIFF raster and store it on disk. (Band 1 and Band 3 will be used for this.)
  3. Redraw the shapefiles and build the dataset in QGIS using the generated GeoTIFF. This should be much faster than loading a raw full-disk file.
  4. Retrain the model.
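
Steps 1 and 2 above could be sketched roughly as follows. This is only an illustration, not the repository's code: the function names, the padding value, the file names, and the choice of EPSG:4326 as the target projection are all assumptions. It pads a plume's bounding box (as read from a shapefile) and builds a `gdalwarp` command that crops and warps just that window into a GeoTIFF.

```python
from typing import List, Tuple

# (xmin, ymin, xmax, ymax) in degrees of longitude/latitude
Extent = Tuple[float, float, float, float]


def padded_extent(bounds: Extent, pad: float = 1.0) -> Extent:
    """Grow a plume bounding box by `pad` degrees on each side.

    The result is clamped to valid lon/lat limits. The default padding
    is an arbitrary, assumed value.
    """
    xmin, ymin, xmax, ymax = bounds
    return (
        max(xmin - pad, -180.0),
        max(ymin - pad, -90.0),
        min(xmax + pad, 180.0),
        min(ymax + pad, 90.0),
    )


def gdalwarp_args(src: str, dst: str, extent: Extent,
                  srs: str = "EPSG:4326") -> List[str]:
    """Build a gdalwarp argument list that crops and warps `src`.

    -t_srs sets the target projection, -te gives the output window
    (xmin ymin xmax ymax, in the target projection), and -of GTiff
    writes a GeoTIFF. The choice of EPSG:4326 is an assumption.
    """
    xmin, ymin, xmax, ymax = extent
    return [
        "gdalwarp",
        "-t_srs", srs,
        "-te", str(xmin), str(ymin), str(xmax), str(ymax),
        "-of", "GTiff",
        src, dst,
    ]


if __name__ == "__main__":
    # Hypothetical plume bounds and file names, for illustration only.
    plume_bounds = (-120.5, 35.0, -118.0, 37.25)
    window = padded_extent(plume_bounds, pad=0.5)
    cmd = gdalwarp_args("band1_fulldisk_raw", "band1_plume.tif", window)
    print(" ".join(cmd))
```

With GDAL installed, the list can be passed to `subprocess.run(cmd, check=True)`, once per band (Band 1 and Band 3). The resulting small GeoTIFFs are what would then be loaded into QGIS for labelling.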
muthukumaranR commented 5 years ago

Issues Referenced: