CivicDataLab / IDS-DRR-Assam

Intelligent Data Solution - Disaster Risk Reduction (IDS-DRR) is a system to assist flood management in the state of Assam through data-driven methods. The repository contains code to extract the relevant datasets and the modelling approach used to calculate Risk Scores for each revenue circle in Assam.
GNU Affero General Public License v3.0

IDS-DRR Data Storage | MASTER DATA INPUT #14

Closed: d-saikrishna closed this issue 7 months ago

d-saikrishna commented 1 year ago

@ArcD7 @ruthvik129

We will be using S3 for data storage, with the structure described in the attachment: IDS-DRR S3 Structure-1.pdf

A few decisions are still to be taken:

  1. Should the Cloud Optimized GeoTIFF (COG) format be used? It will increase the size of the rasters but optimise real-time access to them for users.
  2. What should the resolution of the satellite images be? Should we keep them as available, or resample?
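To put rough numbers on both questions, here is a minimal back-of-the-envelope sketch. It assumes the size increase from COG comes mainly from internal overviews, each at half the resolution of the previous one, and it ignores compression. The tile extent, band count, and data type are hypothetical placeholders, not values from the project.

```python
import math


def raster_size_bytes(width_m, height_m, resolution_m, bands=1, bytes_per_pixel=4):
    """Uncompressed size of a raster covering width_m x height_m at resolution_m."""
    cols = math.ceil(width_m / resolution_m)
    rows = math.ceil(height_m / resolution_m)
    return cols * rows * bands * bytes_per_pixel


def overview_overhead(levels):
    """Extra pixels (as a fraction of the base raster) added by `levels` overviews.

    Each overview halves the resolution, so it holds a quarter of the
    pixels of the level before it.
    """
    return sum(0.25 ** k for k in range(1, levels + 1))


# Hypothetical 100 km x 100 km tile, single band, float32 (4 bytes per pixel).
for res in (10, 30, 100):
    base = raster_size_bytes(100_000, 100_000, res)
    with_overviews = base * (1 + overview_overhead(5))
    print(f"{res:>4} m: {base / 1e6:8.1f} MB raw, {with_overviews / 1e6:8.1f} MB with 5 overviews")
```

Under these assumptions the overviews add at most about one third to the pixel count, while the choice of resolution changes the size quadratically, so the resampling decision is likely to matter more for storage than the COG decision.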

d-saikrishna commented 1 year ago

@manjunathhegdebalgar, can we use this space to create an `upload_to_s3` template for the entire project?
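As a starting point for that discussion, a possible shape for such a template is sketched below. The key layout (`prefix/SOURCE/YYYY/MM/filename`) and the helper `build_s3_key` are hypothetical, since the actual structure is defined in the attached PDF. The `client` argument is assumed to be a boto3 S3 client, or anything else exposing `upload_file(Filename, Bucket, Key)`.

```python
from pathlib import PurePosixPath


def build_s3_key(source, year, month, filename, prefix="IDS-DRR"):
    """Compose an object key under a hypothetical prefix/SOURCE/YYYY/MM layout."""
    return str(PurePosixPath(prefix, source.upper(), f"{year:04d}", f"{month:02d}", filename))


def upload_to_s3(local_path, bucket, key, client):
    """Upload one local file and return its s3:// URI.

    `client` is passed in rather than created here, so that credentials and
    region are configured once per script and the function is easy to test.
    """
    client.upload_file(Filename=str(local_path), Bucket=bucket, Key=key)
    return f"s3://{bucket}/{key}"
```

With boto3 installed, each extraction script could create `boto3.client("s3")` once and pass it to `upload_to_s3` together with a key from `build_s3_key`. The bucket name is left to the caller.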

d-saikrishna commented 1 year ago

The master datasheet for model input is here 🙂 https://github.com/CivicDataLab/IDS-DRR-Data-Sources/blob/main/MASTER_VARIABLES.csv

PS: I did not do any missing-value imputation in this sheet. Let's decide on an imputation method for each column (for tenders, a missing value could be set to 0; the mean could be used for others, etc.).
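To help decide column by column, a quick count of missing cells per column may be useful. The sketch below uses only the standard library. The sample rows and column names (`rc_id`, `tender_amount`, `ndvi`) are made up for illustration and are not taken from MASTER_VARIABLES.csv, and the set of tokens treated as missing is an assumption.

```python
import csv
import io
from collections import Counter


def count_missing(rows, missing_tokens=("", "NA", "NaN")):
    """Count missing cells per column for an iterable of dict rows."""
    counts = Counter()
    for row in rows:
        for column, value in row.items():
            if value is None or value.strip() in missing_tokens:
                counts[column] += 1
    return counts


# Made-up sample; in practice, open the master CSV and pass csv.DictReader(file).
sample = io.StringIO(
    "rc_id,month,tender_amount,ndvi\n"
    "RC1,2023_06,1200,0.41\n"
    "RC1,2023_07,,0.39\n"
    "RC2,2023_06,,\n"
)
missing = count_missing(csv.DictReader(sample))
for column, n in missing.items():
    print(column, n)
```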

PS2: River water level and 'Crop area affected' are not included for this year. Meenu Francis to suggest how the river water level data should be provided at a monthly level. Waiting for the 'Crop area affected' dataset from Nobo.

d-saikrishna commented 1 year ago

Inputs from Meenu about missing data imputation:

1. For government response, I think missing data should be taken as 0.
2. I think damages data should be treated the same way.
3. For NDVI and NDBI, can we do some interpolation or adopt averages?
4. What does "mean_cn" represent?
5. Could you also include (i) land-use characteristics, (ii) area of the RC, (iii) seasonal and permanent water, (iv) inundation variables, and (v) river water level stats?
6. I think the seasonal and permanent water sources also lack information in some cells. We will need to take a call on what to use here.

d-saikrishna commented 12 months ago

Updates:

  1. Added the Landuse, Permanent_water, distance to river, Seasonal_water and area of RC variables to the master datasheet.

  2. Surface runoff and drainage density are still to be added to the master sheet.

  3. Missing data imputation is not yet done for the following variables: antyodaya variables, inundation variables, river_level, landuse variables, and seasonal and permanent water area.

d-saikrishna commented 12 months ago

Missing data imputation

| Source | Imputation method |
| --- | --- |
| BHUVAN | 0 |
| SENTINEL | Take the value of the RC from the previous month |
| TENDERS | 0 |
| FRIMS | 0 |
| FFS | 0 |
| LANDUSE | 0 |
| ANTYODAYA | Fill with the average value of the district in which the RC lies |
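The rules above could be applied along the following lines. This is a minimal standard-library sketch, not the project's actual code. The record keys (`rc`, `district`, `month`), the column names, and the column-to-source mapping are hypothetical, `None` marks a missing value, and months are assumed to be strings that sort chronologically.

```python
from collections import defaultdict
from statistics import mean

ZERO_FILL_SOURCES = {"BHUVAN", "TENDERS", "FRIMS", "FFS", "LANDUSE"}


def impute(records, column_source):
    """Return imputed copies of `records`, sorted by (rc, month).

    column_source maps each variable column to its source name.
    """
    out = [dict(r) for r in sorted(records, key=lambda r: (r["rc"], r["month"]))]
    for column, source in column_source.items():
        if source in ZERO_FILL_SOURCES:
            for r in out:
                if r[column] is None:
                    r[column] = 0
        elif source == "SENTINEL":
            # Carry the previous month's value forward within each RC.
            last = {}
            for r in out:
                if r[column] is None:
                    r[column] = last.get(r["rc"])
                if r[column] is not None:
                    last[r["rc"]] = r[column]
        elif source == "ANTYODAYA":
            # Use the mean of the observed values in the same district.
            by_district = defaultdict(list)
            for r in out:
                if r[column] is not None:
                    by_district[r["district"]].append(r[column])
            for r in out:
                if r[column] is None and by_district[r["district"]]:
                    r[column] = mean(by_district[r["district"]])
    return out


records = [
    {"rc": "RC1", "district": "D1", "month": "2023-06", "tender_amount": None, "ndvi": 0.40, "households": 100},
    {"rc": "RC1", "district": "D1", "month": "2023-07", "tender_amount": 500, "ndvi": None, "households": 100},
    {"rc": "RC2", "district": "D1", "month": "2023-06", "tender_amount": None, "ndvi": 0.30, "households": None},
]
sources = {"tender_amount": "TENDERS", "ndvi": "SENTINEL", "households": "ANTYODAYA"}
filled = impute(records, sources)
```

In this sketch a SENTINEL value stays missing when an RC has no earlier month to copy from, and an ANTYODAYA value stays missing when the whole district is unobserved. Those cases would still need a decision.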

d-saikrishna commented 12 months ago

Updated with the crop area and infrastructure damages data shared by FRIMS [last date: 31 Aug 2023].

d-saikrishna commented 9 months ago

New model: the master data will change.

d-saikrishna commented 7 months ago

The master data CSV is finalised, along with a data dictionary.