Earth-Information-System / veda-data-processing

Scripts for data downloading, transformation, and related processing for the VEDA project.
Apache License 2.0

FWI Visualization #11

Open · paridhi-parajuli opened this issue 1 year ago

paridhi-parajuli commented 1 year ago
  1. Use Panel parameterized objects (above) to optimize existing FWI dashboards.
  2. Use existing fire detections and plot the corresponding FWI time series and chiclet plots.
    • Read this using geopandas: s3://veda-data-store-staging/EIS/other/feds-output-conus/latest/perim-large.fgb – done
    • Pick a fire from here (just select one row).
    • Draw a 5 km buffer around this fire (using GeoPandas).
    • Use that buffer as input to the FWI time series and chiclet plot (a minimal sketch of this workflow follows this list).
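
A minimal sketch of this workflow, assuming the FlatGeobuf can be read directly from S3 (fsspec/s3fs or GDAL's S3 support configured) and that a projected CRS is used for a true 5 km buffer; the choice of EPSG:5070 is an assumption:

```python
import geopandas as gpd

# Read the latest large-fire perimeters (FlatGeobuf) directly from S3; if the
# S3 read is not configured, download the file locally and read that path.
perims = gpd.read_file(
    "s3://veda-data-store-staging/EIS/other/feds-output-conus/latest/perim-large.fgb"
)

# Pick a single fire -- here just the first row, for illustration.
fire = perims.iloc[[0]]

# A true 5 km buffer needs a projected CRS; buffering in EPSG:4326 would be in
# degrees. EPSG:5070 (CONUS Albers, meters) is an assumed choice of projection.
buffer_5km = fire.to_crs(epsg=5070).buffer(5000).to_crs(epsg=4326)

# Exterior coordinates of the buffer (when it is a single Polygon), e.g. as the
# area-of-interest input to the FWI time-series and chiclet-plot code.
coords = list(buffer_5km.iloc[0].exterior.coords)
```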
paridhi-parajuli commented 1 year ago

Feb 17, 2023

  1. Panel Parameterized objects created.
  2. Having issues with the file upload functionality. Punt for now. Create a separate static notebook where the user enters the file name as a variable at the top.
  3. Working to do this without dataframe creation. Will it make it faster? Probably! Keep working on this.
  4. Created a buffer polygon around the centroid. Good. Let's try a buffer around the geometry itself (not the centroid).
  5. Having issues with the distance to lat-lon conversion. This is just a warning: the buffer is in the units of the projection, and for lat-lon data the unit is degrees, which is not a true distance unit. But a buffer of, e.g., 0.5 degrees is fine.
  6. No variable in the lis-tws-trend data. This is a DataArray already, so it can be accessed with data.values; e.g., data[0,0,0:3,0:3].values retrieves the first few pixels. …but there are issues with the S3 read, not your code. Investigate with Slesa/Iksha/etc. More generally, stackstac produces xarray DataArrays; they are accessible via data.values, subset via data[...], etc.
  7. Need to work for multipolygons. Simple workaround: .geometry.convex_hull (recall – need to do .geometry.convex_hull.exterior.coords). Ideally, we need a better solution – maybe some kind of union. (A sketch of items 4–7 follows this list.)
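
A minimal sketch of items 4–7, assuming `fires` is the perimeter GeoDataFrame read earlier and `data` is the xarray DataArray returned by the stackstac / lis-tws-trend read:

```python
# Items 4-5: buffer the geometry itself rather than its centroid. In EPSG:4326
# the buffer distance is in degrees (not a true distance), hence 0.5 here.
fire = fires.iloc[[0]]
buffered = fire.geometry.buffer(0.5)

# Item 7: MultiPolygon workaround -- take the convex hull so .exterior exists.
hull = buffered.convex_hull
coords = list(hull.iloc[0].exterior.coords)

# A possible "better solution" hinted at above: merge the parts with a union
# instead of a convex hull (an assumption, not something decided in the notes).
merged = buffered.unary_union

# Item 6: stackstac (and the lis-tws-trend read) return xarray DataArrays, not
# Datasets -- access the values directly, e.g. the first few pixels.
first_pixels = data[0, 0, 0:3, 0:3].values
```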
paridhi-parajuli commented 1 year ago

Feb 20, 2023

  1. FWI analysis
  2. STAC
paridhi-parajuli commented 1 year ago

2023-02-24

  1. FWI analysis
  2. STAC

For next week

  1. FWI analysis
  2. STAC
paridhi-parajuli commented 1 year ago

For next week

  1. Email Katrina Sharonin (GSFC-DK000, intern on EIS-Fire) to set up a meeting, discuss ongoing tasks, and identify tasks where you can help. Report back to me.

  2. Analysis of ESDIS metrics. Alexey is in the process of downloading the data from the (closed) ESDIS metrics service. Look for the data provided in the doc. Several datasets:

    • archive-size-by-product-totals – Total volume of each data product
    • data-products — Additional information on each data product, including science discipline, etc. Useful for merging with other datasets.
    • total-distribution-by-product-2022 — Total data downloads by product and distribution mechanism in 2022. Note that some products have multiple distribution mechanisms.
    • total-distribution-by-user-2022 — Total data downloads by user and product in 2022. Since this contains some mildly sensitive user information (emails), I password-protected it – see my Slack message for the password.
    • Some questions to address:
    • Total archive data volume by science discipline
    • Distribution, and cumulative distribution, of data downloads by volume. E.g., How many datasets account for the top 95% of data downloads?
    • What were the top 100 data products distributed in 2022 by volume? By number of unique users? What are the similarities and differences between these top 100 lists – e.g., which products appear in these lists regardless of how you count? Are there any products that are especially popular in terms of number of users but not in terms of data volume? Vice versa?
    • Which providers (DAACs) distributed these datasets? What data formats are these datasets distributed in? What data services are available for these datasets (this may require some separate browsing of the dataset websites)?
    • What is the distribution of data users? E.g., How many users account for the top 95% of data downloads?
    • Download volume by user discipline?
    • What were the most popular data download mechanisms by volume in 2022?
    • Archive size and distribution volume by product level (Level 1, Level 2, Level 3, etc.).
    • Report all of these results in a Jupyter notebook shared via GitHub (a minimal sketch of the first two questions follows this list).
  3. STAC – once Slesa figures this out, come back to this task.
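
A minimal sketch of the first two questions, assuming the metrics exports are CSV files; every file and column name below is an assumption about the real schema:

```python
import pandas as pd

# Hypothetical file and column names -- the real exports come from the shared doc.
archive = pd.read_csv("archive-size-by-product-totals.csv")   # product, archive_volume
products = pd.read_csv("data-products.csv")                   # product, discipline, ...
dist = pd.read_csv("total-distribution-by-product-2022.csv")  # product, mechanism, volume

# Total archive volume by science discipline.
by_discipline = (
    archive.merge(products, on="product")
    .groupby("discipline")["archive_volume"]
    .sum()
    .sort_values(ascending=False)
)

# Cumulative distribution of downloads by volume: how many products account
# for the top 95% of 2022 download volume?
per_product = dist.groupby("product")["volume"].sum().sort_values(ascending=False)
cum_frac = per_product.cumsum() / per_product.sum()
n_top95 = int((cum_frac < 0.95).sum()) + 1

# The top-100-by-volume vs. top-100-by-unique-users comparison follows the same
# pattern using total-distribution-by-user-2022 (user column name assumed there).
top100_by_volume = set(per_product.head(100).index)
```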

paridhi-parajuli commented 1 year ago

For next week

  1. Metrics data analysis
  2. New project: Working with HDF-EOS data
paridhi-parajuli commented 1 year ago

Newly added to notebook:

paridhi-parajuli commented 1 year ago

For next week: Modify the code to create parquet files using geopandas directly (see the sketch after this list).

  • Create a pandas data frame.
  • Add the observation timestep (date + time) as a time column to the dataset. Ideally, we want the exact timestep of each pixel…but only if we can find it.
  • Convert to a geopandas GeoDataFrame, with the lat/lon columns converted to a geometry column (and set the CRS to EPSG:4326).
  • Try working with the new geoparquet files in geopandas: read with gpd.read_parquet. Subset by an arbitrary polygon – (1) identify an arbitrary polygon that's inside the MODIS image; (2) create it as a geopandas / shapely object; (3) crop the MODIS GeoDataFrame to the object from (2).
  • Create parquet files for 3-5 adjacent MODIS tiles. Try reading and subsetting multiple parquet files at once using geopandas. NetCDF analog – xr.open_mfdataset("dat*.nc"). Try to do something similar with Parquet. GOAL: work with 3-5 adjacent MODIS tiles as one continuous dataset.
  • Try doing some basic subsetting of files using Arrow (Reading and Writing the Apache Parquet Format — Apache Arrow v11.0.0). E.g., try grabbing all pixels with reflectance above a certain value. Look for ways to do spatial subsetting with Arrow.
  • Chat with Denis Tuesday 1pm CT / 2pm ET about the new activity.
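
A minimal sketch of the parquet workflow, assuming `df` is a pandas DataFrame of MODIS pixels; the "lon", "lat", and "reflectance" column names, the tile file names, and the reflectance threshold are all assumptions:

```python
import glob

import geopandas as gpd
import pandas as pd
import pyarrow.parquet as pq
from shapely.geometry import box

# Convert the pandas DataFrame to a GeoDataFrame, turning the lat/lon columns
# into a point geometry column and setting the CRS to EPSG:4326.
gdf = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df["lon"], df["lat"]),
    crs="EPSG:4326",
)
gdf.to_parquet("modis_tile_h10v05.parquet")   # hypothetical tile name

# Read the geoparquet back and subset by an arbitrary polygon inside the tile.
tile = gpd.read_parquet("modis_tile_h10v05.parquet")
aoi = box(-95.0, 35.0, -94.0, 36.0)           # arbitrary lon/lat box
subset = tile[tile.intersects(aoi)]           # or gpd.clip(tile, aoi)

# Several adjacent tiles as one dataset (Parquet analog of xr.open_mfdataset).
tiles = pd.concat(
    (gpd.read_parquet(f) for f in glob.glob("modis_tile_*.parquet")),
    ignore_index=True,
)

# Basic attribute subsetting with Arrow: all pixels above a reflectance threshold.
table = pq.read_table(
    "modis_tile_h10v05.parquet",
    filters=[("reflectance", ">", 0.4)],
)
```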