HarvestStat - Harmonized Subnational Crop Statistics
External Crop Calendar (ECC) table #23

gnodnooh closed this issue 5 months ago

gnodnooh commented 5 months ago

The FDW crop calendar (start_date, period_date, season_date, etc.) is keyed only on "season_name" and does not differentiate by fnid, crop_production_system, product, etc., which is unrealistic. The current HarvestStat processing only extends it by manually adding "planting_month" and "harvest_month".

The external crop calendar feature allows you to load/apply data from other crop calendar sources, such as Gary's Global Crop Calendar Compilation or the FAO's crop calendar.

Consideration:

gnodnooh commented 5 months ago

We adopted an external crop calendar: https://docs.google.com/spreadsheets/d/1ZBk6KKJtgaVIMktVJibPfmK79m8bSAEvgl7nlY8fDmc/edit#gid=0

Once the "stack" dataframe is built, the code below creates a draft of the ECC:

stack[['country','season_name','product','crop_production_system','planting_month','harvest_month']].drop_duplicates().sort_values(['country','season_name','product','crop_production_system']).reset_index(drop=True)
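For anyone who wants to try the one-liner above without the full dataset, here is a minimal sketch on a toy "stack". The sample rows, the extra "value" column, and the names KEY_COLS, MONTH_COLS, and draft_ecc are made up for illustration; only the column selection and the drop_duplicates/sort_values/reset_index chain come from the snippet above.

```python
import pandas as pd

# Hypothetical toy "stack"; the real HarvestStat dataframe has many more rows and columns.
stack = pd.DataFrame({
    "country": ["Kenya", "Kenya", "Kenya", "Somalia"],
    "season_name": ["Long", "Long", "Short", "Gu"],
    "product": ["Maize", "Maize", "Maize", "Sorghum"],
    "crop_production_system": ["All", "All", "All", "Rainfed"],
    "planting_month": [3, 3, 10, 4],
    "harvest_month": [8, 8, 2, 8],
    "value": [1.0, 2.0, 3.0, 4.0],  # stand-in for the actual statistics
})

# Columns that identify a calendar entry, and the months attached to it.
KEY_COLS = ["country", "season_name", "product", "crop_production_system"]
MONTH_COLS = ["planting_month", "harvest_month"]


def draft_ecc(stack: pd.DataFrame) -> pd.DataFrame:
    """Return one row per unique key/month combination, sorted by key."""
    return (
        stack[KEY_COLS + MONTH_COLS]
        .drop_duplicates()
        .sort_values(KEY_COLS)
        .reset_index(drop=True)
    )


ecc_draft = draft_ecc(stack)
print(ecc_draft)
```

The two identical Kenya/Long/Maize rows collapse into one, so the draft has one row per distinct calendar entry.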

The procedure is:

  1. Extract a crop calendar table from the "stack" dataframe.
  2. Add the table from step 1 to the Global ECC (Google Sheets), updating existing entries where needed.
  3. Download the updated Global ECC and overwrite the local ECC CSV file.
  4. If validation reveals a year-off issue, correct the Global ECC and download it again (repeat steps 2-3).
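
Steps 1-3 could be sketched roughly as below. This is not the actual HarvestStat code: the function update_ecc, the sample rows, and the rule that existing Global ECC entries win over draft entries with the same key are all assumptions. In practice step 2 is a manual edit in Google Sheets, so the in-memory CSV buffer here only stands in for the download-and-overwrite of the local file.

```python
import io

import pandas as pd

KEY_COLS = ["country", "season_name", "product", "crop_production_system"]
MONTH_COLS = ["planting_month", "harvest_month"]


def update_ecc(global_ecc: pd.DataFrame, draft: pd.DataFrame) -> pd.DataFrame:
    """Add draft rows whose key is not yet in the Global ECC.

    Assumption: rows already in the Global ECC may have been corrected by hand,
    so they take precedence over draft rows with the same key.
    """
    combined = pd.concat([global_ecc, draft], ignore_index=True)
    combined = combined.drop_duplicates(subset=KEY_COLS, keep="first")
    return combined.sort_values(KEY_COLS).reset_index(drop=True)


# Hypothetical existing Global ECC (one entry, already reviewed).
global_ecc = pd.DataFrame(
    [["Kenya", "Long", "Maize", "All", 3, 8]],
    columns=KEY_COLS + MONTH_COLS,
)

# Hypothetical draft extracted from "stack" (step 1).
draft = pd.DataFrame(
    [
        ["Kenya", "Long", "Maize", "All", 4, 9],    # key already present: ignored
        ["Kenya", "Short", "Maize", "All", 10, 2],  # new key: added
    ],
    columns=KEY_COLS + MONTH_COLS,
)

# Step 2: merge the draft into the Global ECC.
updated = update_ecc(global_ecc, draft)

# Step 3: write the result as CSV and read it back, as the local ECC file would be.
buffer = io.StringIO()
updated.to_csv(buffer, index=False)
local_ecc = pd.read_csv(io.StringIO(buffer.getvalue()))
print(local_ecc)
```

Keeping the first occurrence means a hand-corrected entry is never silently overwritten by a fresh draft; if the opposite policy is wanted, swapping the concat order or using keep="last" would do it.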