ghiggi / gpm_api

Global Precipitation Measurement Mission (GPM) python package to download and analyze data with xarray
https://gpm-api.readthedocs.io
MIT License

How to correctly retrieve IMERG products #58

Open edsml-ds423 opened 3 months ago

edsml-ds423 commented 3 months ago

Current Behavior

Hi,

First of all, I want to just say great job on the gpm-api. I was previously getting IMERG V07B data via FTPS in Python but moving forward, I might migrate and use your API.

Before doing so, I wanted to sense-check the downloaded data. I could be missing something, but I ran a test on the 20:00-20:30 time slice of 27th September. I downloaded the file via your API and also downloaded the file directly from https://arthurhouhttps.pps.eosdis.nasa.gov/gpmdata/2022/09/27/imerg/, then compared the data.

I found a mismatch between the values and if I plot a cropped region containing Hurricane Ian (peak intensity at the chosen timeslice), you can visually see the difference.

The two files that I am comparing are:

Can I ask which server you are getting the version 7 files from?

Thanks, Dan

Expected Behavior

I was expecting the API and directly downloaded data to match exactly.

Steps To Reproduce

  1. Download data using the GPM-API tutorial notebook with the following dates:
     start_time = datetime.datetime.strptime("2022-09-27 00:00:00", "%Y-%m-%d %H:%M:%S")
     end_time = datetime.datetime.strptime("2022-09-28 00:00:00", "%Y-%m-%d %H:%M:%S")
  2. Download the data directly from https://arthurhouhttps.pps.eosdis.nasa.gov/gpmdata/2022/09/27/imerg/
  3. Compare the results, either with np.allclose() or by plotting.
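
For reference, the comparison step could look roughly like the sketch below. The helper name compare_fields and the toy arrays are made up for illustration and are not real IMERG values; in practice the two arrays would be the precipitation fields read from the API file and from the directly downloaded file.

```python
import numpy as np


def compare_fields(a, b, rtol=1e-5, atol=1e-8):
    """Summarize how two precipitation arrays of the same shape differ."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    # Treat missing values (NaN) at the same position as equal.
    match = np.allclose(a, b, rtol=rtol, atol=atol, equal_nan=True)
    # Absolute difference; cells where either field is NaN are ignored below.
    diff = np.abs(a - b)
    max_diff = float(np.nanmax(diff)) if np.any(np.isfinite(diff)) else 0.0
    return {"match": bool(match), "max_abs_diff": max_diff}


# Toy stand-ins for the two grids (hypothetical values, not real data).
api_field = np.array([[0.0, 1.5], [np.nan, 3.0]])
direct_field = np.array([[0.0, 1.5], [np.nan, 4.0]])

result = compare_fields(api_field, direct_field)
print(result)
```

Reporting the largest absolute difference alongside the boolean makes it easier to tell a rounding discrepancy from two genuinely different products.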

Environment

- OS:
- python:

Anything else?

No response

ghiggi commented 3 months ago

Hi @edsml-ds423. Sorry for the slight delay in answering, I just got back from holidays.

I think you are looking at different products:

To get the file you directly downloaded, you need to specify product="IMERG-FR", product_type = "RS". To download IMERG Early Run, you need to specify product="IMERG-ER", product_type = "NRT". Note that you can download data from either storage="PPS" or storage="GES_DISC".
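
As a rough sketch, the two configurations can be written down side by side as below. The product, product_type and storage values are the ones quoted above; the helper names imerg_request and download are hypothetical, and the call to gpm.download with these keyword arguments is an assumption that should be checked against the gpm-api documentation for the installed version. The 20:00-20:30 window on 2022-09-27 is taken from the issue description.

```python
import datetime

# Product settings quoted in the reply above.
IMERG_RUNS = {
    "final": {"product": "IMERG-FR", "product_type": "RS"},
    "early": {"product": "IMERG-ER", "product_type": "NRT"},
}


def imerg_request(run, start_time, end_time, storage="PPS"):
    """Build keyword arguments for one IMERG run ("final" or "early")."""
    if storage not in ("PPS", "GES_DISC"):
        raise ValueError(f"unknown storage: {storage}")
    return {
        **IMERG_RUNS[run],
        "storage": storage,
        "start_time": start_time,
        "end_time": end_time,
    }


def download(kwargs):
    """Hypothetical wrapper around the package's download function."""
    # Assumption: gpm.download accepts these keyword arguments. Running this
    # requires gpm_api to be installed and valid server credentials.
    import gpm

    return gpm.download(**kwargs)


start_time = datetime.datetime(2022, 9, 27, 20, 0, 0)
end_time = datetime.datetime(2022, 9, 27, 20, 30, 0)

final_kwargs = imerg_request("final", start_time, end_time)
early_kwargs = imerg_request("early", start_time, end_time)
# download(final_kwargs)  # uncomment to actually fetch the Final Run files
```

Keeping the settings in one dictionary makes it harder to mix up the product name of one run with the product type of the other.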

I can already tell you that if you try downloading IMERG from GES_DISC right now, you will likely download duplicate files (one for version V07A and one for V07B). This might cause problems in gpm.open_dataset. The problem occurs because they are currently updating the IMERG products to version V07B. I guess they will then delete version V07A and the problem will resolve itself.
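
Until then, one possible workaround is to filter a list of downloaded files so that only the newest version of each granule is kept. The sketch below is not part of gpm_api: keep_latest_version is a hypothetical helper, the filenames are illustrative, and it assumes the version appears in the filename as a dot-delimited token such as V07A or V07B.

```python
import re

# Matches a version token such as "V07A" or "V07B" between dots.
VERSION_RE = re.compile(r"\.(V\d{2}[A-Z])\.")


def keep_latest_version(filenames):
    """Keep one file per granule, preferring the highest version tag."""
    latest = {}
    for name in filenames:
        found = VERSION_RE.search(name)
        if found is None:
            # No recognizable version tag: keep the file untouched.
            latest[name] = ("", name)
            continue
        version = found.group(1)
        # Granule key: the filename with the version tag blanked out.
        key = VERSION_RE.sub(".V.", name, count=1)
        # Plain string comparison is enough here: "V07B" > "V07A".
        if key not in latest or version > latest[key][0]:
            latest[key] = (version, name)
    return sorted(name for _, name in latest.values())


# Illustrative filenames only.
files = [
    "3B-HHR.MS.MRG.3IMERG.20220927-S200000-E202959.1200.V07A.HDF5",
    "3B-HHR.MS.MRG.3IMERG.20220927-S200000-E202959.1200.V07B.HDF5",
    "3B-HHR.MS.MRG.3IMERG.20220927-S203000-E205959.1230.V07B.HDF5",
]

unique_files = keep_latest_version(files)
print(unique_files)
```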

edsml-ds423 commented 3 months ago

Hi @ghiggi,

No worries at all!

Thanks for pointing that out, all makes sense. I want the Early Run and I am currently using: product="IMERG-ER", product_type = "NRT", storage="PPS" so should be good.

ghiggi commented 3 months ago

Glad I could help @edsml-ds423 ;) Have a nice day!