mosoriob opened this issue 3 years ago
@mosoriob I have a very hacky script to download GPM data; it basically generates a curl command for each file that needs to be downloaded and writes them into a bash script file. Since this was (supposedly) a one-time effort, it should probably be rewritten in a more maintainable manner. In the meantime, let me know if this will suffice for now?
```python
import datetime
import os

# Credentials come from the environment (variable names assumed here).
earthdata_username = os.environ["EARTHDATA_USERNAME"]
earthdata_password = os.environ["EARTHDATA_PASSWORD"]

def generate_download_links_for_date(input_date, download_dir):
    """Build curl commands for all 48 half-hour IMERG files of one day."""
    day_commands = []
    day_of_year = input_date.strftime("%j")
    year = input_date.strftime("%Y")
    date_str = input_date.strftime("%Y%m%d")
    start = datetime.datetime(input_date.year, input_date.month, input_date.day, 0, 0, 0)
    num_thirty_min_intervals = 24 * 2
    for i in range(num_thirty_min_intervals):
        interval_start = start + datetime.timedelta(minutes=30 * i)
        interval_end = start + datetime.timedelta(minutes=30 * (i + 1)) - datetime.timedelta(seconds=1)
        interval_start_str = interval_start.strftime("%H%M%S")
        interval_end_str = interval_end.strftime("%H%M%S")
        minutes_str = str(30 * i).zfill(4)  # minutes since midnight, zero-padded
        url_prefix = f"https://gpm1.gesdisc.eosdis.nasa.gov/opendap/hyrax/GPM_L3/GPM_3IMERGHHE.06/{year}/{day_of_year}"
        filename = f"3B-HHR-E.MS.MRG.3IMERG.{date_str}-S{interval_start_str}-E{interval_end_str}.{minutes_str}.V06B.HDF5.nc4"
        download_url = f"{url_prefix}/{filename}"
        download_target = f"{download_dir}/{day_of_year}/{filename}"
        curl_command = f"curl -n -c ~/.urs_cookies -b ~/.urs_cookies -L --url {download_url} --create-dirs -o {download_target}"
        day_commands.append(curl_command)
    return day_commands

commands = []
date_start = datetime.datetime.strptime("2014-08-01", "%Y-%m-%d")
date_end = datetime.datetime.strptime("2014-09-01", "%Y-%m-%d")
arya_download_dir = f"/data/mint/gpm_{date_start.strftime('%Y%m%d')}_{date_end.strftime('%Y%m%d')}"
delta_days = (date_end - date_start).days
for i in range(delta_days + 1):  # inclusive of date_end
    cur_date = date_start + datetime.timedelta(days=i)
    commands += generate_download_links_for_date(cur_date, arya_download_dir)

# .netrc / .urs_cookies are what curl's -n flag uses for Earthdata login.
netrc_string = f"machine urs.earthdata.nasa.gov login {earthdata_username} password {earthdata_password}"
with open("download_gpm.sh", "w") as f:
    f.write("#!/bin/bash\n")
    f.write(f'''rm -f .netrc && touch .netrc && echo "{netrc_string}" >> .netrc && chmod 0600 .netrc''' + "\n")
    f.write('''rm -f .urs_cookies && touch .urs_cookies''' + "\n")
    f.write("\n".join(commands))
```
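As a quick sanity check on the interval arithmetic, the values below follow directly from the `strftime`/`timedelta` logic in the script (this snippet just recomputes the windows, it is not part of the original):

```python
import datetime

# Recompute the 48 half-hour windows for a sample day, mirroring the
# interval logic in the script above.
day = datetime.datetime(2014, 8, 1)
intervals = []
for i in range(24 * 2):
    s = day + datetime.timedelta(minutes=30 * i)
    e = s + datetime.timedelta(minutes=30) - datetime.timedelta(seconds=1)
    intervals.append((s.strftime("%H%M%S"), e.strftime("%H%M%S"), str(30 * i).zfill(4)))

print(intervals[0])   # ('000000', '002959', '0000')
print(intervals[-1])  # ('233000', '235959', '1410')
```

So a day yields 48 filenames, from `S000000-E002959.0000` through `S233000-E235959.1410`.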
This is the 30-min one? So it's the one that takes quite a bit of time, correct?
we also need CHIRPS
Endpoint: https://data.chc.ucsb.edu/products/CHIRPS-2.0/africa_6-hourly/
Do we have a script for that one as well?
^ Yeah, it's the 30 min one and yeah, it usually takes a bit of time to download (~10-30s per file)
And we don't have download scripts for CHIRPS; UCSB folks were pushing data to the data catalog directly.
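Since each file takes ~10-30s, running the generated curl commands a few at a time should cut the wall-clock time considerably. A minimal stdlib-only sketch; the worker count and the idea of feeding it the generated command list are suggestions, not part of the original script:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_commands(commands, workers=4):
    """Run shell commands concurrently; return exit codes in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda cmd: subprocess.run(cmd, shell=True).returncode, commands))

# e.g. run_commands(commands) with the curl commands generated above
```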
Since they left the program, I'm assuming they are no longer pushing anything?
yeah, it doesn't look like there has been any new activity for over a year
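For CHIRPS, one option in the same spirit as the GPM script would be to scrape the directory index at https://data.chc.ucsb.edu/products/CHIRPS-2.0/africa_6-hourly/ for file links and turn them into curl commands. A hedged stdlib-only sketch; the HTML structure of the index page and the file layout underneath it are assumptions that need checking against the live listing:

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, skipping sort/parent links."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                # Assumes an Apache-style index: sort links start with "?"
                # and the parent-directory link is an absolute path.
                if name == "href" and value and not value.startswith(("?", "/")):
                    self.links.append(value)

def chirps_download_commands(index_html, base_url, download_dir):
    """Turn a directory-listing page into curl commands, one per link."""
    parser = LinkCollector()
    parser.feed(index_html)
    return [
        f"curl -L --url {base_url}{link} --create-dirs -o {download_dir}/{link}"
        for link in parser.links
    ]
```

The index page itself could be fetched with `urllib.request.urlopen` before feeding it to `chirps_download_commands`; subdirectory links (ending in `/`) would need to be walked recursively.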
We need to download the GPM Data