blaylockbk / Herbie

Download numerical weather prediction datasets (HRRR, RAP, GFS, IFS, etc.) from NOMADS, NODD partners (Amazon, Google, Microsoft), ECMWF open data, and the University of Utah Pando Archive System.
https://herbie.readthedocs.io/
MIT License
424 stars 70 forks source link

No negative longitudes using FastHerbie and GEFS #292

Open matt-h2o opened 3 months ago

matt-h2o commented 3 months ago

Hi - I've been having issues using FastHerbie with GEFS and extracting data for specific locations. Nearest Point fails, so I've just been using .sel method='nearest', which seems to work so long as the longitude isn't negative - for locations in the western hemisphere (Madrid and Lisbon in the example below), it just gives me a longitude of 0. The code seems to work fine otherwise. Any thoughts on what might be going wrong?

thanks! Matt


import pandas as pd
import xarray as xr
from herbie import FastHerbie
import warnings
warnings.filterwarnings("ignore")

#%% Create a range of dates
DATES = pd.date_range("2024-03-11 00:00", periods=1, freq="12h")

#%% Create a range of forecast lead times. 
fxx = range(1, 840)

#%% Get forecast GRIB files
FH = FastHerbie(model="gefs",product="atmos.5",member="mean", DATES=DATES, fxx=fxx, max_threads=5, source='aws')

#%% Extract 2m temperature variables from GRIBs. NB temperatures are in Kelvin

dWthrVars = FH.xarray(":(TMP|TMAX|TMIN):2 m", remove_grib=False)

# Convert temperatures to Celsius

dTAvC = dWthrVars.t2m - 273.15
dTMaxC = dWthrVars.tmax - 273.15
dTMinC = dWthrVars.tmin - 273.15

# writing the new variables back to the main xarray so they're all in one place for eventual export to csv
dWthrVars['TAvC'] = dTAvC
dWthrVars['TMaxC'] = dTMaxC
dWthrVars['TMinC'] = dTMaxC

#%% Extract wind U- and V- vectors from GRIBs
dWindVectors = FH.xarray(":(U|V)GRD:10 m above", remove_grib=False)

#%% Calculate windspeed from vectors.
dWspd = (dWindVectors.u10**2 + dWindVectors.v10**2)**0.5

dWthrVars['Wspd'] = dWspd

#%% Calculate wind-adjusted temperatures

dTAvWadjC = dTAvC - dWspd/1.5
dTMaxWadjC = dTMaxC - dWspd/1.5
dTMinWadjC = dTMinC - dWspd/1.5

dWthrVars['TAvWadjC'] = dTAvWadjC
dWthrVars['TMaxWadjC'] = dTAvWadjC
dWthrVars['TMinWadjC'] = dTAvWadjC

#%% Calculate HDDs and CDDs from average temps

threshold = 16

dWthrVars['HDD'] = (threshold - dWthrVars['TAvC']).where(dWthrVars['TAvC'] < threshold, other='0')

dWthrVars['HDDwadj'] = (threshold - dWthrVars['TAvWadjC']).where(dWthrVars['TAvWadjC'] < threshold, other='0')

dWthrVars['CDD'] = (dWthrVars['TAvC'] - threshold).where(dWthrVars['TAvC'] > threshold, other='0')

#%% create data dump

Amsterdam_wthr = dWthrVars.sel(latitude=52.32, longitude=4.79, method = 'nearest').to_dataframe()
Beijing_wthr = dWthrVars.sel(latitude=39.93, longitude=116.28, method = 'nearest').to_dataframe()
Berlin_wthr = dWthrVars.sel(latitude=52.47, longitude=13.4, method = 'nearest').to_dataframe()
Brussels_wthr = dWthrVars.sel(latitude=50.9, longitude=4.53, method = 'nearest').to_dataframe()
Bucharest_wthr = dWthrVars.sel(latitude=44.51, longitude=26.08, method = 'nearest').to_dataframe()
Budapest_wthr = dWthrVars.sel(latitude=47.43, longitude=19.18, method = 'nearest').to_dataframe()
Essen_wthr = dWthrVars.sel(latitude=51.41, longitude=6.97, method = 'nearest').to_dataframe()
Lisbon_wthr = dWthrVars.sel(latitude=38.77, longitude=-9.13, method = 'nearest').to_dataframe()
London_wthr = dWthrVars.sel(latitude=51.48, longitude=-0.45, method = 'nearest').to_dataframe()
Madrid_wthr = dWthrVars.sel(latitude=40.47, longitude=-3.58, method = 'nearest').to_dataframe()
Milan_wthr = dWthrVars.sel(latitude=45.46, longitude=9.28, method = 'nearest').to_dataframe()
NewDelhi_wthr = dWthrVars.sel(latitude=28.58, longitude=77.2, method = 'nearest').to_dataframe()
Paris_wthr = dWthrVars.sel(latitude=48.72, longitude=2.38, method = 'nearest').to_dataframe()
Seoul_wthr = dWthrVars.sel(latitude=37.57, longitude=126.97, method = 'nearest').to_dataframe()
Tokyo_wthr = dWthrVars.sel(latitude=35.69, longitude=139.75, method = 'nearest').to_dataframe()
Warsaw_wthr = dWthrVars.sel(latitude=52.16, longitude=20.96, method = 'nearest').to_dataframe()

data_dump = pd.concat([Amsterdam_wthr, Beijing_wthr, Berlin_wthr, Brussels_wthr, Bucharest_wthr, Budapest_wthr, Essen_wthr, Lisbon_wthr, London_wthr, Madrid_wthr, Milan_wthr, NewDelhi_wthr, Paris_wthr, Seoul_wthr, Tokyo_wthr, Warsaw_wthr], keys=["Amsterdam", "Beijing", "Berlin", "Brussels", "Bucharest", "Budapest", "Essen", "Lisbon", "London", "Madrid", "Milan", "New Delhi", "Paris", "Seoul", "Tokyo", "Warsaw"])

data_dump.to_csv('GEFS_dump.csv', index=True)
print('GEFS CSV exported')
matt-h2o commented 3 months ago

When I run the process and go into the individual components of hte xarray, I can see that latitude is expressed as +/- 90, but longitude is expressed as 0 to 359.75. I need to check if the data for longitudes <180 correspond in the normal way - if so then presumably i can reindex the longitudes > 180

matt-h2o commented 3 months ago

This code fixes the longitudes:

def recalculate_longitude(longitude):
    return (longitude + 180) % 360 - 180

dWthrVars = dWthrVars.assign_coords(longitude=recalculate_longitude(dWthrVars.longitude))
dWthrVars = dWthrVars.sortby('longitude')
williamhobbs commented 2 months ago

I think that should be (longitude + 360) % 360.

Running this:

def recalculate_longitude(longitude):
    return (longitude + 180) % 360 - 180

def recalculate_longitude_v2(longitude):
    return (longitude + 360) % 360

longitude = -90
print(recalculate_longitude(longitude))
print(recalculate_longitude_v2(longitude))

I get:

-90
270

And I think you want the second one. Is that right?

blaylockbk commented 2 months ago

You might be interested in a new capability I've slowly been working on to extract data at a point. This is in the current main branch, but hasn't been released yet. When I get a chance I'd like to try it out for your case and see if it works, unless you beat me to it

https://herbie.readthedocs.io/en/latest/user_guide/_tutorial_notebooks/pick_points.html

matt-h2o commented 2 months ago

@williamhobbs Thanks Will.

The issue is actually that the longitude dimension as downloaded is only expressed as positive numbers from 0 to 360. So if I try to get the weather variables for Lisbon, at longitude -9, I end up with the variables for somewhere near Valencia, at longitude 0 as the nearest number to a negative value. So my code snipped is intended to turn 270 to -90 so that negative longitudes map correctly.

matt-h2o commented 2 months ago

@blaylockbk Thanks Brian - I came across that but hadn't quite figured out how to use it. I'm now getting a 404 so I imagine you might be updating it! Will come back later and take another look.

williamhobbs commented 2 months ago

@matt-h2o That makes sense - I ran into a similar issue with GEFS, where I was targeting longitudes in the range of -85 to -90 deg E (e.g., Atlanta, GA, USA).

But I ended up with a different conversion. For example, if I use your function:

def recalculate_longitude(longitude):
    return (longitude + 180) % 360 - 180

and input -9 (for Lisbon):

longitude = -9
print(recalculate_longitude(longitude))

I get this output:

-9

Alternatively, if you use this as a conversion function:

def recalculate_longitude_v2(longitude):
    return (longitude + 360) % 360

and input -9:

longitude = -9
print(recalculate_longitude_v2(longitude))

you get:

351

which I think is what you want.

matt-h2o commented 2 months ago

@williamhobbs Thanks. I've come at this the other direction - I've converted the longitude index in my xarray so that 351 is now -9, and then queried that index for the value at -9. I'd guess we'd end up with the same answer?

williamhobbs commented 2 months ago

Ah! Yes, I was thinking about it in the reverse. I'm converting -9 (conventional coordinates) to 351 (GEFS coordinates) and you are converting 351 (GEFS) to -9 (conventional). We do get the same answers.

Apologies for the confusion, but I guess it's all extra clear now 😉.