Open mbauer288 opened 1 year ago
This is a bug: nom_res is None, which should be mapped to an empty string, not to the string "None". As a workaround, try setting the read_granule argument nom_res="".
Granules can have different-resolution data, which implies a need to select between them. This is done via the argument nom_res, which stands for nominal resolution. It is a string that is appended to the canonical variable names, e.g. if nom_res='750km', you'd refer to latitude_750km or latitude750km.
The fix for the Granule class is to check nom_res for None and set it to "" if so.
Okay, I was toying with the "fix" to put something like:
nom_res_str = "" if nom_res is None else nom_res
or
nom_res_str = "" if self.nom_res is None else self.nom_res
at the top of certain granule methods.
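Applied to the variable-name construction described above, the guard might look like this (a hypothetical sketch; variable_name is an illustrative name, not the actual STAREPandas API):

```python
# Hypothetical sketch of how nom_res could suffix canonical variable
# names, with the None guard applied; not the actual STAREPandas code.
def variable_name(base, nom_res=None):
    nom_res_str = "" if nom_res is None else nom_res
    return f"{base}{nom_res_str}"

print(variable_name("latitude"))           # latitude
print(variable_name("latitude", "750km"))  # latitude750km
```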
This fixes the problem as you thought.
Hmm, I think I need a bit more help. So the previous problem is solved, but now I get an error forming the STARE dataframe.
gdf = starepandas.read_granule(granule_path, sidecar=True,
                               latlon=False, read_timestamp=False,
                               sidecar_path=sidecar_path, add_sids=False,
                               adapt_resolution=True, nom_res=None)
in granule.to_df(xy=False):
to_df(): self.lat.shape = (1800,) flattened (1800,)
to_df(): self.lon.shape = (3600,) flattened (3600,)
to_df(): self.sids.shape = (1800, 3600) flattened (6480000,)
to_df(): dtype = dtype('float64')
to_df(): self.data[key].shape = (1800, 3600) flattened (6480000,)
to_df(): dtype = dtype('float64')
to_df(): series.shape = (6480000,)
raise ValueError("All arrays must be of the same length")
ValueError: All arrays must be of the same length
I haven't implemented the temporal part of the inputs yet, but I don't see how that gives this error.
The dataframe (Granule class, I think) doesn't know how to integrate the 1D lat and lon and the 2D sids into a single frame. to_df (or read_granule) will have to be specialized to rectify lat and lon to the same size and shape as sids. This is a common issue when one has a simple 2D regular grid and keeps only the 1D forms of the dimensional coordinates.
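The error can be reproduced directly with pandas, independent of STAREPandas — mismatched flattened lengths are enough:

```python
import numpy as np
import pandas as pd

lat = np.zeros(1800)            # 1D coordinate: flattens to 1800
sids = np.zeros((1800, 3600))   # 2D SIDs: flattens to 6480000

try:
    pd.DataFrame({'lat': lat.ravel(), 'sids': sids.ravel()})
except ValueError as e:
    print(e)  # All arrays must be of the same length
```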
Alternatively, the sidecar constructor could write 2D lat and lon coordinate variables (commensurate with the data) and the existing Granule class (which was originally developed for Level 2 non-gridded data) would work. This approach takes more space in storage, but each grid type only needs one sidecar file.
The options are to make Granule smarter (i.e. have it guess how to rectify the coordinate arrays) or to have end-users specialize the method or class.
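The "smarter Granule" option could use a small heuristic: if lat and lon are 1D and their sizes match the 2D data shape, assume a regular grid and mesh them. A hypothetical helper (not part of STAREPandas):

```python
import numpy as np

def rectify_coords(lat, lon, data):
    # Hypothetical helper, not part of STAREPandas: broadcast 1D lat/lon
    # to the 2D shape of data when they look like regular-grid dimension
    # coordinates; otherwise return them unchanged.
    if lat.ndim == 1 and lon.ndim == 1 and data.shape == (lat.size, lon.size):
        lon, lat = np.meshgrid(lon, lat, indexing='xy')
    return lat, lon

lat2, lon2 = rectify_coords(np.arange(3.0), np.arange(4.0), np.zeros((3, 4)))
print(lat2.shape, lon2.shape)  # (3, 4) (3, 4)
```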
Okay, so something like:
##
# Make a (lat=1800, lon=3600) mesh grid to match the sidecar SIDs array
# Use Cartesian indexing lat[i, j]
lon, lat = numpy.meshgrid(lon, lat, copy=True, indexing='xy')
sids (1800, 3600)
lat (1800) = [ -89.950 ... +89.950]
lon (3600) = [-179.950 ... +179.950]
lat (1800, 3600)
lon (1800, 3600)
i j lat lon
[ 0, 0]: -89.950 -179.950
[ 0, 1]: -89.950 -179.850
[ 0, 2]: -89.950 -179.750
...
[ 0, 1797]: -89.950 -0.250
[ 0, 1798]: -89.950 -0.150
[ 0, 1799]: -89.950 -0.050
...
[1799, 0]: +89.950 -179.950
[1799, 1]: +89.950 -179.850
[1799, 2]: +89.950 -179.750
...
[1799, 3597]: +89.950 +179.750
[1799, 3598]: +89.950 +179.850
[1799, 3599]: +89.950 +179.950
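End to end, the rectification above lets the frame build cleanly (small stand-in sizes here for brevity; the real IMERG grid is 1800 x 3600):

```python
import numpy as np
import pandas as pd

lat = np.linspace(-89.95, 89.95, 4)      # 1D latitude stand-in
lon = np.linspace(-179.95, 179.95, 6)    # 1D longitude stand-in
sids = np.zeros((lat.size, lon.size), dtype=np.int64)  # placeholder SIDs

# Rectify the 1D coordinates to the 2D shape of sids (Cartesian indexing)
lon2d, lat2d = np.meshgrid(lon, lat, indexing='xy')

# All flattened arrays now share one length, so the frame builds cleanly
df = pd.DataFrame({'lat': lat2d.ravel(),
                   'lon': lon2d.ravel(),
                   'sids': sids.ravel()})
print(len(df))  # 24
```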
Yes, something along those lines should work.
That works; the IMERG podding code test is nearing completion...
@NiklasPhabian I'm working on a fix.
I'm having problems correctly using a sidecar file for a gridded dataset (IMERG). Here is the generalized problem.
1) I make a sidecar file for IMERG using STAREMaster_py ...
And this matches up with the dataset I'm aiming to pod:
2) But when I attempt to form a STAREPandas dataframe using it as so...
I get an error which leads me to think that I'm likely invoking the starepandas.read_granule() method incorrectly for this kind of sidecar (gridded data). Inside said method, I see that it gets the correct sidecar info, although the self.nom_res = None parameter might be an issue. Here is where the error is raised (in starepandas.read_granule()). Any suggestions? Thanks.
Note, starepandas.read_granule() is inherited from the Granule class (via a class I created), as should be clear from the code snippet below. Then in my code:
I should add that some potential dependency issues might be at work here (e.g., Pandas 2.0).