Ashar25 / RainyDay

0 stars 1 forks source link

Issues being caused by mismatching spatial coordinates when using mrms data #41

Closed lassiterdc closed 4 months ago

lassiterdc commented 10 months ago

Reprex:

The MRMS data longitude is in positive degrees west (e.g., 285) and other functions, such as the Shapefile bbox function returns longitude in negative degrees west (e.g., -75). The first problem this causes is in the function RainyDay.findsubbox(inarea,variables,flist[0]):

Traceback (most recent call last):
  File "/project/quinnlab/dcl3nd/norfolk/stormy/stochastic_storm_transposition/RainyDay2/Source/RainyDay_Py3.py", line 882, in <module>
    rainprop.subextent,rainprop.subdimensions,latrange,lonrange=RainyDay.findsubbox(inarea,variables,flist[0])
                                                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/gpfs/gpfs0/project/quinnlab/dcl3nd/norfolk/stormy/stochastic_storm_transposition/RainyDay2/Source/RainyDay_utilities_Py3/RainyDay_functions.py", line 865, in findsubbox
    outrain[lon_name][0], outrain[lon_name][-1]
.............................. (traceback stuff)
IndexError: index 0 is out of bounds for axis 0 with size 0

This specific error can be resolved within the findsubbox function by adding 360 to longmin and longmax before defining the outrain variable, but that would just be a local solution to a more prevalent problem.

When I patched that issue, the next error cropped up by line 1040:

Traceback (most recent call last):
  File "/project/quinnlab/dcl3nd/norfolk/stormy/stochastic_storm_transposition/RainyDay2/Source/RainyDay_Py3.py", line 1040, in <module>
    xmin=np.min(np.where(np.sum(catmask,axis=0)!=0))
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...................... (traceback stuff)
ValueError: zero-size array to reduction operation minimum which has no identity

I suspect this has to do with the mismatched coordinate systems, but I'm honestly not sure. I printed some of the variables to hopefully help identify where the problem is coming from:

bndcoords
[283.41003418 284.          36.65000153  37.09000015]
###################################
rainprop
<__main__.GriddedRainProperties object at 0x2b1e23c24090>
###################################
rainprop.subdimensions
[45 60]
###################################
xdim
45
###################################
ydim
60
###################################
catmask.shape
(45, 60)
############################################
catmask
[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]
############################################
np.sum(catmask,axis=0)
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
###############################################
np.sum(catmask,axis=0)!=0
[False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False]
###############################################
np.where(np.sum(catmask,axis=0)!=0)
(array([], dtype=int64),)
###############################################

P.S. I ran into another easily fixable bug: In my newer version of numpy, np.int and np.float have been deprecated. This resolves by replacing these with np.int32 and np.float32.

Ashar25 commented 10 months ago

I also ran into similar problem with the shape file I used for the domain. As you said it is the projection used in the MRMS. But then I used the domain as a box (by adding 360 to the longitudes) and I am running fine. This is something to do with how geopandas returns the min max of the shape file used. I never saw the second error. I believe it is something to do with the area type you are using. Looks like the cat mask is empty .Let me see the JSON file you are using and I can check it out.

On Fri, Sep 1, 2023, 13:05 Daniel Lassiter @.***> wrote:

Reprex:

The MRMS data longitude is in positive degrees west (e.g., 285) and other functions, such as the Shapefile bbox function returns longitude in negative degrees west (e.g., -75). The first problem this causes is in the function RainyDay.findsubbox(inarea,variables,flist[0]):

Traceback (most recent call last): File "/project/quinnlab/dcl3nd/norfolk/stormy/stochastic_storm_transposition/RainyDay2/Source/RainyDay_Py3.py", line 882, in rainprop.subextent,rainprop.subdimensions,latrange,lonrange=RainyDay.findsubbox(inarea,variables,flist[0]) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/gpfs/gpfs0/project/quinnlab/dcl3nd/norfolk/stormy/stochastic_storm_transposition/RainyDay2/Source/RainyDay_utilities_Py3/RainyDay_functions.py", line 865, in findsubbox outrain[lon_name][0], outrain[lon_name][-1] .............................. (traceback stuff)IndexError: index 0 is out of bounds for axis 0 with size 0

This specific error can be resolved within the findsubbox function by adding 360 to longmin and longmax before defining the outrain variable, but that would just be a local solution to a more prevalent problem.

When I patched that issue, the next error cropped up by line 1040:

Traceback (most recent call last): File "/project/quinnlab/dcl3nd/norfolk/stormy/stochastic_storm_transposition/RainyDay2/Source/RainyDay_Py3.py", line 1040, in xmin=np.min(np.where(np.sum(catmask,axis=0)!=0)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ...................... (traceback stuff)ValueError: zero-size array to reduction operation minimum which has no identity

I suspect this has to do with the mismatched coordinate systems, but I'm honestly not sure. I printed some of the variables to hopefully help identify where the problem is coming from:

bndcoords [283.41003418 284. 36.65000153 37.09000015]###################################rainprop<main.GriddedRainProperties object at 0x2b1e23c24090>###################################rainprop.subdimensions [45 60]###################################xdim45###################################ydim60###################################catmask.shape (45, 60)############################################catmask [[0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] ... [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.]]############################################np.sum(catmask,axis=0) [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.

                                              1. 0.
                      1. 0.]###############################################np.sum(catmask,axis=0)!=0 [False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False False]###############################################np.where(np.sum(catmask,axis=0)!=0) (array([], dtype=int64),)###############################################

P.S. I ran into another easily fixable bug: In my newer version of numpy, np.int and np.float have been deprecated. This resolves by replacing these with np.int32 and np.float32.

— Reply to this email directly, view it on GitHub https://github.com/Ashar25/RainyDay/issues/41, or unsubscribe https://github.com/notifications/unsubscribe-auth/AYU4O4DWKWFZ5HOY7R5U73TXYIPXZANCNFSM6AAAAAA4H4Q4SA . You are receiving this because you are subscribed to this thread.Message ID: @.***>

lassiterdc commented 10 months ago

The JSON file is in the dropbox folder I linked to in the first post.

Could you paste the line of code you modified to add 360 degrees to convert the units?

Ashar25 commented 10 months ago

I did not change the code. I basically did what you did changing the code can mess up other datasets.

On Fri, Sep 1, 2023, 14:00 Daniel Lassiter @.***> wrote:

The JSON file is in the dropbox folder I linked to in the first post.

Could you paste the line of code you modified to add 360 degrees to convert the units?

— Reply to this email directly, view it on GitHub https://github.com/Ashar25/RainyDay/issues/41#issuecomment-1703200453, or unsubscribe https://github.com/notifications/unsubscribe-auth/AYU4O4GSLV6NBRDCOG7AB4TXYIWDFANCNFSM6AAAAAA4H4Q4SA . You are receiving this because you commented.Message ID: @.***>

lassiterdc commented 10 months ago

What I'm suggesting is really to make the code to robust to both coordinate systems. I'm concerned about breaking stuff by hardcoding anything specific to my dataset.

Ashar25 commented 10 months ago

Yeah, I have been trying to do this past 3 months lol!. The good news is I ran the MRMS dataset for the Madison case study considering the domain type as rectangular box and then giving longitudes based on your dataset. I also used the area of interest as box type. I was able to make storm catalogs but they are just 10 storms without running into memory issues and it took 2 hours to run one year of dataset. I am more concerned about the second issue than the first one.

On Sep 1, 2023, at 2:34 PM, Daniel Lassiter @.***> wrote:

What I'm suggesting is really to make the code to robust to both coordinate systems

— Reply to this email directly, view it on GitHub https://github.com/Ashar25/RainyDay/issues/41#issuecomment-1703236741, or unsubscribe https://github.com/notifications/unsubscribe-auth/AYU4O4EARWM4EDBL7OOTLZDXYI2F7ANCNFSM6AAAAAA4H4Q4SA. You are receiving this because you commented.

lassiterdc commented 10 months ago

I may have fixed it. After adding these two lines to the findsubbox function and the readnetcdf function, I made it past that error.

    if max(infile[lon_name].values) > 180: # convert from positive degrees west to negative degrees west
        infile[lon_name] = infile[lon_name] - 360

Now it's in the phase of processing each netcdf file. The only odd thing now is that it started at 2001 even through I had "INCLUDEYEARS" : [2015]. Does that have to be a string?

Ashar25 commented 10 months ago

Yeah hopefully it will work.

On Sep 1, 2023, at 3:17 PM, Daniel Lassiter @.***> wrote:

I may have fixed it. If add these two lines to the findsubbox function and the readnetcdf function, I made it past that error.

if max(infile[lon_name].values) > 180: # convert from positive degrees west to negative degrees west
    infile[lon_name] = infile[lon_name] - 360

— Reply to this email directly, view it on GitHub https://github.com/Ashar25/RainyDay/issues/41#issuecomment-1703276033, or unsubscribe https://github.com/notifications/unsubscribe-auth/AYU4O4ASV7JKEDGLNJIXNDLXYI7EBANCNFSM6AAAAAA4H4Q4SA. You are receiving this because you commented.

lassiterdc commented 10 months ago

Oh it didn't like that:

Traceback (most recent call last):
  File "/project/quinnlab/dcl3nd/norfolk/stormy/stochastic_storm_transposition/RainyDay2/Source/RainyDay_Py3.py", line 867, in <module>
    flist,nyears=RainyDay.createfilelist(inpath,includeyears,excludemonths)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/gpfs/gpfs0/project/quinnlab/dcl3nd/norfolk/stormy/stochastic_storm_transposition/RainyDay2/Source/RainyDay_utilities_Py3/RainyDay_functions.py", line 1597, in createfilelist
    if file_year in includeyears and file_month not in excludemonths:
       ^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'in <string>' requires string as left operand, not int

I was able to fix it by formatting the argument in the JSON as a list and modifying the following code chunk:

try:
    includeyr=cardinfo["INCLUDEYEARS"]
    if includeyr.lower()!="all":
        includeyears = includeyr
    else:
        includeyears=False
except Exception:
    includeyears=False

to:

try:
    includeyears=cardinfo["INCLUDEYEARS"]
except Exception:
    includeyears=False

The problem was that the try - except statement was silencing the error when trying to run includeyr.lower() on a list or an integer.

It now seems to be running and is processing 2015 files!

Ashar25 commented 10 months ago

Oh man let me see the code you are using first.

Ashar25 commented 10 months ago

It should be a list by default. Let me know which branch of the repository you have pulled.

lassiterdc commented 10 months ago

Just updated my comment above to show my solution!

Ashar25 commented 10 months ago

I don't think you are using the right code. 😆

lassiterdc commented 10 months ago

RainyDay/Source at Onestormperfile · Ashar25/RainyDaygithub.comI am using the code from this repo. This is the link Dan sent originally. Is this the wrong one?Sent from my iPhoneOn Sep 1, 2023, at 5:00 PM, Ashar25 @.***> wrote: I don't think you are using the right code. 😆

—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you authored the thread.Message ID: @.***>

Ashar25 commented 10 months ago

I think there is a mistake. Use the 'Rainydayrefctoring' one. This one is the most updated version.

On Fri, Sep 1, 2023, 19:34 Daniel Lassiter @.***> wrote:

RainyDay/Source at Onestormperfile · Ashar25/RainyDaygithub.comI am using the code from this repo. This is the link Dan sent originally. Is this the wrong one?Sent from my iPhoneOn Sep 1, 2023, at 5:00 PM, Ashar25 @.***> wrote: I don't think you are using the right code. 😆

—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you authored the thread.Message ID: @.***>

— Reply to this email directly, view it on GitHub https://github.com/Ashar25/RainyDay/issues/41#issuecomment-1703577745, or unsubscribe https://github.com/notifications/unsubscribe-auth/AYU4O4E5EUUO546FMPTNFNDXYJ5JRANCNFSM6AAAAAA4H4Q4SA . You are receiving this because you commented.Message ID: @.***>

lassiterdc commented 10 months ago

Can you send me a link to the exact source folder I should be using please?Sent from my iPhoneOn Sep 1, 2023, at 9:49 PM, Ashar25 @.***> wrote: I think there is a mistake. Use the 'Rainydayrefctoring' one. This one is

the most updated version.

On Fri, Sep 1, 2023, 19:34 Daniel Lassiter @.***> wrote:

RainyDay/Source at Onestormperfile · Ashar25/RainyDaygithub.comI am using

the code from this repo. This is the link Dan sent originally. Is this the

wrong one?Sent from my iPhoneOn Sep 1, 2023, at 5:00 PM, Ashar25

@.***> wrote:

I don't think you are using the right code. 😆

—Reply to this email directly, view it on GitHub, or unsubscribe.You are

receiving this because you authored the thread.Message ID: @.***>

Reply to this email directly, view it on GitHub

https://github.com/Ashar25/RainyDay/issues/41#issuecomment-1703577745,

or unsubscribe

https://github.com/notifications/unsubscribe-auth/AYU4O4E5EUUO546FMPTNFNDXYJ5JRANCNFSM6AAAAAA4H4Q4SA

.

You are receiving this because you commented.Message ID:

@.***>

—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you authored the thread.Message ID: @.***>

Ashar25 commented 10 months ago

https://github.com/Ashar25/RainyDay/tree/RainyDayRefctoring/Source

On Sat, Sep 2, 2023, 08:24 Daniel Lassiter @.***> wrote:

Can you send me a link to the exact source folder I should be using please?Sent from my iPhoneOn Sep 1, 2023, at 9:49 PM, Ashar25 @.***> wrote: I think there is a mistake. Use the 'Rainydayrefctoring' one. This one is

the most updated version.

On Fri, Sep 1, 2023, 19:34 Daniel Lassiter @.***> wrote:

RainyDay/Source at Onestormperfile · Ashar25/RainyDaygithub.comI am using

the code from this repo. This is the link Dan sent originally. Is this the

wrong one?Sent from my iPhoneOn Sep 1, 2023, at 5:00 PM, Ashar25

@.***> wrote:

I don't think you are using the right code. 😆

—Reply to this email directly, view it on GitHub, or unsubscribe.You are

receiving this because you authored the thread.Message ID: @.***>

Reply to this email directly, view it on GitHub

https://github.com/Ashar25/RainyDay/issues/41#issuecomment-1703577745,

or unsubscribe

< https://github.com/notifications/unsubscribe-auth/AYU4O4E5EUUO546FMPTNFNDXYJ5JRANCNFSM6AAAAAA4H4Q4SA>

.

You are receiving this because you commented.Message ID:

@.***>

—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you authored the thread.Message ID: @.***>

— Reply to this email directly, view it on GitHub https://github.com/Ashar25/RainyDay/issues/41#issuecomment-1703833296, or unsubscribe https://github.com/notifications/unsubscribe-auth/AYU4O4E6T664N7VUIOZKCUDXYMXRHANCNFSM6AAAAAA4H4Q4SA . You are receiving this because you commented.Message ID: @.***>

lassiterdc commented 10 months ago

The issue was still present in that version but I was able to fix it and complete an error-free run that seems to have successfully generated a storm catalog.

I added a couple lines of code in two locations in RainyDay_functions.py that would fix this in a way that is robust for different datasets. The modifications are the last two lines in each code chunk. I didn't wanna give line numbers since I've made some other unrelated modifications that may have changed the spacing.

def findsubbox(inarea,variables,fname):
    outextent = np.empty([4])
    outdim=np.empty([2], dtype= 'int')
    infile=xr.open_dataset(fname)
    latmin,latmax,longmin,longmax = inarea[2],inarea[3],inarea[0],inarea[1]
    rain_name,lat_name,lon_name = variables.values()
    # mod
    if max(infile[lon_name].values) > 180: # convert from positive degrees west to negative degrees west
        infile[lon_name] = infile[lon_name] - 360
def readnetcdf(rfile,variables,inbounds=False,dropvars=False):
    """
    Used to trim the dataset with defined inbounds or transposition domain

    Parameters
    ----------
    rfile : Dataset file path ('.nc' file)
        This is the path to the dataset
    variables : TYPE
        DESCRIPTION.
    inbounds : TYPE, optional
        DESCRIPTION. The default is False.

    Returns
    -------
    TYPE
        DESCRIPTION.

    """
    if dropvars==False:
        infile=xr.open_dataset(rfile)
    else:
        infile=xr.open_dataset(rfile,drop_variables=dropvars)  # added DBW 07282023 to avoid reading in unnecessary variables
    rain_name,lat_name,lon_name = variables.values()
    # mod
    if max(infile[lon_name].values) > 180: # convert from positive degrees west to negative degrees west
        infile[lon_name] = infile[lon_name] - 360 
Ashar25 commented 10 months ago

Thanks for rectifying the issue. I can add these two lines myself and then push them to the repository. We can close this issue once we test it. I am hopeful this will just work fine.