ldeo-glaciology / xapres

package for processing ApRES data using xarray
MIT License

Xarrays stored on bucket have profiles not formatted for complex64 #12

Closed glugeorge closed 1 year ago

glugeorge commented 1 year ago

I've been trying to unpack why our Python scripts are so different from the MATLAB ones. I'm still working on converting the MATLAB vertical velocity calculations to Python. But before diving too deep into this, I noticed a problem when loading the existing xarrays stored as zarrs with this code:

    import xarray as xr

    def reload(site):
        filename = f'gs://ldeo-glaciology/apres/greenland/2022/single_zarrs/{site}'
        ds = xr.open_dataset(filename, engine='zarr', chunks={})
        return ds

    ds_101 = reload("A101")

ds_101 looks like this:

image

And indeed the profile variable has lost its complex component.

On the other hand, if I use the load_all function to access the .DAT files directly from the cloud with this command:

    xa = ApRESDefs.xapres(max_range=1400)
    xa.load_all(directory='gs://ldeo-glaciology/GL_apres_2022/A101',
                remote_load=True,
                file_numbers_to_process=range(0, 2))
    xa.data

the complex nature of the profile is preserved.

image

So it seems like there's some bug that only keeps the real portion of the profile during the process of converting them into zarrs.
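One way to confirm that the imaginary part is really gone, rather than just hidden by how the dataset is displayed, is to test the values directly. The sketch below is only an illustration: `imaginary_part_is_lost` is a hypothetical helper, and the arrays are synthetic stand-ins rather than real ApRES profiles.

```python
import numpy as np

def imaginary_part_is_lost(values):
    """Return True if `values` has a complex dtype but every imaginary part is zero."""
    arr = np.asarray(values)
    return bool(np.iscomplexobj(arr) and not np.any(arr.imag))

# Synthetic stand-ins for a loaded `profile` variable (not real ApRES data).
healthy = np.array([1.0 + 2.0j, -0.5 + 0.25j])
# Same real parts, complex dtype, but the imaginary parts are all zero.
stripped = healthy.real.astype(np.complex128)

print(imaginary_part_is_lost(healthy))
print(imaginary_part_is_lost(stripped))
```

On a real dataset, something like `imaginary_part_is_lost(ds_101.profile.values)` should work the same way. Since `.values` loads the data into memory, it is probably safer to select a small subset of the variable first.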

glugeorge commented 1 year ago

It could also be an issue with how I'm loading it in; I'm not too familiar with how to access zarrs.

jkingslake commented 1 year ago

So they come out as reals when you load in the zarrs? Or are they complex numbers with a different precision than 64 bit?

On Sun, Jan 29, 2023, 5:59 PM George Lu @.***> wrote:

It could also be an issue with loading it in, I'm not too familiar with how to access zarrs

glugeorge commented 1 year ago

They come out as reals. Even though the description says complex-type, the actual values are just real floats.

image

glugeorge commented 1 year ago

I found the bug. In the 3rd cell of write_big_zarrs.ipynb:

image

write_big_zarrs assigns float64 to everything. I think we may need to re-run this process to get xarrays with the proper complex128 dtype for the profile and profile_stacked variables.
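To illustrate why a blanket float64 setting would produce exactly this symptom, here is a minimal sketch. It assumes the encoding step boils down to a plain NumPy cast, which may not be exactly what happens inside xarray or zarr. The array is made up for the example.

```python
import warnings
import numpy as np

# A small complex array standing in for `profile` (synthetic values).
z = np.array([3.0 + 4.0j, 1.0 - 2.0j], dtype=np.complex128)

with warnings.catch_warnings():
    # NumPy warns that the imaginary part is being discarded; silence it here.
    warnings.simplefilter("ignore")
    as_float = z.astype(np.float64)

print(as_float)        # only the real parts remain
print(as_float.dtype)  # float64
```

The cast succeeds without raising an error, which would explain how the loss went unnoticed until the zarrs were read back.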

A quick fix for this would be to add:

    encoding['profile'] = {"dtype": "complex128"}
    encoding['profile_stacked'] = {"dtype": "complex128"}

after the buggy line above.
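A slightly more general version of that quick fix builds the encoding from each variable's dtype, so complex variables are never assigned a real dtype. This is a sketch under assumptions: `build_encoding` is a hypothetical helper, not part of xapres, and `example_real_var` is an invented variable name.

```python
import numpy as np

def build_encoding(dtypes, real_dtype="float64"):
    """Map variable name -> encoding dict, leaving complex variables complex."""
    encoding = {}
    for name, dtype in dtypes.items():
        dtype = np.dtype(dtype)
        if np.issubdtype(dtype, np.complexfloating):
            # Keep the variable's own complex dtype (e.g. "complex128").
            encoding[name] = {"dtype": dtype.name}
        elif np.issubdtype(dtype, np.floating):
            encoding[name] = {"dtype": real_dtype}
        # Other dtypes (integers, strings, ...) are left to the defaults.
    return encoding

example_dtypes = {
    "profile": np.complex128,
    "profile_stacked": np.complex128,
    "example_real_var": np.float32,  # hypothetical real-valued variable
}
encoding = build_encoding(example_dtypes)
print(encoding)
```

For a real dataset the input could be built with `{name: var.dtype for name, var in ds.data_vars.items()}` and the result passed as `ds.to_zarr(store, encoding=encoding)`. Whether the zarr backend accepts a complex dtype in `encoding` is worth verifying on a small test store first.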

glugeorge commented 1 year ago

Removed the encoding line in write_big_zarrs.ipynb. New zarrs can be accessed with:

    def reload(site):
        filename = f'gs://ldeo-glaciology/apres/greenland/2022/single_zarrs_noencode/{site}'
        ds = xr.open_dataset(filename, engine='zarr', chunks={})
        return ds

    ds_101 = reload("A101")
    ds_103 = reload("A103")
    ds_104 = reload("A104")