ldeo-glaciology / xapres

package for processing ApRES data using xarray
MIT License

xa.load_all not loading files past the first #8

Closed: glugeorge closed this issue 1 year ago

glugeorge commented 1 year ago

This works:

xa.load_all(directory='gs://ldeo-glaciology/GL_apres_2022', remote_load=True, file_numbers_to_process=[1], bursts_to_process=[0,1])

As does this:

xa.load_all(directory='gs://ldeo-glaciology/GL_apres_2022', remote_load=True, file_numbers_to_process=[1,1], bursts_to_process=[0,1])

But these do not:

xa.load_all(directory='gs://ldeo-glaciology/GL_apres_2022', remote_load=True, file_numbers_to_process=[1,2], bursts_to_process=[0,1])

xa.load_all(directory='gs://ldeo-glaciology/GL_apres_2022', remote_load=True, file_numbers_to_process=[2], bursts_to_process=[0,1])

It seems to be the same with all the other files. I did make some changes, but those changes (adding uncertainty) seem to be fine when loading just the one file.
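For completeness, here is the surrounding setup these calls run from (a sketch; the constructor arguments match the debug run further down this thread, and the import of ApRESDefs is just how I have the package locally):

```python
# Sketch of the full context for the calls above.
# ApRESDefs import and constructor arguments are assumed from the debug run below.
import ApRESDefs

xa = ApRESDefs.xapres(loglevel='debug', max_range=1400)

# Loads fine:
xa.load_all(directory='gs://ldeo-glaciology/GL_apres_2022',
            remote_load=True,
            file_numbers_to_process=[1],
            bursts_to_process=[0, 1])

# Hangs: any request that includes a file beyond the first.
xa.load_all(directory='gs://ldeo-glaciology/GL_apres_2022',
            remote_load=True,
            file_numbers_to_process=[1, 2],
            bursts_to_process=[0, 1])
```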

glugeorge commented 1 year ago

I will add my changes as a pull request shortly and discuss them with @jkingslake. There are some structural changes and some questions about how uncertainty comes into play. I think we are close to getting a series of vertical velocity measurements.

jkingslake commented 1 year ago

The notebooks could be out of date in terms of which directories they are looking in (I moved the ApRES DAT files around in the bucket towards the end of all that work). The code snippet here has the directory to load from. Can you try the same tests loading from there?

glugeorge commented 1 year ago

That snippet loads already-saved xarrays as zarrs, if I'm interpreting it correctly? Does this mean we are sticking with the existing xarray structure, meaning that I shouldn't make any changes to functions like _burst_to_xarray in the xarray class? That's fine, I can adapt to using this structure. I had initially been writing my update so that uncertainty is added to the xarray at the stage where we're still loading in the DAT files.
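For what it's worth, this is roughly what I have in mind for that loading step, sketched with a placeholder zarr path (I haven't checked the actual path in the bucket, and this may differ from the snippet you linked):

```python
# Rough sketch of loading an already-saved xarray dataset from the bucket as zarr.
# The zarr path below is a placeholder, not the actual location in the bucket.
import gcsfs
import xarray as xr

fs = gcsfs.GCSFileSystem()
mapper = fs.get_mapper('gs://ldeo-glaciology/GL_apres_2022/some_saved_dataset.zarr')  # hypothetical path
ds = xr.open_zarr(mapper)

# From here, additional processing (e.g. adding uncertainty) can be done on `ds`
# without re-reading the raw DAT files.
```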

jkingslake commented 1 year ago

oh yeah, you are right!

jkingslake commented 1 year ago

What's the traceback for the ones which don't work?

glugeorge commented 1 year ago

I think I've reverted back to the existing scripts, and it still hangs.

filepaths = xa.list_files(directory='gs://ldeo-glaciology/GL_apres_2022', remote_load=True)

This shows that there are 386 files available.

xa = ApRESDefs.xapres(loglevel='debug', max_range=1400)
xa.load_all(directory='gs://ldeo-glaciology/GL_apres_2022', remote_load=True, file_numbers_to_process=[1,2], bursts_to_process=[0,1])

This is the script that gets stuck, more specifically at ApRESDefs.py, function load_all, line 230:

Load dat file ldeo-glaciology/GL_apres_2022/A101/CardA/DIR2022-05-26-1536/DATA2022-05-27-1506.DAT

There's no traceback error; it just doesn't continue past here. After some digging around, it looks like it's load_dat_file that gets stuck.

It's weird because the first file gets loaded fine. If you have the time, can you see if you're able to load it as well? It could just be some odd network issue or something.
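If it helps, here is a rough sketch of how I'd check whether the raw transfer itself is slow, rather than something hanging inside load_dat_file. The path is the one from the debug log above; the gcsfs usage is just an assumption about how the file could be fetched directly, not how the package itself does it:

```python
# Quick check: time the raw download of the DAT file that load_dat_file stalls on,
# to distinguish a slow transfer from an actual hang.
import time
import gcsfs

fs = gcsfs.GCSFileSystem()
path = 'gs://ldeo-glaciology/GL_apres_2022/A101/CardA/DIR2022-05-26-1536/DATA2022-05-27-1506.DAT'

t0 = time.time()
with fs.open(path, 'rb') as f:
    data = f.read()
print(f"read {len(data) / 1e6:.1f} MB in {time.time() - t0:.1f} s")
```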

glugeorge commented 1 year ago

OK, it seems like I am just impatient; it just takes a while to load (still unusual, it doesn't usually take this long). Generally resolved though.

I think, given how slow this is, it makes more sense to leave the already-created xarrays on the bucket and just move forward with loading those in and doing additional processing for my analysis?

jkingslake commented 1 year ago

OK, I was just writing back to say that it is working for me, but the second file is larger, so maybe it's taking longer.

> I think, given how slow this is, it makes more sense to leave the already-created xarrays on the bucket and just move forward with loading those in and doing additional processing for my analysis?

It does take 12 hours or something to run the whole thing, but does the speed make a big difference to you? You could test the new capabilities on a subset, then rerun the whole lot when you are happy, right?

glugeorge commented 1 year ago

Yeah, that was my plan initially. I'll keep doing what I have so far, then, and adapt it if needed. I think either way will end up with more or less the same time consumption.