Closed derekpickell closed 2 weeks ago
Hello @derekpickell! I just wanted to acknowledge I saw this post (thanks for reaching out!) and am wondering why this is unexpected behavior? It could be that adding h_rms_misfit is increasing one of the dataset dimensions, which would tend to increase the number of nans as Xarray pads out the data to take this new shape.
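A minimal sketch of the padding effect described above, using synthetic data rather than real ICESat-2 granules. The values and the `delta_time` coordinate are made up for illustration, and the outer join is requested explicitly. This is not a reproduction of what icepyx does internally.

```python
import xarray as xr

# Two synthetic variables that share a dimension name but not the same
# coordinate values (illustrative numbers, not real ATL06 data).
h_li = xr.DataArray(
    [10.0, 11.0, 12.0],
    dims="delta_time",
    coords={"delta_time": [0, 1, 2]},
    name="h_li",
)
h_rms_misfit = xr.DataArray(
    [0.1, 0.2, 0.3],
    dims="delta_time",
    coords={"delta_time": [1, 2, 3]},
    name="h_rms_misfit",
)

# An outer join keeps the union of coordinate values, so each variable is
# padded with NaN wherever it has no value for a given coordinate.
ds = xr.merge([h_li, h_rms_misfit], join="outer")

nan_counts = {name: int(ds[name].isnull().sum()) for name in ds.data_vars}
print(ds.sizes["delta_time"], nan_counts)
```

Here neither input contains a NaN, yet the merged dataset has one per variable because the dimension grew from three to four entries.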
A few questions that will make it easier for me to diagnose if there's an issue:
h_rms_misfit variable specifically, or any number of variables >8?
Hi @JessicaS11,
Thank you for the response! To answer your questions:
Thanks for these answers. I've dug in a bit more and now suspect that the issue is not the number of variables you're reading, but which variables. The note on which ones you've experimented with was a clue. h_rms_misfit, bsnow_h, and cloud_flg_asr are all more deeply nested variables than (for instance) h_li (if you look at the variable paths, they have either geophysical or fit_statistics after the land_ice_segments layer). If you look at the resulting dataset for a single file after reading in two versus three of the above specific variables, the coordinates attached to the variable are different. What's happening behind the scenes is essentially that icepyx does all of the individual group reads with xarray and then tries to cleverly merge the per-group dataarrays together into one dataset. As you've noted, this doesn't always work! Handling (generically) the multiple layers of nesting is an ongoing challenge in icepyx, so thanks for reporting this case we missed.
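To make the nesting difference concrete, here is a small sketch that groups variable paths by their parent group. The `gt1l` beam prefix and the exact path strings are assumptions based on the usual ATL06 layout, so check them against your own files. The helper `parent_group` is hypothetical and not part of icepyx.

```python
from collections import defaultdict

# Assumed ATL06 variable paths for one beam (verify against your granule).
paths = [
    "gt1l/land_ice_segments/h_li",
    "gt1l/land_ice_segments/fit_statistics/h_rms_misfit",
    "gt1l/land_ice_segments/geophysical/bsnow_h",
    "gt1l/land_ice_segments/geophysical/cloud_flg_asr",
]


def parent_group(path):
    """Return the group part of a variable path (everything before the name)."""
    return path.rsplit("/", 1)[0]


# Collect variable names under the group they are read from.
by_group = defaultdict(list)
for p in paths:
    by_group[parent_group(p)].append(p.rsplit("/", 1)[1])

for group, names in by_group.items():
    # Depth is the number of group levels above the variable.
    print(group.count("/") + 1, group, names)
```

Variables that land in different groups are read separately, which is where the per-group merge described above comes into play.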
I think I've isolated where in the code the issue is happening (lines 816-822 or so in the read module, so could also be in one of the functions called therein), but I haven't yet figured out what the solution might be (any suggestions welcome!). I'll continue to work on resolving this as time allows, but any assistance would be greatly appreciated.
Hello @derekpickell! I have good news and bad news. Good news is that the bug I identified, where not all dimensions were being applied to the deeper nested variables of interest, is fixed via #623. Bad news is I don't think this was actually the problem you noted.
When I dug in further, I found a granule that only has nan values for some variables. However, it seems like only bsnow_h fits into this category, not cloud_flg_asr or h_rms_misfit. If I'm not mistaken, in some situations the blowing snow algorithm is unable to confidently quantify blowing snow, which would result in no blowing snow values. @mikala-nsidc (ICESat-2 support specialist at NSIDC) or @tsutterley (one of the ATL06 product leads), can you confirm that in some cases no bsnow_h (and thus all nans) is expected behavior for ATL06 granules?
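One way to check which variables in a dataset are entirely nan, sketched here on a synthetic dataset. The values are invented for illustration and only mimic the situation described above, where bsnow_h has no valid values in a granule.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a dataset read from one granule (not real data).
ds = xr.Dataset(
    {
        "h_li": ("delta_time", [10.0, np.nan, 12.0]),
        "bsnow_h": ("delta_time", [np.nan, np.nan, np.nan]),
    },
    coords={"delta_time": [0, 1, 2]},
)

# Variables with no valid values at all.
all_nan = [name for name in ds.data_vars if bool(ds[name].isnull().all())]

# Fraction of missing values per variable, to separate "a few nans here
# and there" from "nothing but nans".
nan_fraction = {name: float(ds[name].isnull().mean()) for name in ds.data_vars}

print(all_nan, nan_fraction)
```

Running the same two expressions on a dataset returned by a real read would show whether the extra nans are concentrated in specific variables.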
@JessicaS11 wow amazing thank you. It looks like everything 'makes sense' with the data I am looking at: few nans here and there, but no large gaps where I wouldn't expect them.
@derekpickell Excellent! I'm going to close this issue as resolved, but feel free to comment again if need be. Would you be able/willing to do a PR review for #623?
Hi there,
I'm playing around with a basic read of locally downloaded .h5 files:
It seems when I add just one more variable to the list, e.g., 'h_rms_misfit', the number of 'nans' in the returned 'ds' xarray increases for no apparent reason, sometimes for all variables.
icepyx v1.3.0
Thank you!