Depth normalisation - Githubissues

adele-morrison commented 7 months ago

As discussed in the hackathon today, different points on the isobath have different max depths, due to the discrete horizontal grid. Currently we are just doing straight depth averaging for the correlation plots. This means that some bottom points with depth 800m are being averaged with other points at 800m that are actually 400m above the bottom (where max depth = 1200m). It may be better to normalise depth and then average, so we always average bottom points together etc.

@ongqingyee is going to check what difference this makes for Figure 2.
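For illustration, here is a minimal sketch of the normalisation idea, not the project's actual code. It assumes each profile is NaN below the sea floor and defines normalised depth as depth divided by the deepest level with data (so 1 is always the bottom). The function name `normalise_profile` and the choice of 11 normalised levels are hypothetical.

```python
import numpy as np

def normalise_profile(values, depth, n_levels=11):
    """Interpolate one profile onto a normalised depth axis (1 = deepest valid level).

    values: 1-D array, NaN below the sea floor (assumed).
    depth:  1-D array of level depths in metres, increasing.
    """
    valid = np.isfinite(values)
    if not valid.any():
        return np.full(n_levels, np.nan)
    max_depth = depth[valid].max()        # deepest level with data at this point
    sigma = depth[valid] / max_depth      # normalised depth of each valid level
    target = np.linspace(0.0, 1.0, n_levels)
    return np.interp(target, sigma, values[valid])

# Two synthetic profiles whose value equals their depth, to make the result easy to read
depth = np.arange(100.0, 1300.0, 100.0)             # 100 m ... 1200 m
shallow = np.where(depth <= 800.0, depth, np.nan)   # bottom at 800 m
deep = depth.copy()                                 # bottom at 1200 m

norm_shallow = normalise_profile(shallow, depth)
norm_deep = normalise_profile(deep, depth)

# The last normalised level now pairs the two bottom values (800 m and 1200 m),
# instead of pairing the 800 m bottom with a mid-column value.
bottom_mean = np.mean([norm_shallow[-1], norm_deep[-1]])
```

With straight depth averaging, the 800m level would combine a bottom value with a value 400m above the bottom; on the normalised axis the last level always holds bottom values.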

taimoorsohail commented 7 months ago

For reference - this is what the monthly CSHT and U_along look like in terms of their max depth:

[Screenshot: max depth of the monthly CSHT and U_along]

This data is pulled from:

import xarray as xr

# Load the along-slope velocity field
u_along = xr.open_mfdataset('/g/data/v45/wf4500/ASC_project_files/Binned_ASC_speed/OM2_IAF/Binned_Antarctic_slope_contour_1km_velocities*')
# Load the CSHT field
CSHT_along = xr.open_mfdataset(
    '/g/data/v45/wf4500/ASC_project_files/Cross_slope_heat_transport/OM2_IAF/daily_z/*')

willaguiar commented 7 months ago

Ok, there seems to be a big mismatch in the isobath between the CSHT and U at ~-80. I'm going to check the code that produced the CSHT to see if the same isobath is used.

PS: Is the plot on the left with the binned or unbinned CSHT? @taimoorsohail

adele-morrison commented 7 months ago

I don't think I was following the discussion on this closely before. Why is the max depth of the CSHT and u_along different at each longitude? I thought we used the same isobath for both?

taimoorsohail commented 7 months ago

Yes, this is also a point of confusion for me. I've highlighted the files I drew this monthly data from above, so Will if you could check out why they are different that would be great!

I won't produce the normalised, monthly-averaged u_along and CSHT files until this mismatch is resolved @Ellie Ong.

taimoorsohail commented 7 months ago

@willaguiar The data isn't binned, it's the raw longitudes I believe.

EDIT: It's the binned data - sorry for the confusion!

willaguiar commented 7 months ago

OK, I checked the isobath used in the ASC code (line 291) and in the CSHT code (line 260), and they come from the same file.

I also plotted the vars in the dirs pointed to above. Even though they don't look identical, the binned variables look very similar.

@taimoorsohail did you use a specific time slice for that plot? My plots above are averages, so perhaps they are missing some peak deep value happening in one time slice, or missing something else.

edit with unbinned data:

I don't see that deep value in the unbinned data either (although another problem I see is that the unbinned data seem to go shallower than the binned data).

taimoorsohail commented 7 months ago

Thanks @willaguiar! I just plotted the first month in the monthly binned data for CSHT and U_along. Shouldn't they have the same isobath definition regardless of if it is time-averaged or if it is instantaneous? Here's the code snippet that pulls the CSHT, does the monthly averaging, and plots the first month:

import matplotlib.pyplot as plt
import xarray as xr

# Load the daily CSHT and take monthly means
CSHT_along = xr.open_mfdataset(
    '/g/data/v45/wf4500/ASC_project_files/Cross_slope_heat_transport/OM2_IAF/daily_z/*')
CSHT_along_monthly = CSHT_along.resample(time='1M').mean()
# Total CSHT = binned cross-slope heat transport + zonal convergence
CSHT_months = ((CSHT_along_monthly.binned_cross_slope_heat_trans+CSHT_along_monthly.zonal_convergence)*0.08).rename({'st_ocean':'depth', 'lon_bin_midpoints':'lon'})

fig, axs = plt.subplots(nrows=1, ncols=2, figsize=(10,5))
axs = axs.ravel()
plt.subplots_adjust(hspace = 1.2, wspace=0.2)

# Plot the first month of each field, hiding zero-valued CSHT cells
CSHT_months.where(CSHT_months!=0).isel(time=0).plot(ax=axs[0], cmap = plt.cm.Greens)
axs[0].invert_yaxis()
axs[0].set_ylim(2000,0)
axs[0].set_title('CSHT')
# u_along_da_monthly (monthly-mean U_along) is defined outside this snippet
u_along_da_monthly.isel(time=0).plot(ax=axs[1], cmap = plt.cm.Blues)
axs[1].invert_yaxis()
axs[1].set_ylim(2000,0)
axs[1].set_title('U_along')
plt.show()

It could have to do with the fact that I add the zonal convergence, which could go deeper, to get the total CSHT?

willaguiar commented 7 months ago

Ok, it seems to me that those deep values come from the Zonal Convergence... not sure why though (checking it).

willaguiar commented 7 months ago

OK, I was looking at the code for the Zonal Convergence. The ZC is calculated from the difference in x-transports not on the 1000m isobath, but at the points adjacent to the edges of each 3deg bin. So I think what is happening here is that the depths at the edges of certain bins can be greater than the depths of the points inside the bins, creating the deep ZC values. Does that make sense @adele-morrison?

If so, perhaps the best thing to do is to mask out any ZC points that do not have CSHT in them.
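The masking suggested here could look roughly like the sketch below. It is only an illustration under stated assumptions: the arrays are laid out as (depth, lon bin), cells without CSHT are stored as 0 or NaN (the plotting code in this thread hides zeros, which suggests zeros are used), and the helper name `total_csht_masked` is hypothetical.

```python
import numpy as np

def total_csht_masked(csht, zc):
    """Return CSHT + ZC, keeping only cells where the binned CSHT has data."""
    has_csht = np.isfinite(csht) & (csht != 0)   # assumed "no data" markers: 0 or NaN
    # Treat a missing ZC as zero so it cannot remove valid CSHT cells
    return np.where(has_csht, csht + np.nan_to_num(zc), np.nan)

# Synthetic (depth, lon bin) arrays: the first bin has CSHT in the top level only,
# while ZC extends deeper because it is taken at the bin edges.
csht = np.array([[1.0, 2.0],
                 [0.0, 3.0],
                 [0.0, 0.0]])
zc = np.array([[0.5, 0.5],
               [4.0, 0.5],
               [9.0, 0.0]])

total = total_csht_masked(csht, zc)
```

In this example the two deep ZC-only cells in the first bin become NaN, and the cells with CSHT keep the sum CSHT + ZC.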

willaguiar commented 7 months ago

I talked to @adele-morrison in person about this issue today. It seems reasonable that the ZC would have deeper values, as they are not taken from the exact same points as the binned CSHT. The only problem that could arise is in the vertical profile analysis (where values at z>1200m would come from pure ZC), but since we are cutting off our analysis at ~1000m depth, that should not be a problem.

One thing to think about, though, is whether having these deep ZC-driven transports would affect the GMM classification (probably not much? - @taimoorsohail)

taimoorsohail commented 7 months ago

Hmmm, does that mean we aren't going to vertically normalise the CSHT and U_along profiles anymore, and will instead cut them off at exactly 1000m? I think as we are calculating correlations between U_along and CSHT+ZC, the values of ZC that are deeper than ~1000m will not contribute to the correlation at all (as U_along will be NaN at these points). So why don't we just NaN out CSHT+ZC where the contribution is ZC only, then normalise the resulting profiles of CSHT+ZC and U_along?
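The argument that deep ZC-only values cannot contribute can be checked with a small sketch. The thread does not say how the correlations are computed, so this assumes a Pearson correlation over the points where both fields are finite; `nan_corr` is a hypothetical helper and the numbers are synthetic.

```python
import numpy as np

def nan_corr(a, b):
    """Pearson correlation over the points where both inputs are finite."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    both = np.isfinite(a) & np.isfinite(b)
    if both.sum() < 2:
        return np.nan
    return np.corrcoef(a[both], b[both])[0, 1]

# One synthetic profile: U_along is NaN at the two deepest levels,
# where the total transport is ZC only.
u_along = np.array([0.1, 0.2, 0.4, np.nan, np.nan])
csht_zc = np.array([1.0, 2.5, 3.0, 50.0, -20.0])

# The same total with the ZC-only levels set to NaN
csht_zc_masked = np.where(np.isfinite(u_along), csht_zc, np.nan)

r_raw = nan_corr(u_along, csht_zc)
r_masked = nan_corr(u_along, csht_zc_masked)
```

Both calls use exactly the same pairs of points, so the two correlations agree. Masking the ZC-only levels therefore matters for the normalisation step rather than for the correlation itself.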

adele-morrison commented 7 months ago

Looking at the unbinned plot of CSHT above, it has a much larger range of max depths than the binned. I think it would still be good to test the normalisation, but we definitely want to use the max depths from the unbinned data to do that. The binning process doesn't change the depth of the CSHT values, it just possibly adds in deeper values from the ZC terms. So perhaps we could test throwing out any ZC data deeper than the max depth in the unbinned CSHT and normalising what's left that's shallower than the max depth of the unbinned CSHT?

taimoorsohail commented 7 months ago

OK - that sounds like a good plan. @willaguiar do you have the path to the unbinned CSHT and U_along (which you plotted above)? I will try to do the normalisation with the unbinned CSHT+ZC data, throwing out the values of ZC that are deeper than the max depth of CSHT at each longitude.

Just confirming - does this mean the daily and monthly analyses should be done with unbinned data rather than the binned data? Shifting from binned to unbinned data (even if we cut off the correlation at 1000m, removing the influence of the deep ZC values) will have an impact on the correlations, right?

taimoorsohail commented 7 months ago

> One thing to think about it tho is whether having these deep ZC-driven transports would affect the GMM classification ( probably not much? - @taimoorsohail )

I classified with GMM using only the binned, monthly U_along values, so CSHT doesn't affect my classification at all.

willaguiar commented 7 months ago

I'm not sure I understand completely - is the idea to rerun the normalization + GMM with the unbinned data? The only problem is that the ZC is calculated at the edges of the bins, and therefore only exists at the binned data points. So I'm not sure how it would be possible to filter the ZC to data points where the unbinned CSHT exists.

In any case, here are the dirs for the unbinned data:

unbinned ASC speed: /g/data/v45/wf4500/ASC_project_files/ASC_speed/OM2_IAF/*
CSHT should be the same files, but var: unbinned_heat_transp_across_contour

adele-morrison commented 7 months ago

Hmm, yeah, ok. What about finding the max depth for normalisation from the unbinned data in each bin? But then normalising the binned data. Could either find the mean isobath depth or max isobath depth in each bin?

taimoorsohail commented 7 months ago

But then we'd have a bunch of NaNs at the bottom of the normalised profiles, which correspond to the deeper grid cells in the unbinned data that don't have any data in the binned array? If we are normalising the binned data at the end of the day, I don't see how incorporating the max depths from the unbinned CSHT will affect our result, other than adding a bunch of NaNs at the bottom of every normalised profile... maybe I'm misunderstanding...?

adele-morrison commented 7 months ago

Hmm I'm not sure I understand, maybe we should zoom?

adele-morrison commented 7 months ago

I think we'd have to use the max isobath depth in each bin, then any data beneath that comes from the ZC and we just throw that out when normalising.
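A rough sketch of this suggestion, under stated assumptions: arrays are laid out as (depth, lon), missing data are 0 or NaN, and the bins are defined by a hypothetical `bin_edges` array. The helper names are illustrative and this is not the project's binning code.

```python
import numpy as np

def max_depth_per_bin(unbinned, depth, lon, bin_edges):
    """Deepest level with unbinned data in each longitude bin (NaN if the bin is empty)."""
    has_data = np.isfinite(unbinned) & (unbinned != 0)
    # Deepest valid level at each unbinned longitude point (-inf if no data)
    point_max = np.where(has_data, depth[:, None], -np.inf).max(axis=0)
    bin_index = np.digitize(lon, bin_edges) - 1
    out = np.full(len(bin_edges) - 1, np.nan)
    for i in range(len(out)):
        in_bin = point_max[bin_index == i]
        in_bin = in_bin[np.isfinite(in_bin)]
        if in_bin.size:
            out[i] = in_bin.max()
    return out

def mask_below(binned, depth, bin_max_depth):
    """NaN out binned values deeper than the per-bin maximum isobath depth."""
    keep = depth[:, None] <= bin_max_depth[None, :]
    return np.where(keep, binned, np.nan)

depth = np.array([200.0, 400.0, 600.0, 800.0, 1000.0, 1200.0])
lon = np.array([-151.5, -150.5, -148.0, -147.0])
bin_edges = np.array([-152.0, -149.0, -146.0])       # two 3-degree bins

# Unbinned columns reach 600, 800, 1000 and 400 m respectively
column_max = np.array([600.0, 800.0, 1000.0, 400.0])
unbinned = np.where(depth[:, None] <= column_max[None, :], 1.0, np.nan)

# Binned CSHT+ZC with values at every level (the deep ones are ZC only)
binned = np.ones((depth.size, 2))

bin_max = max_depth_per_bin(unbinned, depth, lon, bin_edges)
binned_masked = mask_below(binned, depth, bin_max)
```

Here the first bin keeps data down to 800m and the second down to 1000m; anything deeper is treated as ZC-only and dropped before normalising.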

taimoorsohail commented 7 months ago

Yes let's chat over Zoom.

taimoorsohail commented 7 months ago

OK Adele and I just Zoomed about this. It seems there are two steps we need to accomplish.

1) The binning process combines a series of adjacent, shallow CSHT profiles (see around -150 in the unbinned data). In the process of this binning, it may combine only one or two data points at the very bottom, adding noise to the solution. We need to try to avoid this by reducing the width of the bins in the longitude binning. However, we are concerned this will add noise, so we need to test how much of a difference this will make to the correlations. I will make an issue about this now.

2) If we do reduce our bin size without major impacts on our correlations, I will then normalise the newly binned CSHT+ZC and U_along profiles, taking care to mask out the depths where CSHT=0.

This will feed into the monthly analysis by @ongqingyee, the coarsening work by @fabiobdias and the daily analysis by @willaguiar [assuming this is all done using the binned data...?].
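One way to see the concern in step 1 is to count how many unbinned points contribute to each (depth, bin) cell for different bin widths. The sketch below is illustrative only: it assumes a (depth, lon) layout with 0 or NaN for missing data, and the function name and bin edges are hypothetical.

```python
import numpy as np

def points_per_cell(unbinned, lon, bin_edges):
    """Count unbinned points with data in each (depth, longitude-bin) cell."""
    has_data = np.isfinite(unbinned) & (unbinned != 0)
    bin_index = np.digitize(lon, bin_edges) - 1
    n_bins = len(bin_edges) - 1
    counts = np.zeros((unbinned.shape[0], n_bins), dtype=int)
    for i in range(n_bins):
        counts[:, i] = has_data[:, bin_index == i].sum(axis=1)
    return counts

# Three neighbouring points: two shallow columns and one deep column
lon = np.array([-151.5, -150.5, -149.5])
unbinned = np.array([[1.0, 1.0, 1.0],
                     [np.nan, np.nan, 1.0],
                     [np.nan, np.nan, 1.0]])

wide = points_per_cell(unbinned, lon, np.array([-152.0, -149.0]))                   # one 3-degree bin
narrow = points_per_cell(unbinned, lon, np.array([-152.0, -151.0, -150.0, -149.0])) # three 1-degree bins
```

In the wide bin the two deepest levels are supported by a single point while the top level has three. With narrower bins every cell that has data holds one point, so the uneven weighting with depth disappears, at the cost of fewer points per bin overall.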

willaguiar / ASC_and_heat_transport

Depth normalisation #31