SUMMER 2023: Roadmap/Project Organization

ocean-transport / argo-intern

Andrew's project


SUMMER 2023: Roadmap/Project Organization #10

Closed andrewfagerheim closed 1 year ago

andrewfagerheim commented 1 year ago

This issue contains meeting notes, action items, and updates starting May 2023. For similar issues from previous semesters see:

andrewfagerheim commented 1 year ago

Current List of Tasks, 15 May 2023

Admin:

[x] apply to UMass Boston BGC-Argo workshop [due 20 May] [responses document]
[x] apply to UW Seattle oceanography program [due 01 June] [responses document]
[x] figure out how to get paid, the email chains between me, Yana, Monica, and Dante are getting wildly confusing

Functions:

[x] create function for loading boxes that will:
- load points based on given latitude, longitude, and depth
- convert the points to xarray
- convert from points to a profile
- interpolate based on given depths and sample rate (to a 2m grid)
- add spice as a variable
[x] make functions to quickly add main plots:
- 4 panels from CCS poster (mean density, mean spice, isopycnal displacement, spice variance)
- EKE with box
- T-S plot
[x] ensure docstrings are comprehensive for all functions

Project:

[x] use Dhruv's function from glider paper to convert from depth to density space
[x] create plots similar to Figure 2:
- depth vs temperature
- density vs temperature
- "density-depth" vs spice anomaly
[x] make functions to quickly make a panel of the above plots
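A minimal sketch of the depth --> density space step, for reference. This is not Dhruv's function from the glider paper, just an assumption about the general idea: for each profile, interpolate a variable onto a fixed density grid using that profile's own density. The name to_density_grid and the idealized profile are made up for illustration.

```python
import numpy as np

def to_density_grid(sigma, values, rho_grid):
    """Interpolate one profile from depth space onto a density grid.

    sigma    : density anomaly at each depth level (kg/m^3)
    values   : the variable to remap (pressure, temperature, spice, ...)
    rho_grid : target density levels, increasing
    """
    sigma = np.asarray(sigma, dtype=float)
    values = np.asarray(values, dtype=float)
    good = np.isfinite(sigma) & np.isfinite(values)
    # np.interp needs increasing x, so sort by density. This hides small
    # density inversions instead of treating them properly.
    order = np.argsort(sigma[good])
    return np.interp(rho_grid, sigma[good][order], values[good][order],
                     left=np.nan, right=np.nan)

pres = np.arange(0.0, 2001.0, 2.0)     # 2 m grid, as in the box loader
sigma = 25.0 + 0.001 * pres            # idealized, stably stratified profile
temp = 20.0 - 0.005 * pres
rho_grid = np.array([24.0, 25.5, 26.0, 26.5])

pres_on_rho = to_density_grid(sigma, pres, rho_grid)   # depth of each isopycnal
temp_on_rho = to_density_grid(sigma, temp, rho_grid)   # temperature on each isopycnal
```

Density levels that a profile never reaches come back as NaN, which is where the "having and not having profiles" behavior at the extremes of the density grid comes from.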

andrewfagerheim commented 1 year ago

Roadblocks 18 May 2023

trying to make the spice anomaly plot, but running into issues, primarily with the line: Pmean_nonan = Pmean_smooth.where(~np.isnan(so_rho['SPICE']) & ~np.isnan(Pmean_smooth), drop=True) because Pmean_smooth only has a rho_grid coordinate but so_rho has rho_grid and N_PROF_NEW coordinates. It produces the error: Dimensions {'N_PROF_NEW'} do not exist. Expected one or more of ('rho_grid',)
my first thought was to run the Figure-2-sections_plot notebook to see what the coordinates of each xarray involved were. However, the glidertools module is not installed in gyre and when I tried to install it, I noticed the version of pip installed is waaaaay behind (version 9.0.1 when the current version is 23.1.2). When I tried to upgrade pip, I got an error saying permission was denied.
my next thought is to add N_PROF_NEW dimension to Pmean_smooth and hope that's enough for the .where() command to run. Nope this won't work because Pmean_smooth is defined as: Pmean_smooth = so_rho.PRES_INTERPOLATED.mean('N_PROF_NEW').rolling(rho_grid= 80, center=True).mean() so there's no N_PROF_NEW dimension to add back. It's an array with shape (1001,)
I wonder if it's possible to specify a dimension the .isnan() should operate on? If that makes sense. Or limit the .where() to only one dimension?
Additionally, I wanted to replicate the same analysis by looking at individual float numbers instead of regional boxes, to have the same x axis of "distance" as the glider paper. However, the function Dhruv used to do this relies on glidertools.utils.distance(), so the function didn't run. I could probably write my own that works the same, but it seems preferable to find a way to install glidertools on gyre instead.
One solution to the glidertools issue is just going into the github repo, finding the necessary function, and copying that directly into the notebook. This is what I'm trying for glidertools.utils.distance() in argo_density_space
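One possible way around the .where() dimension mismatch, sketched on toy data. The variable names follow the notes above, but the dataset here is invented and the rolling window is shortened to 3 so the example stays small. The idea is to collapse the 2D NaN mask down to rho_grid before combining it with the 1D Pmean_smooth, which is roughly what "limit the .where() to only one dimension" would mean. Whether "at least one profile has spice" is the right reduction (versus all profiles) is an assumption.

```python
import numpy as np
import xarray as xr

rho_grid = np.linspace(1025.0, 1027.0, 9)
n_prof = 4

# Toy stand-in for so_rho: the lightest and densest levels have no data.
spice = np.ones((rho_grid.size, n_prof))
spice[0, :] = np.nan
spice[-1, :] = np.nan
pres = np.tile(np.linspace(0.0, 800.0, rho_grid.size)[:, None], (1, n_prof))
so_rho = xr.Dataset(
    {"SPICE": (("rho_grid", "N_PROF_NEW"), spice),
     "PRES_INTERPOLATED": (("rho_grid", "N_PROF_NEW"), pres)},
    coords={"rho_grid": rho_grid},
)

Pmean_smooth = (so_rho.PRES_INTERPOLATED.mean("N_PROF_NEW")
                .rolling(rho_grid=3, center=True).mean())

# Collapse the 2D validity mask to rho_grid only: keep a density level
# if at least one profile has a spice value there.
has_spice = so_rho["SPICE"].notnull().any("N_PROF_NEW")
Pmean_nonan = Pmean_smooth.where(has_spice & Pmean_smooth.notnull(), drop=True)
```

Both sides of the condition now share the single rho_grid dimension, so drop=True has nothing to complain about.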

andrewfagerheim commented 1 year ago

Misc. Updates 19 May 2023

Glider paper plots:

Goal is to make the same plots as the glider paper and compare the results. Does the argo data provide similar results even though the resolution is worse and it's a box of profiles, not a continuous path?

[x] depth vs N_PROF, density plotted: I added this plot to help visualize where the density contours come from and what they look like.
[x] depth vs N_PROF, temp plotted: Plotted, keeps the density contours.
[x] density vs N_PROF, temp plotted: Also plotted, shows that density contours are now horizontal lines.
[x] isopyc. displ. vs N_PROF, spice anom plotted: Can't get this to work, modifications of glider functions have not been successful. (UPDATE 23 May: May need to broadcast a coordinate (PRES_INTERPOLATED?) so both xarrays have the same dimensions.)

Float to distance:

I think it would be interesting to pick a float that goes through the glider region and perform the above analysis with distance on the x axis (like the glider plots) instead of N_PROF.

[x] convert lat/lon to distance: Successfully adapted Dhruv's and glidertools functions to calculate distance.
[x] add distance to xarray: It's not obvious to me how distance is calculated, and I don't know why the distance dimension is so large, so I don't know how to usably add it to the xarray from a float (UPDATE 23 May: you should just rewrite the function. Not only will it probably be quicker, but you'll also understand what's happening better.)
[x] plot above for distance, not N_PROF: Perform same analysis, just using distance as the x axis instead of N_PROF
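For the "just rewrite the function" suggestion, a minimal sketch of an along-track distance calculation. This is not the glidertools implementation, only a standard haversine formula on a spherical Earth; the function name and the 6371 km radius are choices made here. It returns one cumulative distance per profile, so it can be attached directly as a coordinate alongside N_PROF.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation

def along_track_distance(lat, lon):
    """Cumulative great-circle distance (km) along a sequence of profiles."""
    lat = np.radians(np.asarray(lat, dtype=float))
    lon = np.radians(np.asarray(lon, dtype=float))
    dlat = np.diff(lat)
    dlon = np.diff(lon)
    # Haversine formula for each consecutive pair of profiles
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat[:-1]) * np.cos(lat[1:]) * np.sin(dlon / 2) ** 2)
    step = 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))
    # First profile sits at distance 0, so the output has one value per profile
    return np.concatenate([[0.0], np.cumsum(step)])

# Three profiles: one degree east along the equator, then one degree north
distance = along_track_distance([0.0, 0.0, 1.0], [0.0, 1.0, 1.0])
```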

Packages in Gyre:

Still not sure how to install glidertools, I seem unable to install this package or update pip because I don't have the right permissions. (UPDATE 23 May: look at Ocean Transport Guide, there should be a section on this)

andrewfagerheim commented 1 year ago

23 May 2023: Meeting w @dhruvbalwada

Notes:

From Balwada paper: can also make plot 2e (vertical derivative of spice), but it probably won't be super meaningful because you're working with individual profiles.
Loading data by float: because the resolution isn't as good as a glider, it probably won't show coherent patterns when profiles are taken every 10 days. The best chance would be to pick a float with a long time series (at least a year), which may at least show coherent seasonal variation.
For the glider, each profile has its own pressure grid, but for Argo I've already interpolated to a standard grid (may be the source of some function problems if there's an extra PRES_INTERPOLATED coordinate).
The smoothing in the function that computes mean isopycnal depth is there because, at the extremes of density, the mean will oscillate between having and not having profiles to work with.
For loading a new package (glidertools) look at the Ocean Transport Guide, there should be a section on how to do this
It takes a long time to run the depth --> density space function. Maybe think about saving a netcdf file for each region once you have a ds_rho for each box?

Project:

What we've been doing: picking one filter scale (100m) and looking at "eddy variance," roughly based on the Steinberg paper. You should reread it paying special attention to the storyline: What do they present? How do they present it? How are the methods built and explained?
Go back to Dhruv's notebooks from winter break (the T/S --> rho/s will be especially relevant). Are anomalies in spectra influenced by the base of the mixed layer, or is there something deeper going on? Approach should be a "sensitivity analysis," we want to isolate a certain signal and make sure it isn't sensitive to our imposed procedures/parameters, only existing ones.

Next steps:

[x] Finish items from 19 May update
- [x] mean isopycnal depth plot
- [x] function that plots Figure 2 panel
- [x] Figure 2 panel by float
[x] Save ds_rho for each box
[x] Reread Steinberg paper, keep above in mind
[x] Reread paper that plotted mean isopycnal displacement

andrewfagerheim commented 1 year ago

1 June 2023: Notes

Updates:

I made functions to return an xarray with a distance coordinate instead of N_PROF and all the plotting functions work well with this, as long as you enter dim1=distance
It seems to me like Dhruv's Spice_on_Pmean xarray/Figure 2c calculates spice, not the spice anomaly. Need more clarity on this. Regardless, I made what I think is a spice anomaly function that calculates the mean spice on each isopycnal, then subtracts that mean from each point to create what I'm calling a Spice Anomaly (along Pmean) plot.
Redid the argo_box_loader notebook with expanded functionality. It can now load argo data based on a box of lats/lons or a float_ID, and will save both ds_z and ds_rho (with SPICE already added). I'm in the process of going through box by box to redo the netcdf files with this approach
Seems like I have a decent handle on the 4 panels from the CCS poster and the 5 panels from the glider paper's Figure 2, but I'm struggling to see the broader narrative uniting everything. See below for thoughts on where to go next.
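A small sketch of the spice anomaly idea described above (remove the mean spice on each isopycnal). This is an illustration with made-up numbers, not the actual function from the notebooks; the array layout (density levels by profiles) is an assumption.

```python
import numpy as np

def anomaly_along_isopycnal(field):
    """Remove the mean over profiles at each density level.

    field has shape (n_rho, n_prof); NaNs mark missing data.
    """
    level_mean = np.nanmean(field, axis=1, keepdims=True)
    return field - level_mean

# Made-up spice values: 3 density levels by 3 profiles, one gap
spice = np.array([[-1.0, -1.2, -0.8],
                  [-2.0, np.nan, -2.4],
                  [-3.1, -2.9, -3.0]])
spice_anom = anomaly_along_isopycnal(spice)
```

By construction the anomaly averages to zero on every isopycnal, which is why the plot ends up centered around 0 even when the raw spice values are all negative.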

Next steps:

[x] Reread papers with the noted emphasis, and take notes in #4
- [x] Roullet paper: look at mean isopycnal displacement, compare to CCS plots (they use slightly different metric of standard deviation for most of the analysis)
- [x] Dove paper: look at how AOU variance compares with T/S variance (more broadly, how is variance contextualized?)
- [x] Steinberg paper: look at the narrative of EKE, how they present and analyze the metric (more broadly, what does EKE tell you about?)
[ ] Look through Dhruv's filtering/spectra notebooks, this is another piece that I'm not sure where it fits

andrewfagerheim commented 1 year ago

2 June 2023: Meeting w @dhruvbalwada

argo_density_space notebook:

Spice: For the glider paper, Dhruv calculated spice based on the glider's specific temperature gradient, which is why it's nicely centered around 0. In my notebooks, I compute spice using gsw's arbitrary temperature gradient, which is why the values are all negative. To make it easier to look at, I compute spice anomaly by subtracting the mean spice along isopycnal. (Maybe this distinction could be described as normalized spice vs spice anomaly?)
Loading by float profile: this seems to be more productive because there's a connection between adjacent profiles, instead of just whichever profiles happen to be present in one geographic box. In particular, float_ID=1901700 looks cool because it starts in a region with layered isopycnals and ends in a region where they are bunched near the surface. In between, there are hints of the kind of structure seen in the glider region, particularly when you look at the spice anomaly plot.

Steinberg paper:

This has a good framework for how to move from KE to EKE to examining specific features of seasonality and energy cascades. It seems reasonable to attempt applying this framework to variance of density and/or spice.
Figure 3: This plot, particularly panels a/b and e/f, seems helpful in quantifying what density surfaces look like in a given region. They calculate mixed layer depth using: FINISH THIS THOUGHT
Figure 5: This approach seems interesting. It communicates lots of information: variance at different locations, filter scales, and seasons. We would additionally need to add something about depth, more on this below.
Figure 6: This plot chooses one metric from Figure 5 (peak EKE month) and plots it globally at different scales, which helps track a "lag" associated with a cascade. The lack of a depth dimension is even more pressing here.

Many, many questions:

What kind of data should we load? Should we load boxes and average the profiles by season or does this obscure the signals we're looking for? Is it better to perform the analysis by float path so distance is preserved while still seeing changes across time, or will that be too variable for only one float?
How do we address depth? Maybe we have multiple plots for each location, one for different depths (selected by taking the mean in bins, or the cumulative variance in larger sections)? Or maybe we pick only one filter scale to look at, and compare depths for each plot instead of scales? Maybe we don't separate by month at all, and instead look at these metrics along float paths with contours, and note on the x axis approximately which seasons each profile is from?
What about the mixed layer? Does the steep slope just below the mixed layer impact the above analysis? To remove it, is it best to filter the whole profile but mask the boundaries? Or is it better to select the part of the profile already below the mixed layer and mask that boundary?
What about spice anomaly? When considering spice anomaly, does removing the mean spice from each isopycnal affect our study of small-scale variance? If so, does using a reference temperature from the data (instead of from gsw) minimize this?

Next steps:

[x] finalize new functions
- [x] write docstrings for every function that doesn't have one currently
- [x] generalize ds_anom so that it can be used for more than just spice/Pmean applications
[x] complete (re)readings from 1 June
[x] pick something from below and start working on it

Long term:

[x] address metric specifics
- [x] tackle mixed layer questions above (how do we treat the boundary when filtering?)
- [x] tackle the spice anomaly questions above (is this a valid metric?)
- [x] plot spectra of two spice calculation methods -- do they look substantially different?
[x] mixed layer depth
- [x] find mixed layer depth (are values based on lat/lon/season, or are they calculated based on T/S/rho/s profiles?)
- [x] make a plot like Steinberg Figure 3 a/b, e/f for density and spice
[x] filtering scales
- [x] find out what scales make the most sense for this analysis
- [x] tackle the depth questions above (how to display variance at different depths?)
- [x] make a plot like Steinberg Figure 5 for one region
[ ] data selection
- [x] tackle the loading questions above (what data should we work with?)
- [x] compare regions where we would expect tracer transport to depth (southern ocean) and where we wouldn't (subtropical gyre)
- [x] make a plot like Steinberg Figure 5 for multiple regions
- [ ] make a plot like Steinberg Figure 6 to compare all regions

andrewfagerheim commented 1 year ago

8 June 2023: Update

I've spent most of this week catching up on readings, both papers I had skimmed a few months ago (Steinberg, Dove, Roullet) and new ones from the past week (Serazin, Chung). I think this has been helpful to consider how studies break a big topic down into something more specific to analyze, what kinds of methods they use to do so, and how they present this whole story.

In particular, each study had (a) one broad problem it was trying to constrain, (b) at least one proxy it was using, and (c) at least one method to analyze that proxy.

For Steinberg, it was surface energy budget; PE conversion, EKE; scale-aware variance, scale-aware filtering
For Roullet, it was internal energy budget/eddy turbulence; EAPE, isopycnal displacement; r.m.s. variance
For Dove, it was ocean ventilation; oxygen profiles, EKE; MLD, ΔAOU, finite-size Lyapunov Exponents
For Serazin, ??; density profiles - Upper Ocean Pycnocline; buoyancy frequency & peaks, stratification index
For Chung, it was ocean heat content; isopycnal heaving; density depth warping (DDW), linearization, and coherence

I'm realizing that for our project, I find it difficult to say what the broad problem is, the proxy is some part of temperature/salinity/density/spice profiles, and the methods are some combination of filtering/spectra/etc. I think it would be helpful for my mental organization to more carefully define each of these, and the steps necessary to address each.

andrewfagerheim commented 1 year ago

9 June 2023: Meeting w @dhruvbalwada

Notes:

Misc: Try the boundary again, but use argo profiles (idealized may be too nice), also consider how the filtering function interacts with nans for the boundary profile. Using gsw for spice then computing anomaly is probably the best method.
Chung: Methods seem interesting, time to narrow down into specifics. Reread with the following questions in mind:
- Warping: How is it actually calculated? How is it different from a simple anomaly? It would be ideal to either get the code from this thesis or from another similar example
- Isolation: Is it possible to separate the influences of isopycnal heave purely in the vertical direction and along-isopycnal stirring? I seem to remember their only suggestion was calculating warping in 3-dimensions, but check back on this.
- Heat Content: What barriers do they identify to calculating heat uptake globally?
Also should read Bindoff & McDougall to compare. How do they define isopycnal heaving? What do they say about separating the influences of heaving from each other?

Next steps:

[x] Reread Chung with the above specifics in mind.
[x] Read Bindoff & McDougall.
[x] Write up a short comparison between the methods of the two. Ideally this should include a first take at trying to replicate both methods with either the glider or argo data. (see your reading notes in argoVault with Dhruv's comments added)
[x] Write draft of CCS grant application by Friday so you can review it with Dhruv

andrewfagerheim commented 1 year ago

13 June 2023: Updates

Notes:

I'm trying out Obsidian, it actually seems very helpful to manage readings and track connections/concepts between them. I'm pushing commits to this repo, so it should be possible to see all of my reading notes in .md files at least, or even to open the vault on another computer.
I read Bindoff & McDougall 1994 and took notes, particularly noting in the last section potential dialogue with Chung 2019.
I still need to reread Chung to take a closer look at methods, but I found (1) (2) blog posts which had helpful introductions to dynamic time (or in our case depth) warping. I need to read them in depth, but from the figures alone, it seems like our application might just require Euclidean distance and not warping if the reference and sample profiles are on the same pressure grid already?
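To make the warping vs pointwise distance question concrete, here is a toy comparison. This is the textbook dynamic time warping recursion, not the density depth warping method from Chung (which I haven't checked in detail), and it uses absolute differences for both measures so they are directly comparable. The two profiles are invented: same shape, shifted by one level, both on the same grid.

```python
import numpy as np

def dtw_cost(a, b):
    """Classic dynamic-time-warping cost with an absolute-difference metric."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Each cell extends the cheapest of the three allowed moves
            D[i, j] = step + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

reference = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0])
heaved = np.array([0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0])   # same shape, shifted one level

pointwise = np.abs(reference - heaved).sum()   # level-by-level mismatch: 4.0
warped = dtw_cost(reference, heaved)           # mismatch after warping: 0.0
```

Even with both profiles on the same grid, the pointwise distance counts a pure vertical shift as a mismatch while warping absorbs it. So sharing a pressure grid alone may not make the two measures equivalent; it depends on whether the shift itself is the signal we want to measure.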

andrewfagerheim commented 1 year ago

16 June 2023: Meeting w @dhruvbalwada

Notes:

Mixed layer: The Scripps file and our Argo data don't have an easy metric to match profiles (maybe time/date but that would be a pain), so drop this for now. If we need to work with MLD for boundary discussions, we can use a density threshold strategy, which should work for most places.
Spice methods: The reason things weren't matching up at first on a pressure grid was because the 2·1000·alpha·dCT assumption is based on being on an isopycnal. When we take beta into account, the two methods look very similar. All of this being said, stick with calculating spice using gsw.spiciness0 and considering the anomaly (along pressure or isopycnals) if necessary. This seems to be a bit more robust.
Spectra: Initially we were seeing the slope max out around -2, which is suggestive of a problem with windowing (a non-periodic profile can have a large jump where its ends are joined, so the spectrum ends up showing the slope of the window being applied instead of the signal). When we tried window='hamming' this changed the slope (closer to -3) and also aligned the spectra for each method closer to each other.
Filtering & scales: The Steinberg conventions are confusing, so Dhruv wrote out new ones (pictures attached). This presents a framework for quantifying variance contributed at different "bands" of filter scales. This kind of analysis is required to make Figure 5 of that paper.
Spice & density: Dhruv had a thought about trying to separate the impact of internal waves/isopycnal heave from tracer transport. Here's the general framework:
- Calculate variance of all scales smaller than a lower bound and all scales larger than an upper bound
- Calculate the ratio of smallest variance : largest variance (Is this always less than 1?)
- Compare this ratio for density and spice. Hypothesis is that for profiles with lots of "wiggles" in spice, the spice ratio will be greater than the density ratio, which would suggest there is tracer stirring separate from isopycnal heave
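A rough sketch of the ratio framework above, on idealized profiles. Assumptions to flag: the filter is a plain Gaussian with standard deviation equal to the filter scale (the Steinberg conventions and Dhruv's revised ones may define scale differently), the function names are made up, and the "density" and "spice" profiles are just sine waves.

```python
import numpy as np

def gaussian_smooth(x, sigma_pts):
    """Gaussian low-pass via direct convolution with reflected edges."""
    radius = int(4 * sigma_pts)
    k = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (k / sigma_pts) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(x, radius, mode="reflect")
    return np.convolve(padded, kernel, mode="valid")

def extremes_ratio(profile, dz, l_small, l_large):
    """Variance at scales below l_small divided by variance above l_large."""
    small = profile - gaussian_smooth(profile, l_small / dz)   # high-pass residual
    large = gaussian_smooth(profile, l_large / dz)             # low-pass part
    return np.var(small) / np.var(large)

dz = 2.0
z = np.arange(0.0, 2001.0, dz)
background = np.sin(2 * np.pi * z / 2000.0)      # large-scale structure
wiggles = 0.1 * np.sin(2 * np.pi * z / 40.0)     # 40 m "interleaving"

density_like = background                        # smooth profile
spice_like = background + wiggles                # same profile plus wiggles

r_density = extremes_ratio(density_like, dz, l_small=50.0, l_large=400.0)
r_spice = extremes_ratio(spice_like, dz, l_small=50.0, l_large=400.0)
```

In this toy case both ratios are below 1 and the spice-like ratio is larger than the density-like one, which is the direction the hypothesis predicts. Nothing here says the ratio must always be below 1 for real profiles.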

Next steps:

[x] find a few interesting density/spice profiles () and:
- [x] plot the spectra of density and spice (do we see more small-scale variance in spice?)
- [x] calculate variance ratios through above framework
[x] plot something similar to Steinberg Figure 5 (don't worry about rushing to this though, really spend some time testing this ratio method and selecting good profiles to test it with)
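For the spectra item, a small numpy-only illustration of the windowing problem from the notes. This does not reproduce whatever package was used with window='hamming'; it is a bare periodogram with np.hamming, and the "profile" is a pure ramp chosen because it is smooth but strongly non-periodic. The 20 m cutoff for "small scales" is arbitrary.

```python
import numpy as np

def power_spectrum(profile, dz, window=None):
    """One-sided periodogram of a profile on a uniform vertical grid."""
    x = profile - np.mean(profile)
    n = x.size
    w = np.ones(n) if window is None else window(n)
    spec = np.abs(np.fft.rfft(x * w)) ** 2
    spec /= np.sum(w ** 2)            # compensate for power removed by the taper
    k = np.fft.rfftfreq(n, d=dz)      # vertical wavenumber in cycles per metre
    return k[1:], spec[1:]            # drop the zero-wavenumber bin

dz = 2.0
z = np.arange(0.0, 1024.0, dz)
ramp = z / z[-1]                      # smooth but strongly non-periodic profile

k, p_box = power_spectrum(ramp, dz)                       # no window
_, p_ham = power_spectrum(ramp, dz, window=np.hamming)    # Hamming taper

high = k > 0.05                       # scales shorter than 20 m
leak_ratio = p_ham[high].sum() / p_box[high].sum()
```

The ramp has no real small-scale variance, so anything at high wavenumber is leakage from the end-to-end jump. The taper shrinks that jump, and leak_ratio comes out well below 1, consistent with the steeper slope we saw after switching windows.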

andrewfagerheim commented 1 year ago

21 June 2023: Meeting w @dhruvbalwada

These notes are very stream of consciousness (and written on the bus), retype later

Notes:

spectra:
- seems like there's more nuance here than I recognized at first, namely that there are small peaks around (and both slightly higher and slightly lower than) 100m
- spectra will look noisier with a single profile, but these little hints suggest we're probably looking in the right direction
ratio/'binning' method
- we haven't taken into account their EKE definition at all when looking at bins, which is probably affecting other metrics we look at (namely ratios and Figure 5 style plot)
- I'm still not exactly sure how to incorporate this into the system, so look at the notes Dhruv wrote down
Steinberg Figure 5:
- calculate based on revised 'binning' definitions that take into account EKE
- calculate based on EKE definition for a single scale

Next steps:

[x] reconsider 'binning' method that considers EKE
[x] look at ratio method again considering:
- [x] new 'binning' approach
- [x] EKE at individual scales approach
[x] look at Steinberg Figure 5 plot again considering:
- [x] new 'binning' approach
- [x] EKE at individual scales approach

andrewfagerheim commented 1 year ago

28 June 2023: Meeting w @dhruvbalwada

Notes:

Separating the extremes ratio by depth was very helpful and much more productive than looking at a depth-averaged quantity. Particularly the spice profiles seem reasonable to compare with each other, but it seems harder to make comparisons with density because 1) the orders of magnitude are very different and 2) R_s profiles seem much more scattered while R_d has more "order"
Display the one-scale ratio in the same way as the extremes ratio and compare. Do they show similar patterns across depth generally? Maybe try plotting with l=100, l=200, l=300
Plot the seasonal/annual breakdowns for a box. Are the seasonal patterns easier to distinguish when we average over the same area and multiple years?
Compare the MKE and EKE metrics more closely. What is causing the difference between these values, particularly the large jump in the third bin? Are the sums of these methods the same? Are there any negative values in play for MKE specifically?
For boundaries, the general idea is that we will calculate the filtered profile with the whole float, then select the area one filter length away from the boundaries. The two most obvious choices are:
- [0,2000] as the boundary, so the profile examined would be [(0 + l), (2000 - l)]
- [MLD,2000] as the boundary, so the profile examined would be [(MLD + l), (2000 - l)]
I've started to work on a notebook for mixed layer depth calculations, but it still needs some work. I'm in particular wondering how to treat MLD for a box where each profile has its own mixed layer depth. One way to address this is before any averages are performed, use .dropna(dim='PRES_INTERPOLATED') so the mask is based on the deepest MLD present in the region. I guess it just depends if this value is relatively uniform for every profile or not
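The two boundary choices above can be written as a simple mask applied after filtering the whole profile. A minimal sketch; the function name is hypothetical and the 60 m MLD is a made-up value.

```python
import numpy as np

def interior_mask(depth, filter_scale, top=0.0, bottom=2000.0):
    """True where a filtered value is at least one filter length from both boundaries."""
    return (depth >= top + filter_scale) & (depth <= bottom - filter_scale)

depth = np.arange(0.0, 2001.0, 2.0)
scale = 100.0

surface_bounded = interior_mask(depth, scale)            # keeps [100, 1900]
mld_bounded = interior_mask(depth, scale, top=60.0)      # keeps [160, 1900], MLD = 60 m is made up
```

For a box of profiles, top could be passed as a single value (for example the deepest MLD in the box) so that every profile gets the same mask.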

Next steps:

[x] clean up ratio portion of argo_mult_scale
[x] plot EKE/MKE info with a box instead of a float
[x] careful analysis of EKE vs MKE
[x] create function that adds mask for MLD to dataset of argo profiles
[x] examine boundaries based on above method

andrewfagerheim commented 1 year ago

29 June 2023: Random Updates

Notes:

I was able to calculate EKE and MKE based on individual terms and the u = ū + u' substitution, and confirm that the variance summed over depth is the same using the EKE and MKE methods by bin for an idealized profile. What I'm now worried about is that this doesn't appear to be true for the argo profiles in the Steinberg Figure 5 plot, so I need to check this.

I made a function to calculate the MLD for every profile and create a Steinberg Figure 3 plot. Now I need to expand that work by changing the 'bound' keyword in EV and filtering functions to set the upper boundary as the bottom of the mixed layer (instead of the surface).
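For reference, a sketch of a density threshold MLD calculation of the kind discussed on 16 June. This is not the function from the notebook. The 10 m reference depth and 0.03 kg/m^3 threshold are commonly used values that I'm assuming here, and the profile is idealized (no NaNs, which real profiles would need handled first).

```python
import numpy as np

def mld_density_threshold(depth, sigma, ref_depth=10.0, threshold=0.03):
    """Shallowest depth where density exceeds its reference-depth value by `threshold`."""
    depth = np.asarray(depth, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    sigma_ref = np.interp(ref_depth, depth, sigma)
    exceeds = (depth > ref_depth) & (sigma - sigma_ref >= threshold)
    if not exceeds.any():
        return np.nan          # criterion never met within this profile
    return depth[np.argmax(exceeds)]

depth = np.arange(0.0, 201.0, 2.0)
# Idealized profile: well mixed down to 50 m, linearly stratified below
sigma = 25.0 + 0.01 * np.clip(depth - 50.0, 0.0, None)
mld = mld_density_threshold(depth, sigma)   # 54.0 on this 2 m grid
```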

You should also think about how to treat MLD and masking in a box of data. Particularly when feeding a profile to filter.gaussian_filter1d, aren't nans going to be an issue??
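On the NaN question: a convolution-based filter like scipy.ndimage.gaussian_filter1d spreads each NaN to every output point within the kernel's reach, so a masked mixed layer would erase a chunk of the profile below it. One common workaround (an option, not necessarily what we should adopt) is normalized convolution: filter the zero-filled data and the validity mask separately, then divide. The helper name below is made up.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def nan_aware_gaussian(x, sigma_pts):
    """Gaussian filter that ignores NaNs by renormalizing the kernel weights."""
    valid = np.isfinite(x)
    filled = np.where(valid, x, 0.0)
    num = gaussian_filter1d(filled, sigma_pts)
    den = gaussian_filter1d(valid.astype(float), sigma_pts)
    with np.errstate(invalid="ignore", divide="ignore"):
        out = num / den
    out[~valid] = np.nan       # keep the original mask in the output
    return out

profile = np.sin(np.linspace(0.0, 6.0, 200))
profile[:10] = np.nan          # e.g. masked mixed layer
profile[100] = np.nan          # isolated bad sample

plain = gaussian_filter1d(profile, 5)      # NaNs spread into neighboring points
aware = nan_aware_gaussian(profile, 5)     # NaNs stay where they started
```

Note that near a masked region the renormalized filter is effectively one-sided, so values close to the mask are still less trustworthy and the one-filter-length boundary rule still seems worth keeping.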

After this, it seems like there's a relatively straightforward set of tools for analyzing a region:

plot temperature and spice on a density grid
plot density and spice profiles with MLD
compute ratios and plot over depth
create EKE plots broken down by season and depth

Next steps:

[x] resolve all items listed above
[x] check if EKE and MKE methods have the same variance when summed over scale
[x] use MLD as boundary for mask

andrewfagerheim commented 1 year ago

5 July 2023

Notes:

I think (?) pretty much all of the tweaks/updates to individual methods have been made at this point which is exciting. It seems like I'm at a point with most of those updates where I need to show Dhruv and get his insight/feedback.
Next up I think you should create a new notebook that "processes" a given box with all of the methods we've worked on from start to finish. I think this will help clarify the narrative of our methods and highlight any lingering questions, mistakes, etc that need to be addressed. More specifically, pick a box and:
- plot contours of temperature, salinity, density, and spice
- select a few interesting spice profiles and show filtering and masking techniques
- plot average MLD and profiles by season
- plot average ratio quantities by depth
- plot spectra for whole profile and masked profiles
- plot EKE/MKE binned by both scale and depth respectively (try with whole profile and masked profiles)

Next steps:

[x] pick box and perform analysis sketched above
- [x] make all plots described above
- [x] go back to Ratios, do you need to take masking into account?
- [x] make notebook pages that walk through definitions/background to the methods
- [x] draft brief write-up on remaining questions/observations from that process
[x] start thinking about what/how to plot these things globally?

andrewfagerheim commented 1 year ago

24 July 2023: Meeting w @dhruvbalwada

Notes:

Generally it seems like the methods are working (with a few tweaks below). One plot seems to show seasonal variation, EKE seems to be a valid metric, the trends we would expect to see in variance/MLD/etc seem to hold true.
The most significant limiting factor seems to be noise; since this is a small box (5x5) there are fewer than 50 profiles, which is not really enough to produce smooth/consistent outcomes, especially when additionally separating by month or depth. Instead, you should pick 10x10 boxes to capture more profiles.
These new boxes should be in regions where we expect to find distinct?/intact? spatial signals:
- look at the Dove paper to look at regions with high and low EKE, pick a box that largely fits within each "band"
- look at where the Mediterranean flows into the Atlantic, pick a spot just slightly west of this region. Looking for spice variance spike a few hundred meters deep where salty basin water subducts below fresher ocean water
Make sure all elements of analysis use the same upper mask boundary of the MLD. You can choose this MLD as:
- an average of all profiles within each month
- the average within winter (don't love this because winter is subjective based on hemisphere and less significant moving equatorward)
- maybe just deepest MLD of all profiles in a box?
- give each box its own unique mask, but make sure the deepest mask applies to all filter scales
Now the goal is to have a separate notebook for each region, basically a duplicate of the master we looked at today. Then select the few plots that look the most promising, put them all in a few slides to compare the results for each region

Next steps:

[x] create a new notebook for the ACC high EKE box, and:
- [x] duplicate all work from argo_box_analysis
- [x] update all plots with comments in argo_box_analysis
- [x] pick the most interesting plots, consolidate in slides
[x] repeat these steps for the ACC low EKE box and the North Atlantic gyre box

andrewfagerheim commented 1 year ago

26 July 2023: Meeting w @dhruvbalwada

Notes:

EKE in Box 1 has a very interesting signal where there's a minimum around 1000m and then variance increases again with depth. This signal appears in all scale bins within this box, but doesn't appear as strongly in other boxes. (It appears in the average EKE profiles of Box 2 but less prominently. And not at all in Box 3.)
- This could be because we're picking up a mode water subducted in the SO that moved deeper as it moved north
- To better dive into this (and compare to the Sallee analysis), add density contours to the EKE plots to see 1) does this anomaly follow a specific density value (in depth it moves up and down) and 2) does this density compare favorably to what the Sallee schematic would predict?
The ratios might look strange because you're calculating the ratio for each profile, then taking the mean. Instead, take the mean of each quantity over the profiles first, then calculate the ratio.
For the EKE seasonality plots, they look noisy because there's no averaging performed. Instead, plot the median (quantile=0.50) with shading that encompasses Q1-Q3 (quantiles 0.25 to 0.75) for each scale bin. No need to keep plotting MKE, but keep both the seasonal and annual plots for now.
Again, the methods seem to be picking up interesting trends (both expected and unexpected). Keep expanding the geographic boxes studied; this time include a few more ACC boxes based on the Sallee paper.
The issue loading boxes seems to be that some problem profiles don't have PSAL/PSAL_QC, so there's a problem concatenating the profiles. Will need to chat with Tom about a workaround for this (is it possible to set a flag? or change a keyword?) and implement by cloning the argopy repo.
Should include a few more boxes in the analysis for comparison. Look at the Sallee paper on mode water formation and subduction rates for ideas of locations to pick
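A quick numerical illustration of the ratio point from the notes above, with made-up variances. A single profile with almost no large-scale variance can dominate the mean of per-profile ratios, while the ratio of the means stays well behaved.

```python
import numpy as np

# Hypothetical per-profile variances at one depth (one entry per profile)
small_scale_var = np.array([0.02, 0.01, 0.03, 0.50])
large_scale_var = np.array([1.00, 1.00, 1.00, 0.01])   # last profile has almost no large-scale signal

mean_of_ratios = np.mean(small_scale_var / large_scale_var)            # about 12.5
ratio_of_means = np.mean(small_scale_var) / np.mean(large_scale_var)   # about 0.19
```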

Next steps:

[x] ask Tom about the .to_xarray() issue at LEAP tomorrow
[x] fix the ratio plots
[x] add density contours to EKE plots
[x] fix the variance seasonality plots (see notes above)
[x] look into sampling rate (is it really ~4m?)
[x] update slide deck for a more formal walkthrough
- [x] probably pick a few more boxes to include in this?

andrewfagerheim commented 1 year ago

27 July 2023: Quick chat with @dhruvbalwada

Notes:

Pick boxes immediately north and south of Box 1 to see if the sharp turns in variance change depths as you get closer to/farther from regions of subduction.
For the sampling rate issue, 2m --> 5m is fine, but 2m --> 100m is not. Instead of making sure the mean/median is above 4m, figure out some way to mask or label the sample rate changes so it can be considered after interpolation. In other words, after interpolation has occurred, it will be difficult to determine which profiles have which sampling rate, or where the sampling rate changes. So this somehow has to be accounted for before interpolation. But I'm not really sure how all the mechanics of this will work yet.
More specific thoughts on how to deal with sampling rate changes: https://github.com/ocean-transport/argo-intern/issues/12#issuecomment-1655683014

Next steps:

[x] figure out how to fix the lack of PSAL problem when loading boxes
[x] figure out how to address sampling rate changes

andrewfagerheim commented 1 year ago

28 July 2023: Meeting w @dhruvbalwada

Plots:

For EKE of spice plots:
- Sort profiles by density at 1000, which will hopefully arrange in a roughly N-S pattern
- Pick smaller boxes? particularly looking at 4 it looks very jumbled: maybe because it's going in or out of the ACC?
- Add labels for density contours (do they change depth significantly? do the minima change depth with density?)
For EKE with depth and scale plots:
- Add more panels near the surface, remove ones that are deeper down (if there doesn't seem to be a significant signal)
- Try a new plot style: make a depth dependent plot where the x axis is time, y axis is depth, and the color is EKE of density or spice. First try plotting this for every profile and every depth, but then try averaging by month and depth bin, and plotting the results as points. I wonder if it would be easier to pick out trends after averaging
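A minimal sketch of the "average by month and depth bin" idea above, assuming the EKE values have already been flattened into 1D arrays of month, depth, and value. The function name `bin_mean` and the sample numbers are hypothetical, not anything from the project notebooks.

```python
import numpy as np

def bin_mean(month, depth, value, depth_edges):
    """Mean of `value` in each (month, depth bin) cell; NaN where a cell is empty."""
    month = np.asarray(month, dtype=int)
    depth = np.asarray(depth, dtype=float)
    value = np.asarray(value, dtype=float)
    n_depth = len(depth_edges) - 1
    total = np.zeros((12, n_depth))
    count = np.zeros((12, n_depth))
    # np.digitize returns 1-based bin numbers, so shift to 0-based indices.
    d_idx = np.digitize(depth, depth_edges) - 1
    ok = (d_idx >= 0) & (d_idx < n_depth) & np.isfinite(value)
    # np.add.at accumulates correctly even when the same cell appears many times.
    np.add.at(total, (month[ok] - 1, d_idx[ok]), value[ok])
    np.add.at(count, (month[ok] - 1, d_idx[ok]), 1)
    with np.errstate(invalid="ignore", divide="ignore"):
        return total / count

# Made-up samples: three in January, one in July.
month = [1, 1, 1, 7]
depth = [10.0, 20.0, 150.0, 30.0]
eke = [1.0, 3.0, 5.0, 2.0]
eke_binned = bin_mean(month, depth, eke, depth_edges=[0.0, 100.0, 200.0])
```

The result has shape (12, number of depth bins), so it can go straight into a colormap plot with month on the x axis and depth on the y axis.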

Sample rate:

Seems like sampling rate is incredibly variable, and I've noticed rates of everything including 2m, 5m, 10m, and 100m. It seems like my current method is pretty much the worst possible option because I'm setting some kind of sampling rate barrier (median <4m) but still letting in profiles with vastly different rates.
One method is to remove any profile that doesn't have a good sampling rate, but this would lose lots of good data, particularly near surface data in other profiles.
So, instead you should calculate the distance between samples BEFORE interpolation and add this as a variable. Then once the dataset is interpolated, the sampling rate for different types of analysis can be selected after. In other words, if we want to pick the maximum sample rate as 5m, then maybe filter the whole profile and edit the mask so it removes any part of the profile that doesn't meet the 5m rate.
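One way this could look for a single profile, as a rough sketch rather than the actual get_ds_interp() code: record the spacing of the raw samples, map it onto the interpolation grid, and only mask later when a particular analysis needs a maximum spacing. The function name `interp_with_spacing`, the profile values, and the 5m threshold are all illustrative assumptions.

```python
import numpy as np

def interp_with_spacing(pres, values, grid):
    """Interpolate one profile onto `grid` and carry the raw sample spacing along."""
    pres = np.asarray(pres, dtype=float)
    values = np.asarray(values, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Spacing between consecutive raw samples (one value per gap).
    spacing = np.diff(pres)
    # Index of the raw gap each grid depth falls in. A grid point that sits
    # exactly on a raw sample is assigned to the shallower of its two gaps.
    gap = np.searchsorted(pres, grid, side="left") - 1
    gap = np.clip(gap, 0, len(spacing) - 1)
    inside = (grid >= pres[0]) & (grid <= pres[-1])
    values_i = np.where(inside, np.interp(grid, pres, values), np.nan)
    spacing_i = np.where(inside, spacing[gap], np.nan)
    return values_i, spacing_i

# Hypothetical profile: 2m sampling near the surface, then a 100m jump.
pres = np.array([0.0, 2.0, 4.0, 6.0, 106.0, 108.0])
temp = np.array([20.0, 19.8, 19.6, 19.4, 12.0, 11.9])
grid = np.arange(0.0, 110.0, 2.0)

temp_i, spacing_i = interp_with_spacing(pres, temp, grid)

# Later, per analysis: keep only grid points backed by samples <= 5m apart.
sample_max = 5.0
temp_masked = np.where(spacing_i <= sample_max, temp_i, np.nan)
```

Because the spacing is stored next to the interpolated values, the well-sampled near-surface part of a profile survives even when the deeper part is masked out.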

Problem profiles:

First try the simpler fixes and hope one of these might work:
- Try downloading the newest snapshot and see if you can successfully load any of the problem boxes (ie hope the data center recognized and fixed the issue)
- Try using the ftp gmaze said worked for him (but is this going to be very slow?)
Then dive into specifics of argopy/xarray
- clone and locally install argopy repo
- make small change to methods we think we use
- then go into open_mfdataset() or write own loop, check to see if each file has PSAL variable, if not remove it
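The "write own loop" option could be sketched roughly like this. It assumes each file has already been opened on its own (for example as an xarray.Dataset, where `name in ds` checks for a variable) and is not argopy's own API; the helper name `keep_complete` is hypothetical, and plain dicts stand in for datasets in the example.

```python
def keep_complete(datasets, required=("PSAL",)):
    """Split a list of opened files into those that have every required variable
    and the positions of those that do not."""
    kept, dropped = [], []
    for i, ds in enumerate(datasets):
        # Membership test works for dicts and for xarray.Dataset objects alike.
        if all(name in ds for name in required):
            kept.append(ds)
        else:
            dropped.append(i)
    return kept, dropped

# Stand-in "files": the second one is missing PSAL.
files = [
    {"TEMP": [1.0], "PSAL": [34.5]},
    {"TEMP": [2.0]},
    {"TEMP": [3.0], "PSAL": [34.7]},
]
kept, dropped = keep_complete(files)
```

Keeping the dropped indices makes it easy to log which floats were skipped, which would help if the data center needs to be told about them.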

Next steps:

[x] plotting updates
- [x] fix EKE of spice plots
- [x] fix EKE with depth/scale plots
- [x] try new EKE with depth/scale plot
[x] sample rate updates
- [x] calculate sample rate before interpolation and add as a variable
- [ ] ~~update masking function to remove any region that doesn't meet the sampling rate requirement~~
[x] problem profile fix
- [x] find latest snapshot, download, and test
- [x] try loading box from data centers
- [ ] ~~make changes to argopy/xarray methods~~

andrewfagerheim commented 1 year ago

3 August 2023: Updates

Notes:

I've been working on a few things that I should jot down notes about here:
- Plotted a "section view" using Argo data to better highlight what we're seeing with the EKE minima in the Southern Ocean and south Atlantic. It might be interesting to make more of these around the Southern Ocean to see if there are similar patterns nearby, or if this is somewhat unique.
- Created new plots to look at EKE based on depth by binning and plotting on a colormap instead of plotting lines for individual depths.
- I'm working on two kinds of updates to the get_box() function: (1) add variables at this stage so you don't have to in each notebook (including month, year, MLD, etc) and (2) change the get_ds_interp() function so that it calculates the sample rate between each level in each profile and returns this in the interpolated dataset instead of picking profiles based on sample rate.
- I have a running slide deck which so far I've been treating as a way to compare results from 3 boxes. I think instead of thinking about it as a way to show results, for now I should treat it as a way to document methods. Keep updating the deck, but don't upload multiple versions of the same plot. Document the plots themselves and what they display, not what that tells us about the region they were taken from.
There are a few things I've been thinking about but haven't started yet:
- When I first load the boxes, that seems like a good time to make the dataset attributes prettier. Is there a way to have a separate ("official") name that comes up when a variable is plotted? A way to store and display units? Look into this because it will make the plots look much more professional.
- Creating a function to calculate EKE at different scales, storing each scale as its own variable in a dataset, then saving this data as a netcdf file. This would reduce the number of times I'd need to calculate EKE in each notebook, which takes up both physical space and time to compute.
- Plotting EKE metrics in density space instead of depth space. I think this wouldn't be super difficult, you just need to check over how you create the density dataset in argo_box_loader to see if you've changed anything relevant recently.
- Still this looming question about problem floats and loading data. Hopefully the new sync Dhruv is downloading will resolve this, but I'm not very optimistic.
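On the display-name and units question above: xarray's built-in plotting reads the `long_name` and `units` attributes of a variable when it labels axes and colorbars, so setting them once at load time may be enough. A small sketch, where the dimension name `N_PROF`, the label text, and the units string are placeholders to check against the real dataset:

```python
import numpy as np
import xarray as xr

# Tiny stand-in dataset; the real one would come out of the box loader.
ds = xr.Dataset(
    {"SPICE": (("N_PROF", "PRES_INTERPOLATED"), np.zeros((2, 3)))},
    coords={"PRES_INTERPOLATED": [0.0, 2.0, 4.0]},
)

# Attributes travel with the variable and are used by xarray's .plot() labels.
ds["SPICE"].attrs["long_name"] = "Spice"
ds["SPICE"].attrs["units"] = "kg m-3"  # placeholder units, confirm before use
```

These attributes are also written out with the data when the dataset is saved to a netcdf file, so they would only need to be set once.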

Next steps:

[x] finish all updates to get_box() including interpolation fix
[ ] ~~look into storing display names and units~~
[x] create function to make EKE dataset
[ ] make density space EKE plots
[x] resolve problem float issue

andrewfagerheim commented 1 year ago

24 August 2023: Updates

Notes:

It looks like before I left, I added the sampling rate fix to the interpolating function which is great
Today I made/finished the EKE function so now you can calculate EKEs once, store them as a netcdf, and then open this in each notebook you want to reference them in (instead of having to run time consuming cells all the time). OH also a note that before I do this, I only select the profiles that have a sampling rate of better than sample_max, which I've currently chosen as 6m.
I'm currently working on loading large sections around the ACC and running north as far as possible before they reach land. It's taking a long time to load the data, so I'm hoping it works. In addition to the EKE plots in depth space, I still need to make these plots in density space (which hopefully shouldn't be difficult, look at the argo_density_space notebook).
This is a small tweak, but Dhruv said to make the TS colorbar discrete instead of continuous
There are a number of admin/writing tasks to catch up on, with deadlines approaching. Please finish these before classes start!!

Next steps:

[x] load sections and plot EKE in depth space
[ ] additionally plot EKE in density space
[x] make T,S,SIG0 plots to accompany (discrete colorbar)
[x] read Cole 2012

Admin tasks:

[x] write CCS grant application draft by MONDAY https://docs.google.com/document/d/1xC0ItjekpcT2ldaAUxOtQrE_tX-ORJ3V_25U-CVzmTY/edit
[x] write Ocean Sciences abstract draft by MONDAY https://docs.google.com/document/d/1Gcfwm36LnLfZNfEUdCDAxlzYGV6LJkv0p-aZ25i7vNs/edit

andrewfagerheim commented 1 year ago

25 August 2023: Meeting w @dhruvbalwada

Notes:

The very low EKE bands (very smooth spice) are seen across many of the sections and often line up with the base of subducting fresh water. Generally, they follow sloping isopycnals down to ~1000m, but in certain sections they jump back up again across isopycnals. (Note that often this is the equatorward extent of the freshwater tongue, where it is able to penetrate farther at higher isopycnals compared to denser water.) Additionally, there are occasional "blobs" (very scientific, I know) of low EKE toward the bottom of a profile, but it's harder to comment on these because their structure is limited by the 1800m masked depth of our profiles.
At this point then we have pinpointed an interesting feature that isn't exactly described elsewhere in the literature, so this is a good time to start drafting a paper. There are a few more components of analysis () but it will probably be easiest to create the paper's story first and create remaining plots as needed.

Next steps (timescale this week!!):

[x] plot neutral density instead of potential density
[x] read Cole 2012 and Klocker 2023
[x] check section near Cole 2012 glider
[x] calculate salinity gradient and plot
[x] get JGR Overleaf template and start to sketch outline

Admin tasks (timescale this week!!):

write CCS grant application draft by MONDAY https://docs.google.com/document/d/1xC0ItjekpcT2ldaAUxOtQrE_tX-ORJ3V_25U-CVzmTY/edit
write Ocean Sciences abstract draft by MONDAY https://docs.google.com/document/d/1Gcfwm36LnLfZNfEUdCDAxlzYGV6LJkv0p-aZ25i7vNs/edit

Longer-term steps (timescale this semester):

create global map with depth of EKE minimum (or minima?)
create EKE sections in density space instead of depth
have paper submitted by the end of the year

andrewfagerheim commented 1 year ago

30 August 2023: Updates

Notes:

Dhruv left some good notes of things to check on the Argo slides:
- Calculate spice and salinity gradient on a section that's been filtered at 100m (ie do the fields look very similar?)
- Calculate EKE using spice and salinity as the tracer
- Load a box around the NATRE region (looking at Ferrari 2005) and add slides, the paper shows an EKE reduction around 500-1000m
- Create a scatter plot (or 2D histogram) of the derivative vs EKE
More notes from Argo slides:
- Make a plot like Fig 3 in Ferrari 2005 (T-S plot with density contours) for the sampled profiles and filtered profiles
- Make an idealized example of a salinity field where the first derivative goes to zero at a precise depth. Then compute the EKE to see if it is minimized at the same depth or if there is some offset.
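A possible starting point for the idealized example, under loud assumptions: "EKE" is stood in for by the variance of salinity across profiles at a fixed depth, and the only variability is random vertical heaving of a fixed background profile. The quadratic profile, the 10m displacement scale, and all names here are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
z = np.arange(0.0, 1000.0, 2.0)  # 2m depth grid
z0 = 500.0                       # depth where dS/dz = 0 by construction
n_prof = 500

def salinity(depth):
    # Quadratic background profile: the first derivative vanishes at z0.
    return 34.5 + 1e-6 * (depth - z0) ** 2

# Each profile is the background displaced vertically by a random amount (m).
eta = rng.normal(0.0, 10.0, size=n_prof)
profiles = salinity(z[None, :] - eta[:, None])

# Variance across profiles at each depth (stand-in for EKE in this sketch).
variance = profiles.var(axis=0)
z_min = z[np.argmin(variance)]
```

In this setup the anomaly is roughly the displacement times dS/dz, so the variance should bottom out close to z0. Comparing `z_min` with `z0` for different profile shapes or displacement statistics is one way to look for an offset.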
I have a draft of the OS abstract and most of the CCS application filled out, Dhruv is looking over them now

Next steps:

[x] read Klocker 2023 and Ferrari 2005
[x] submit CCS application and OS abstract by the end of this week
[x] work through notes on Argo slides

andrewfagerheim commented 1 year ago

31 August 2023: Meeting w @dhruvbalwada

Notes on Plotting:

For grad vs EKE plots, make one with only the EKE scale log and another with both scales log
For T-S plots, make both 2d histograms (lots of bins and a logarithmic colormap) and scatterplots (where color is determined by latitude)
Additionally figure out how to pick a region where variance is high and plot a few sampled profiles and filtered profiles to see what kinds of signal we're capturing
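For the log-scaled versions of these plots, one option is to build the 2D histogram on logarithmically spaced bin edges. This is a sketch with a hypothetical helper `log_hist2d`; it drops non-positive and non-finite points, so gradient magnitudes would need to be passed as absolute values.

```python
import numpy as np

def log_hist2d(x, y, n_bins=50):
    """2D histogram of positive values on log-spaced bins."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Log bins only make sense for finite, strictly positive values.
    ok = np.isfinite(x) & np.isfinite(y) & (x > 0) & (y > 0)
    x, y = x[ok], y[ok]
    x_edges = np.logspace(np.log10(x.min()), np.log10(x.max()), n_bins + 1)
    y_edges = np.logspace(np.log10(y.min()), np.log10(y.max()), n_bins + 1)
    # Pin the outer edges so rounding in logspace cannot exclude the extremes.
    x_edges[0], x_edges[-1] = x.min(), x.max()
    y_edges[0], y_edges[-1] = y.min(), y.max()
    counts, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges])
    return counts, x_edges, y_edges

# Made-up gradient magnitudes and EKE values, including two unusable points.
grad = np.array([1e-4, 1e-3, 1e-2, -1.0, np.nan])
eke = np.array([1e-6, 1e-5, 1e-4, 1.0, 1.0])
counts, x_edges, y_edges = log_hist2d(grad, eke, n_bins=10)
```

When plotting, the counts need transposing (`counts.T`) for matplotlib's pcolormesh, and `matplotlib.colors.LogNorm` gives the logarithmic colormap.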

Notes on Project:

Reread Smith & Ferrari 2009 particularly their normalization method in Appendix A and the application in Appendix B. According to their equations, the horizontal gradient of C' shouldn't be directly related to the vertical derivative of C, which is the opposite of what we see for observed tracers (temperature, salinity, and spice). Diving into their normalization methods will be important.
Their equations are in terms of T(x,y,z,t), however, and it might be advantageous to work in terms of isopycnal surfaces instead, T(x,y,ρ,t). You should go back through and practice loading sections in density space to make sure all the analysis so far is possible when substituting 'SIG0' for 'PRES_INTERPOLATED'. Check on the function(s) you adapted from Dhruv's glider paper, df.interpolate2density_prof
Also note how in the grad-EKE plots, there is a really strong (even linear?) signal when the gradient is weak, meaning most of what we capture is that signal in EKE. What will be interesting is to see what may be hidden "underneath" this large signal if we use the Smith & Ferrari normalization technique.
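As a refresher on the depth-to-density step mentioned above, here is a bare-bones single-profile version. It is not df.interpolate2density_prof or Dhruv's function, just a sketch that assumes density can be sorted into increasing order (which glosses over inversions); the name `to_density_grid` and the profile values are hypothetical.

```python
import numpy as np

def to_density_grid(sig0, values, rho_grid):
    """Interpolate one profile's `values` from its own densities onto `rho_grid`."""
    sig0 = np.asarray(sig0, dtype=float)
    values = np.asarray(values, dtype=float)
    ok = np.isfinite(sig0) & np.isfinite(values)
    sig0, values = sig0[ok], values[ok]
    # np.interp needs increasing x, so sort by density (crude fix for inversions).
    order = np.argsort(sig0)
    # Densities outside the profile's range become NaN instead of being extrapolated.
    return np.interp(rho_grid, sig0[order], values[order], left=np.nan, right=np.nan)

# Made-up profile on a coarse pressure grid.
sig0 = np.array([26.0, 26.5, 27.0, 27.5])
pres = np.array([0.0, 100.0, 200.0, 300.0])
spice = np.array([1.0, 0.5, 0.0, -0.5])
rho_grid = np.array([25.5, 26.25, 27.25, 28.0])

pres_on_rho = to_density_grid(sig0, pres, rho_grid)    # depth of each isopycnal
spice_on_rho = to_density_grid(sig0, spice, rho_grid)  # spice along isopycnals
```

Interpolating pressure the same way gives the depth of each isopycnal, which is what the "density-depth" axis in the earlier plots would need.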

Next steps (before going to Lamont next Friday):

[x] submit OS abstract and CCS grant application
[x] apply for LEAP tier 2 membership
[x] finish current updates to Argo slides (grad-EKE, T-S plots, individual profiles?)
[ ] read Ferrari 2005 and reread Smith 2009
[ ] dive back into functions about converting to density space

andrewfagerheim commented 1 year ago

I'm closing this issue and opening a new one for the 2023-2024 academic year here: https://github.com/ocean-transport/argo-intern/issues/13