so-wise / weddell_gyre_clusters

Unsupervised classification of Weddell Gyre profiles
MIT License

Try using density levels instead of depth levels #18

Closed. DaniJonesOcean closed this issue 3 years ago

DaniJonesOcean commented 3 years ago

Think about how this might work and what the advantages/disadvantages might be.

DaniJonesOcean commented 3 years ago

One constraint: in order to use a Gaussian mixture model (GMM), there needs to be a value for T and S at every selected density level. So we couldn't apply this method near outcrops. Similarly, we would probably also want to exclude the mixed layer, which can feature seasonal outcrops. That's probably fine.

isazar commented 3 years ago

I think it'll be fine to exclude the ML. Looking forward to seeing what this method shows!

DaniJonesOcean commented 3 years ago

I've made some progress on this, in that I've coded up a notebook that is able to (1) calculate the densities (sig0) using TEOS-10 gsw and (2) linearly interpolate the T and S values onto a set of target density values.
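Step (2) could look roughly like the sketch below. This is not the notebook's actual code: the function name `interp_to_sigma` and the sample values are made up for illustration, and `sig0` is taken as an input array rather than computed (in practice it could come from something like `gsw.sigma0(SA, CT)`). Targets outside a profile's density range come back as NaN.

```python
import numpy as np

def interp_to_sigma(sig0, values, target_sig0):
    """Linearly interpolate one profile's values onto target density levels.

    Hypothetical helper. Targets outside the profile's own density
    range are returned as NaN rather than extrapolated.
    """
    sig0 = np.asarray(sig0, dtype=float)
    values = np.asarray(values, dtype=float)
    # Drop missing samples, then sort by density: np.interp needs increasing x.
    ok = np.isfinite(sig0) & np.isfinite(values)
    if ok.sum() < 2:
        return np.full(len(target_sig0), np.nan)
    order = np.argsort(sig0[ok])
    return np.interp(target_sig0, sig0[ok][order], values[ok][order],
                     left=np.nan, right=np.nan)

# Made-up profile: density increases with depth, temperature decreases.
sig0 = [27.0, 27.2, 27.4, 27.6]
ct = [1.0, 0.5, 0.0, -0.5]
targets = np.array([26.9, 27.1, 27.3, 27.5, 27.7])

ct_on_sig = interp_to_sigma(sig0, ct, targets)
print(ct_on_sig)  # NaN at both ends, interpolated values in between
```

Sorting by density is a simplification: a profile with density inversions would be reordered rather than handled properly, so real data may need extra care there.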

Unfortunately, finding a good set of density levels that are suitable for interpolation is proving to be tricky. I can get an idea for the appropriate range by looking at the histogram, which seems okay. I then discard the sigma0 levels that feature nothing but NaN values, and I then discard the profiles which feature NaN values anywhere throughout the column. So far, this procedure has left me with a very small number of profiles! So, I will need to keep experimenting with the appropriate density target levels.
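The two-stage discard described here can be sketched as follows, assuming the interpolated field is stored as a 2-D array of shape (profiles, density levels). The function name and the toy array are hypothetical, not taken from the notebook.

```python
import numpy as np

def drop_nan_levels_and_profiles(field):
    """field has shape (n_profiles, n_levels), NaN where a level is missing."""
    field = np.asarray(field, dtype=float)
    # 1) Drop density levels that are NaN in every profile.
    keep_levels = ~np.all(np.isnan(field), axis=0)
    trimmed = field[:, keep_levels]
    # 2) Drop profiles that still have a NaN on any remaining level.
    keep_profiles = ~np.any(np.isnan(trimmed), axis=1)
    return trimmed[keep_profiles], keep_levels, keep_profiles

nan = np.nan
ct_on_sig = np.array([
    [nan, 0.8, 0.4, nan],   # misses the densest level
    [nan, 0.7, 0.3, 0.1],
    [nan, nan, 0.2, 0.0],   # misses a light level
    [nan, 0.9, 0.5, 0.2],
])

clean, keep_levels, keep_profiles = drop_nan_levels_and_profiles(ct_on_sig)
print(clean.shape)            # (2, 3): two of the four profiles survive
print(keep_profiles.mean())   # fraction of profiles kept
```

Printing the surviving fraction for several candidate level sets is one quick way to see how fast the profile count collapses as the target range widens.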

Below is an example of the potential temperature values interpolated onto sigma surfaces:

[Image: hist]

Alternatively, we might consider targeting a specific water mass and looking for structures within that water mass. I think that would be exciting.

DaniJonesOcean commented 3 years ago

Right now I'm trying $\sigma_0$. We could try neutral density instead.

maikejulie commented 3 years ago

This is really cool! I think targeting watermasses would be very interesting.

I like how you effectively turned it around, trying to use clustering to find the density levels.

I'm likely being slow, but why do we need to discard anything with a NaN?

DaniJonesOcean commented 3 years ago

I haven't done any clustering yet. I'm still experimenting with the right density ranges and numbers of bins to use. :)

As far as I understand, we have to discard NaN values before clustering. Unless there's a fancy new method that I'm not aware of. 🤔

isazar commented 3 years ago

I am also not aware of methods that can deal with NaNs. But if the NaNs are somewhere inside the profile, I interpolate across them, as long as I have some values above and below the NaNs. (gosh, does it make sense?! :-/ )
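One way to read this suggestion, as a sketch only: fill NaNs that have valid values on both sides, and leave the NaNs at the top and bottom of the profile alone. The helper name `fill_interior_gaps` and the sample numbers are made up, and `coord` is assumed to be increasing (depth or density levels).

```python
import numpy as np

def fill_interior_gaps(values, coord):
    """Linearly fill NaNs bracketed by valid values; keep edge NaNs as NaN."""
    values = np.asarray(values, dtype=float)
    coord = np.asarray(coord, dtype=float)
    good = np.isfinite(values)
    filled = values.copy()
    if good.sum() < 2:
        return filled  # nothing to interpolate between
    first, last = np.flatnonzero(good)[[0, -1]]
    inside = np.zeros(values.shape, dtype=bool)
    inside[first:last + 1] = True
    gaps = ~good & inside  # NaNs with valid data above and below
    filled[gaps] = np.interp(coord[gaps], coord[good], values[good])
    return filled

levels = [27.5, 27.6, 27.7, 27.8, 27.9]
theta = [np.nan, 1.0, np.nan, 0.0, np.nan]

filled = fill_interior_gaps(theta, levels)
print(filled)  # only the middle NaN is filled
```

This does not help with the outcropping problem, since those NaNs sit at the edge of the profile's density range and stay missing.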

maikejulie commented 3 years ago

What I do with land (NaN equivalent?) is to remove them in the original data, but 'remember' where they were. Do the clustering/exploration, and then put them back afterwards as the original 'gaps'.

Would this work here? I'm probably way off here, not understanding something/being slow!
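A sketch of the remove, cluster, restore idea, assuming one row per profile. `cluster_with_gaps` and `toy_cluster` are hypothetical names, and the toy rule is only a stand-in for a real clustering method such as a GMM.

```python
import numpy as np

def cluster_with_gaps(features, cluster_fn):
    """Cluster only the complete rows, then put the gaps back as label -1."""
    features = np.asarray(features, dtype=float)
    valid = ~np.any(np.isnan(features), axis=1)   # 'remember' where the gaps are
    labels = np.full(features.shape[0], -1, dtype=int)
    labels[valid] = cluster_fn(features[valid])
    return labels

def toy_cluster(X):
    # Stand-in for a real clustering step: split on the sign of the first feature.
    return (X[:, 0] > 0).astype(int)

X = np.array([
    [1.0, 34.5],
    [np.nan, 34.6],
    [-0.5, 34.7],
    [0.2, np.nan],
])

labels = cluster_with_gaps(X, toy_cluster)
print(labels)  # [ 1 -1  0 -1]
```

The bookkeeping keeps the output aligned with the original profile list, but a profile with a NaN on any level still ends up unclassified, so it does not recover the profiles lost to missing density levels.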

DaniJonesOcean commented 3 years ago

[Image: sa_on_sig0]

Here are the salinity profiles for the Weddell Sea, in the depth range 50-300 m. It would be impossible to cluster here, since there is basically no range in density shared across all the profiles. I'm just putting it here for reference, to visualise one of the limitations. I'll try a deeper depth range and will hopefully have more luck finding a shared density range.
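Whether a shared density range exists at all can be checked before any interpolation. A minimal sketch, assuming each profile is an array of sigma0 values; the function name and the numbers are illustrative only.

```python
import numpy as np

def shared_sigma_range(sig0_profiles):
    """Return (lo, hi) spanned by every profile, or None if there is no overlap."""
    lows = [np.nanmin(p) for p in sig0_profiles]
    highs = [np.nanmax(p) for p in sig0_profiles]
    lo, hi = max(lows), min(highs)   # lightest shared level, densest shared level
    return (lo, hi) if lo < hi else None

profiles = [
    np.array([27.0, 27.15, 27.3]),
    np.array([27.2, 27.4, 27.6]),
    np.array([27.25, 27.5, 27.8]),
]
shared = shared_sigma_range(profiles)
print(shared)  # a narrow shared range

# Adding one much denser profile removes the overlap entirely.
no_overlap = shared_sigma_range(profiles + [np.array([27.7, 27.9])])
print(no_overlap)  # None
```

A single outlying profile is enough to empty the shared range, which is consistent with how few profiles survive a strict all-levels requirement.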

DaniJonesOcean commented 3 years ago

Okay! The code is now able to project the profiles onto density surfaces and use the values of T&S on those density surfaces as the independent variables (dimensions) for the clustering analysis. This is neat, but there is one big challenge:

It is difficult (impossible?) to find a range of density surfaces on which we have values throughout the entire domain. This is perhaps unsurprising, as we know that isopycnals outcrop in the Southern Ocean (SO). As a consequence of this limitation, though, we can only classify certain density ranges. I'll post a few results below for $\sigma_0 = 27.0$ to $27.2$:

[Image: label_map]

If we instead target $\sigma_0 = 27.5$ to $27.75$:

[Image: label_map]

I'm having trouble getting anything much further south to show up. Between the depth range limitations and the density range limitations, it's challenging to find much. That being said, the above plots are still cool and interesting. I'll keep experimenting with different ranges.
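For reference, the feature construction described in this comment (T and S on density surfaces as the clustering dimensions) might be assembled along these lines. This is a sketch under assumptions, not the repository's code: the function name is hypothetical, the inputs are assumed to be NaN-free arrays of shape (profiles, levels), and the per-column standardization is an assumption rather than something stated in the thread.

```python
import numpy as np

def build_features(ct_on_sig, sa_on_sig):
    """Stack T and S on density levels into one (n_profiles, 2 * n_levels) matrix."""
    ct = np.asarray(ct_on_sig, dtype=float)
    sa = np.asarray(sa_on_sig, dtype=float)
    X = np.hstack([ct, sa])  # one column per (variable, density level)
    # Assumption: standardize each column so T and S are on comparable scales.
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0      # avoid dividing by zero on constant columns
    return (X - mean) / std

# Made-up values: 3 profiles on 2 density levels.
ct_on_sig = np.array([[0.8, 0.4], [0.6, 0.2], [1.0, 0.6]])
sa_on_sig = np.array([[34.60, 34.70], [34.62, 34.68], [34.58, 34.72]])

X = build_features(ct_on_sig, sa_on_sig)
print(X.shape)  # (3, 4)
```

The resulting matrix is what a mixture model would then be fit to, for example with scikit-learn's `GaussianMixture`, one label per profile.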

DaniJonesOcean commented 3 years ago

The Weddell Sea density plot suggests that we should have some luck between 100-1000 m and $\sigma_0 = 27.5$ to $28.0$. It's worth a try...

DaniJonesOcean commented 3 years ago

This is fun, and it kinda works, but it's very difficult to target specific water masses and regions.

We do have the capability built into the code now, so I can consider this issue closed. It might be a useful feature for some applications.