DaniJonesOcean closed 3 years ago
Here is the 50-300m cut, using 10 classes. As expected, it gets further up onto the shelf!
There are two near-Antarctic classes:
It's almost like there's a "gyre" and a "near-coastal" class. Very cool. I'm tempted to try a K=5 model and further classify the near-Antarctic class, as we did before.
The above two correspond to K=4 (gyre) and K=10 (near-Antarctic). The T and S profiles are shown above.
Brilliant!! I like it a lot!!! An additional idea: go to 10 m (if there are enough profiles), and either use more clusters or classify only in the black region in the figure. I'm curious to see if there are any more regimes inside the gyre, in the cyan cluster.
Thanks. :) And good idea! Here's what we get with K=5 and a zoomed-in domain:
That's cool! :) Lots going on in the gyre, so distinct from the ACC waters. 1) Could the blue cluster represent more wintertime profiles? 2) It's interesting that the pink dots are found at the southern edge too. 3) The red cluster seems really distinct from the others.
Do you have the T/S diagram color-coded by cluster?
Hi, cool figures! I'm a bit behind.. Could you catch me up? How is the clustering happening?
I think we should settle on one statistical model for the clustering algorithm, meaning a single value of K.
@isazar - Not yet. I'll aim to do that soon.
Hi @maikejulie! I'm still just using PCA and GMM so far. We are exploring the effect of changing the depth range. By selecting the 50-300m range, we capture more profiles on and near continental shelves. This requires a new statistical model. Then @isazar made the suggestion that we restrict our attention to the region in the black box in her comment above, to see if we can identify structures in that domain alone. This also requires a new statistical model.
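For anyone catching up, here is a minimal sketch of what a PCA + GMM workflow with a depth cut and a restricted lat-lon box can look like. It is not the code behind the figures in this thread: the synthetic temperature profiles, the box limits, the number of principal components, and all variable names are assumptions made up for illustration. Only the 50-300 m range and K=5 come from the discussion above.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data (hypothetical): temperature profiles on a 10 m grid.
depth = np.arange(10, 510, 10)                 # metres, 50 levels
n_profiles = 600
lat = rng.uniform(-80, -50, n_profiles)
lon = rng.uniform(-70, 40, n_profiles)

# Two made-up regimes so the mixture model has some structure to find.
regime = rng.integers(0, 2, n_profiles)
temperature = (
    np.where(regime[:, None] == 0, -1.5, 1.0)
    + 0.002 * depth[None, :]
    + 0.1 * rng.standard_normal((n_profiles, depth.size))
)

# 1) Depth cut: keep only the 50-300 m levels.
depth_mask = (depth >= 50) & (depth <= 300)

# 2) Domain cut: keep only profiles inside a lat-lon box (limits are assumed).
in_box = (lat >= -78) & (lat <= -60) & (lon >= -60) & (lon <= 30)
X = temperature[in_box][:, depth_mask]

# 3) Standardize each depth level, then reduce dimensionality with PCA.
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=3, random_state=0)      # 3 components is an assumption
X_pca = pca.fit_transform(X_scaled)

# 4) Fit a Gaussian mixture model with K classes in PCA space.
K = 5
gmm = GaussianMixture(n_components=K, covariance_type="full", random_state=0)
labels = gmm.fit_predict(X_pca)                # hard class assignment per profile
posterior = gmm.predict_proba(X_pca)           # class probabilities per profile
```

Changing either mask changes the training set, which is why each new depth range or domain needs its own fitted PCA and GMM.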
While we are still exploring different domains in lat-lon-depth, there's still lots of room for varying K. It depends a bit on what we're looking for.
At present, there are at least three statistical models at play:
I haven't posted any full reports yet - I've just been putting up plots! I hope this comment helps a little.
One more thing: in general, I'm finding that increasing K leads to big increases in the uncertainty (i-metric). Although BIC and AIC might suggest that we use a larger number of classes in some of these models (e.g. 10), I'm getting better results, in terms of the i-metric, when I stick to a smaller number of classes (e.g. 5). In some cases, when I use the values of K suggested by BIC and AIC, I get several classes where most of the profiles have large uncertainties.
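To make the uncertainty comparison concrete, here is a small sketch of a per-profile uncertainty score. The definition used is an assumption, since the i-metric is not defined in this thread: one minus the gap between the two largest posterior class probabilities, so values near 0 mean a confident assignment and values near 1 mean the top two classes are nearly tied. The function name `i_metric` and the example probabilities are hypothetical.

```python
import numpy as np

def i_metric(posterior):
    """Assumed definition: 1 - (largest - second-largest posterior probability).

    posterior: array of shape (n_profiles, n_classes), rows summing to 1.
    Returns one score per profile, between 0 (confident) and 1 (ambiguous).
    """
    posterior = np.asarray(posterior, dtype=float)
    # Sort each row ascending and keep the two largest probabilities.
    top_two = np.sort(posterior, axis=1)[:, -2:]
    return 1.0 - (top_two[:, 1] - top_two[:, 0])

# Made-up posteriors: one confident profile and one ambiguous profile.
example = np.array([
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
])
scores = i_metric(example)
print(scores)
```

With scikit-learn, a matrix like `example` comes from `GaussianMixture.predict_proba`, and the `bic` and `aic` methods of the same fitted model give the information criteria, so all three can be tabulated for each candidate K.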
I'm closing this as well, because I want to refocus these efforts and maybe produce a mini-report instead of collections of figures scattered across various GitHub issues. :)
This should get us closer to the shelf.
Also try using the K=10 model for the whole dataset as a start, and then using more clusters to further split the near-Antarctic class.
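A rough sketch of that two-stage idea: fit a parent GMM, pick one class, then fit a second GMM only on the profiles in that class. Everything here is illustrative. The synthetic 2-D features stand in for PCA scores, the class counts are small for readability rather than the K=10 mentioned above, and choosing the largest parent class as the one to split is a placeholder for however the near-Antarctic class would actually be identified.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)

# Synthetic features (hypothetical stand-in for PCA scores): one isolated
# blob plus two neighbouring blobs that a coarse model may merge.
features = np.vstack([
    rng.normal([0.0, 0.0], 0.3, (200, 2)),
    rng.normal([4.0, 0.0], 0.3, (200, 2)),
    rng.normal([4.8, 0.8], 0.2, (200, 2)),
])

# Stage 1: coarse classification of the whole dataset.
parent = GaussianMixture(n_components=2, random_state=0).fit(features)
parent_labels = parent.predict(features)

# Choose the class to split further (placeholder rule: the largest class).
target = np.bincount(parent_labels, minlength=2).argmax()
subset = parent_labels == target

# Stage 2: a new mixture model trained only on profiles in the target class.
child = GaussianMixture(n_components=2, random_state=0).fit(features[subset])

# Sub-class labels; -1 marks profiles outside the target class.
sub_labels = np.full(features.shape[0], -1)
sub_labels[subset] = child.predict(features[subset])
```

The second model sees only the subset, so its classes are not comparable to the parent classes and need their own labels and uncertainty checks.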