Closed BSchilperoort closed 2 years ago
Awesome! I'm just wondering if there's an easy way to extract clusters for a certain lag after applying RGDR, e.g. such that you could do clustered_data.sel(lag=1). I realize that not all lags have the same number of clusters, so it's not as easy as stacking them along the "lag" dimension, though. Unless we just fill them with NaNs... What do you think?
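For illustration, the NaN-filling idea could look roughly like the sketch below. It uses plain Python containers rather than the actual RGDR output, and the function name and input layout are hypothetical, chosen only to show how lags with fewer clusters end up padded.

```python
import math


def pad_clusters_by_lag(clusters_per_lag):
    """Stack per-lag cluster values into a rectangular table, padding with NaN.

    `clusters_per_lag` is a hypothetical layout: a dict mapping each lag to a
    dict of {cluster label: value}. It is not the real RGDR output format.
    """
    lags = sorted(clusters_per_lag)
    # Union of all cluster labels that occur for any lag.
    labels = sorted({label for c in clusters_per_lag.values() for label in c})
    # One row per lag; a missing (lag, label) combination becomes NaN.
    table = [
        [clusters_per_lag[lag].get(label, math.nan) for label in labels]
        for lag in lags
    ]
    return lags, labels, table


# Lag 1 has two clusters, lag 2 has only one.
example = {1: {-1: 0.3, 1: 0.7}, 2: {1: 0.5}}
lags, labels, table = pad_clusters_by_lag(example)
```

Here the row for lag 2 gets a NaN in the column for label -1. Note that the padded table puts label 1 of lag 1 and label 1 of lag 2 in the same column, even though nothing guarantees they cover the same region.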
As you said, not all lags have the same number of clusters. Additionally, clusters sharing a label do not necessarily represent the same physical region. I feel like making the cluster labels a dimension alongside lag would kind of imply that they do.
If we want to support this kind of selection we could create a utility function, but I think that the current way of flattening is required to be able to continue with fitting a model, or to be able to put RGDR in a pipeline.
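As a rough sketch of what such a utility could look like, the function below filters flattened feature labels by lag. The label format "lag:<n>_cluster:<k>" and the function name are assumptions made for this example, not something this PR confirms.

```python
def select_lag(feature_labels, lag):
    """Return the flattened feature labels that belong to the given lag.

    Assumes (hypothetically) that labels look like "lag:<n>_cluster:<k>".
    """
    # The trailing underscore keeps lag 1 from also matching lag 10.
    prefix = f"lag:{lag}_"
    return [label for label in feature_labels if label.startswith(prefix)]


features = [
    "lag:1_cluster:-1",
    "lag:1_cluster:1",
    "lag:2_cluster:1",
    "lag:10_cluster:1",
]
lag_one = select_lag(features, 1)
```

This keeps the flattened layout intact for model fitting and pipelines, and it does not suggest that equal labels across lags refer to the same region.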
> clusters sharing a label does not mean they represent the same physical regions. I feel like making the cluster labels a dimension along with lag would kind-of imply that
That's a convincing point.
> If we want to support this kind of selection we could create a utility function
I agree. Let's see if there's demand for that.
This PR adds support for providing multiple lags to RGDR.
Example:
Note: when plotting data, the user needs to provide the lag they want to see (unless there is only a single lag).
Additionally, I refactored the DBSCAN implementation into more manageable chunks.