DistrictDataLabs / yellowbrick

Visual analysis and diagnostic tools to facilitate machine learning model selection.
http://www.scikit-yb.org/
Apache License 2.0

Enhance PCA Decomposition #476

Open bbengfort opened 6 years ago

bbengfort commented 6 years ago

We should enhance the PCADecomposition visualizer to provide many of the features the Manifold visualizer provides, including things like:

See also #455 as another enhancement that might not be related to this enhancement.

rohit-ganapathy commented 5 years ago

Hey! I'm interested in tackling this.

bbengfort commented 5 years ago

@rohit-ganapathy - that would be great, feel free to open a PR when you're ready for us to take a look.

dnabanita7 commented 5 years ago

Can I start working on this, even if @rohit-ganapathy is assigned?

rebeccabilbro commented 5 years ago

Hello @Naba7 — as we explained last week in response to your questions on #738 and #677, we do not "assign" issues or reserve issues for contributors. Anyone is welcome to submit a PR for a feature or bugfix they work on.

However, given that you already have one PR open that still needs to be completed (#755), have started working on #615, and are new to working on Yellowbrick and still getting to know our API, we would really appreciate if you would focus on getting those first PRs across the finish line before starting anything new.

We appreciate your enthusiasm about contributing to Yellowbrick. One of the most important lessons to learn is that open source is a marathon, not a sprint, so we hope you can be patient and enjoy the journey — we promise Yellowbrick isn't going away!

dnabanita7 commented 5 years ago

It's so exciting and fun. I want to learn things quickly, so I have been asking to be assigned everywhere. Sorry, I have noted this now.

bbengfort commented 5 years ago

@naresh-bachwani has this issue been fixed by your work this summer?

naresh-bachwani commented 5 years ago

@bbengfort I think that explained variance charts are left! But that will be covered in decomposition, right?

bbengfort commented 5 years ago

@naresh-bachwani ExplainedVariance is separate to this issue. Would you mind ticking the checkboxes above based on your work?

BradKML commented 2 years ago

Can these functions be applied to FastICA in Scikit-Learn (or maybe any ICA)? Also observing https://github.com/DistrictDataLabs/yellowbrick/issues/615 and https://github.com/DistrictDataLabs/yellowbrick/issues/316

bbengfort commented 2 years ago

@BrandonKMLee very possibly, it wouldn't hurt to try. I think what you'd have to do is change the pca_transformer attribute on the PCA visualizer; establishing it as a pipeline similar to the code here: https://github.com/DistrictDataLabs/yellowbrick/blob/develop/yellowbrick/features/pca.py#L184-L189. This would have to be done after initialization before any call to fit or transform. I don't see any place it wouldn't work, unless FastICA or ICA doesn't have required attributes like n_components_.

You could also try passing an initialized FastICA or ICA transformer as the manifold attribute to the Manifold visualizer - this might not give you the same features as ICA, but should give you the projected visualization.
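A minimal sketch of the pipeline swap described above, using only scikit-learn so it runs on its own. The step name `"pca"`, the helper `make_ica_pipeline`, and the synthetic data are assumptions for illustration; the Yellowbrick lines are left as comments because they have not been tested against the visualizer and may need adjusting.

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def make_ica_pipeline(n_components=2, random_state=0):
    """Build a scale-then-FastICA pipeline (hypothetical helper).

    The FastICA step is named "pca" on the assumption that the visualizer
    looks the decomposition step up by that name.
    """
    return Pipeline(
        [
            ("scale", StandardScaler()),
            ("pca", FastICA(n_components=n_components, random_state=random_state)),
        ]
    )


# Synthetic non-Gaussian data, only to exercise the pipeline
rng = np.random.RandomState(0)
X = rng.laplace(size=(200, 5))

ica_pipeline = make_ica_pipeline()
X_2d = ica_pipeline.fit_transform(X)  # projected data, one row per sample

# Untested idea for the visualizer itself, following the comment above:
# from yellowbrick.features import PCA as PCAVisualizer
# viz = PCAVisualizer(scale=True, projection=2)
# viz.pca_transformer = ica_pipeline  # after init, before fit/transform
# viz.fit_transform(X)
```

If the visualizer reads attributes that FastICA does not define, the commented-out part would fail at fit time, which is the caveat raised above.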

BradKML commented 2 years ago

@bbengfort n_components_in_ for FastICA, but at the same time explained variance could be a problem, as each component is expected to have well-distributed significance instead of being ordered, and such a function currently does not exist for FastICA.
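The attribute gap can be checked directly. This is a small sketch, assuming scikit-learn's `PCA` and `FastICA` estimators and synthetic data; the list of attribute names is chosen for illustration and is not taken from the Yellowbrick source.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

# Synthetic non-Gaussian data, only so both estimators can be fitted
rng = np.random.RandomState(0)
X = rng.laplace(size=(200, 5))

pca = PCA(n_components=2, random_state=0).fit(X)
ica = FastICA(n_components=2, random_state=0).fit(X)

# Fitted attributes a PCA-style visualizer might rely on (illustrative list)
wanted = ["components_", "explained_variance_", "explained_variance_ratio_"]

pca_has = {name: hasattr(pca, name) for name in wanted}
ica_has = {name: hasattr(ica, name) for name in wanted}

print("PCA:    ", pca_has)
print("FastICA:", ica_has)
```

Both estimators expose `components_`, but only `PCA` exposes the explained variance attributes, so anything built on those would need a separate code path for ICA.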

bbengfort commented 2 years ago

@BrandonKMLee OK, that makes sense, so potentially FastICA may not work unless we create a specialized manifold for it.