Jacob-Stevens-Haas closed this issue 2 years ago
Hey Jake, I think the only reason this has sparse functionality is that it was a modification of sklearn.preprocessing.PolynomialFeatures, which has that functionality. I don’t believe there was any specific use case in mind.
On Sun, Jul 3, 2022 at 3:49 PM Jacob Stevens-Haas @.***> wrote:
Hey @kpchamp (https://github.com/kpchamp), we've been refactoring a lot of pysindy and built some infrastructure to maintain backwards compatibility around sparse inputs. We're looking to document why they're useful, and git blame shows that you probably added the sparse functionality? I can't think of a time series that is zero most of the time, but maybe there was a non-dynamical-system use case: calculating the LHS of the regression directly in order to leverage the SINDy optimizers without the time-differentiation step?
-Jake
The relevant docstring for PolynomialLibrary.transform:
    @x_sequence_or_item
    def transform(self, x_full):
        """Transform data to polynomial features.

        Parameters
        ----------
        x : array-like or CSR/CSC sparse matrix, shape (n_samples, n_features)
            The data to transform, row by row.
            Prefer CSR over CSC for sparse input (for speed), but CSC is
            required if the degree is 4 or higher.
            If the degree is less than 4 and the input format is CSC, it
            will be converted to CSR, have its polynomial features
            generated, then converted back to CSC.
            If the degree is 2 or 3, the method described in "Leveraging
            Sparsity to Speed Up Polynomial Feature Expansions of CSR
            Matrices Using K-Simplex Numbers" by Andrew Nystrom and John
            Hughes is used, which is much faster than the method used on
            CSC input. For this reason, a CSC input will be converted to
            CSR, and the output will be converted back to CSC prior to
            being returned, hence the preference of CSR.

        Returns
        -------
        xp : np.ndarray or CSR/CSC sparse matrix,
                shape (n_samples, n_output_features)
            The matrix of features, where n_output_features is the number
            of polynomial features generated from the combination of
            inputs.
        """
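For anyone wondering why the docstring cares about sparse formats at all, here is a minimal pure-Python sketch of the underlying observation: for degree 2, only products of nonzero entries in a row can be nonzero, so a sparse row needs far fewer multiplications. This is not the K-simplex indexing from the Nystrom and Hughes paper, and it is not pysindy's or sklearn's code. The function names and the dict-based row representation are made up for illustration.

```python
from itertools import combinations_with_replacement


def sparse_degree2_terms(row):
    """Degree-2 products for one sparse row (hypothetical helper).

    `row` maps feature index -> nonzero value. Only pairs of nonzero
    entries are visited, so a row with k nonzeros costs k*(k+1)/2 products.
    """
    nonzero = sorted(row.items())
    return {
        (i, j): xi * xj
        for (i, xi), (j, xj) in combinations_with_replacement(nonzero, 2)
    }


def dense_degree2_terms(values):
    """Same products for a dense row: visits all n*(n+1)/2 index pairs."""
    return {
        (i, j): values[i] * values[j]
        for i, j in combinations_with_replacement(range(len(values)), 2)
    }


# A row with 6 features, only 2 of them nonzero.
dense_row = [0.0, 2.0, 0.0, 0.0, 3.0, 0.0]
sparse_row = {i: v for i, v in enumerate(dense_row) if v != 0.0}

sparse_terms = sparse_degree2_terms(sparse_row)  # 3 products computed
dense_terms = dense_degree2_terms(dense_row)     # 21 products computed

# Both approaches agree on every nonzero degree-2 feature.
nonzero_dense_terms = {k: v for k, v in dense_terms.items() if v != 0.0}
print(sparse_terms == nonzero_dense_terms)
```

The saving only matters when most entries in a row are zero, which is exactly the open question in this thread for time-series data.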
@Jacob-Stevens-Haas Feel free to delete this part of the code, push the fix to main, and close this out when you have time. :)