Closed cstjean closed 8 years ago
Yes, you can use the FixedLatentFeaturesConstraint https://github.com/madeleineudell/LowRankModels.jl/blob/master/src/regularizers.jl#L157 as your regularizer on Y when fitting the second model:
ry = [FixedLatentFeaturesConstraint(Y[:, i]) for i=1:size(Y, 2)]
Sorry that's not yet documented!
On Wed, Apr 20, 2016 at 12:02 PM, Cédric St-Jean notifications@github.com wrote:
Once a model has been fit to a matrix A, is there any way to fit it to another matrix holding Y constant? For example, if factor analysis is part of a pipeline that ends with an SVM classifier, the cross-validation code should learn the feature matrix Y on the training set, and compute the data matrix X on the test set, given Y.
Madeleine Udell, Postdoctoral Fellow at the Center for the Mathematics of Information, California Institute of Technology, https://courses2.cit.cornell.edu/mru8, (415) 729-4115
That worked, thanks! For reference:
ry_B = [LowRankModels.FixedLatentFeaturesConstraint(glrm.Y[:, i]) for i=1:size(glrm.Y, 2)]
glrm_B = GLRM(B, losses, rx, ry_B, k);
X_B, Y_B, ch = fit!(glrm_B);
Y_B == glrm.Y # true
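To connect this back to the train/test pipeline from the original question, here is a minimal end-to-end sketch. It assumes quadratic losses and regularizers throughout; the variable names (A_train, B_test, etc.) are illustrative, not from the package:

```julia
using LowRankModels

# Toy data: train and test matrices with the same columns (features)
A_train = randn(100, 10)
B_test  = randn(40, 10)
k = 3

losses = QuadLoss()                 # quadratic loss applied to every column
rx, ry = QuadReg(0.1), QuadReg(0.1)

# 1. Learn both factors on the training set: A_train ≈ X' * Y
glrm = GLRM(A_train, losses, rx, ry, k)
X, Y, ch = fit!(glrm)

# 2. On the test set, hold Y fixed and solve only for the new X
ry_fixed = [FixedLatentFeaturesConstraint(glrm.Y[:, i]) for i = 1:size(glrm.Y, 2)]
glrm_test = GLRM(B_test, losses, rx, ry_fixed, k)
X_test, Y_test, ch = fit!(glrm_test)   # Y_test should equal glrm.Y

# X_test now plays the role of test-set features for a downstream classifier
```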
I'm looking for libraries to add to ScikitLearn.jl. Are you interested in supporting the scikit-learn interface? If so, I would make a PR like this: https://github.com/davidavdav/GaussianMixtures.jl/pull/18.
Yes, I'd be very happy to have LowRankModels included in ScikitLearn.jl. I'm not sure what the best interface would be; some people will want to be able to ask for, say, NMF or PCA or Robust PCA by name, whereas others may want to specify a more nuanced model.
If you want to go ahead and wrap it, I suggest starting your PR from the dataframe-ux branch, which will be merged into master in the next few weeks. There are a few (small) breaking changes to the interface.
ScikitLearn needs to store all hyperparameters in the type to support clone, and GLRM is missing fit!'s params.
- I can add a fit_params field to the type definition, with a default value that maintains the current behaviour. Then I'll define ScikitLearnBase.fit!(::GLRM, ::Matrix), transform(::GLRM, ::Matrix), etc. I'll also need to add some pure-kwargs constructors, like scikit does (http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html).
- Or I can create some brand-new types, SkGLRM, PCA, NNMF, etc., that each contain a GLRM object.
Option 2 is less intrusive, but it's more types to maintain and tell users about. Any preference? I like option 1 in general, but it's not a great match for your codebase.
I aesthetically prefer keeping the model separate from the algorithmic parameters, so if ScikitLearn needs to store all the hyperparameters in the type, I would prefer making a new type. The simplest option is probably to make a new type SkGLRM <: AbstractGLRM. I don't think it would require too much extra tooling to make all of the GLRM functionality accessible to SkGLRM in that case.
PCA and NNMF need not be extra types; but there could be specialized functions to instantiate SkGLRMs corresponding to those specialized models.
Are there other problems with this approach?
Madeleine
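To make the SkGLRM proposal concrete, here is one possible sketch. All names and fields here are hypothetical, not from either package; ScikitLearnBase.@declare_hyperparameters is the mechanism ScikitLearn.jl uses to support clone:

```julia
using LowRankModels
import ScikitLearnBase

# Hypothetical wrapper: hyperparameters live in the type,
# and the fitted GLRM is stored alongside them after fit!.
mutable struct SkGLRM <: AbstractGLRM
    k::Int
    loss
    rx
    ry
    fit_params      # solver options to forward to fit!
    glrm            # set once the model has been fit
    SkGLRM(; k=2, loss=QuadLoss(), rx=QuadReg(), ry=QuadReg(), fit_params=nothing) =
        new(k, loss, rx, ry, fit_params)
end

# clone() reads and writes exactly these declared hyperparameters
ScikitLearnBase.@declare_hyperparameters(SkGLRM, [:k, :loss, :rx, :ry, :fit_params])

function ScikitLearnBase.fit!(m::SkGLRM, A::Matrix)
    m.glrm = GLRM(A, m.loss, m.rx, m.ry, m.k)
    LowRankModels.fit!(m.glrm)
    return m
end

# Named models become convenience constructors, not extra types
pca(; k=2) = SkGLRM(k=k, loss=QuadLoss(), rx=ZeroReg(), ry=ZeroReg())
```

With this design, PCA and NNMF stay one type (SkGLRM) and differ only in the losses and regularizers their constructor functions pass in, which matches the suggestion above.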