GaelVaroquaux closed this 10 years ago
What exactly did he say? There are two ways to go: doing patch-based feature extraction on CIFAR using either k-means or sparse coding, or using computer vision features from skimage and doing Pascal VOC. Well, a third option is to use precomputed features for ImageNet and then run SGDClassifier, but that doesn't really show anything beyond adding a dataset.
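For reference, the k-means variant of the patch-based option could look roughly like the sketch below. It is only an illustration under stated assumptions: random 32x32 grayscale arrays stand in for CIFAR images, and the patch size, number of clusters, and the helper names `image_patches` and `encode` are hypothetical choices, not something taken from the review or the paper.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.image import extract_patches_2d

rng = np.random.RandomState(0)
# Assumption: random 32x32 grayscale arrays stand in for real CIFAR images.
images = rng.rand(20, 32, 32)


def image_patches(image, max_patches=50, seed=0):
    """Sample small patches from one image, flatten and center them."""
    patches = extract_patches_2d(
        image, (6, 6), max_patches=max_patches, random_state=seed
    )
    patches = patches.reshape(len(patches), -1)
    # Remove the mean intensity of each patch.
    return patches - patches.mean(axis=1, keepdims=True)


# Learn a small dictionary of "visual words" from patches of all images.
all_patches = np.vstack([image_patches(img) for img in images])
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(all_patches)


def encode(image):
    """Describe an image by the normalized histogram of its patch clusters."""
    words = kmeans.predict(image_patches(image))
    hist = np.bincount(words, minlength=kmeans.n_clusters)
    return hist / hist.sum()


# One fixed-length feature vector per image, ready for any classifier.
features = np.array([encode(img) for img in images])
print(features.shape)
```

The resulting `features` array can be passed to any scikit-learn classifier; swapping k-means for a sparse coder would only change the dictionary-learning and encoding steps.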
Could you please copy the content of the review?
It's checked in the repo.
I might just be blind but I don't see it.
Indeed. It seems that pushing over airport Wifi isn't very reliable :). I think that it should be good now. Sorry for the noise
Shall we make a formal issue out of each point raised?
I do not understand the comment. I think there is a misunderstanding about what sparse means. The article says it represents bow as a sparse matrix, which is the only way you can do it. The review says sparse coding the bow makes no sense. There is no sparse coding involved here, right?
The original suggestion (if I read things correctly) is "use alternate feature extractors (e.g. maybe add a sparse coder in front of the TFIDF)" and then the reviewer withdraws that after Gaël's response. Right?
Ah, I didn't read it correctly (i.e. not from bottom to top). But adding another modality doesn't really add to the "mix and match" feeling. I'd rather substitute a classifier with another.
Yes, I think that you guys are right: the reviewer was probably confused. He wanted something sparse, and that drew me toward vision. If it is not easy to add a vision example, maybe we should give up on that.
I have already addressed point 4. We decided not to do point 1. That leaves us with points 2 and 3 (code examples). I wanted to address them ASAP, but... things got in the way. If any of you wants to start on them, I'd be most obliged.
I was actually just trying to add NMF features to the movie review classifier, maybe those will do something interesting.
Hmm, the main thing it does is slow things down. TruncatedSVD is easy to fit in but requires an extra import and doesn't improve anything. Let's leave the NLP example as it is.
The fact that it's easy to check that SVD-projected BoW features don't improve the accuracy of the overall text classifier is interesting IMO, but as you wish.
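To make the "easy to fit in" point concrete, here is a minimal sketch of what slotting TruncatedSVD into a text pipeline might look like. The toy corpus, the choice of LinearSVC as the classifier, and `n_components=3` are assumptions made for illustration; the actual movie review example from the paper is not reproduced here, and nothing below says anything about accuracy.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Assumption: a tiny hand-written corpus stands in for the movie reviews.
docs = [
    "a great and moving film",
    "wonderful acting and a great story",
    "truly enjoyable and moving",
    "a wonderful enjoyable story",
    "a boring and terrible film",
    "terrible acting and a dull story",
    "truly boring and dull",
    "an awful terrible story",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Baseline: sparse TF-IDF bag of words fed straight into a linear classifier.
baseline = make_pipeline(TfidfVectorizer(), LinearSVC(C=5))

# Variant: the same pipeline with one extra step (and one extra import).
with_svd = make_pipeline(
    TfidfVectorizer(),
    TruncatedSVD(n_components=3, random_state=0),
    LinearSVC(C=5),
)

baseline.fit(docs, labels)
with_svd.fit(docs, labels)
print(with_svd.predict(["a great story", "a dull film"]))
```

Both pipelines expose the same `fit`/`predict` interface, so comparing them on held-out data only requires swapping one object for the other.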
Well, to do a proper check you'd also have to re-tune all the parameters, e.g. C=5 works much better than C=5000. Explaining that takes up space. Maybe we can offer some extra (commented-out) functionality in the downloadable script, to give the reader something to play with.
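The kind of re-tuning meant here could be sketched with a grid search over `C` once the SVD step is in the pipeline. This is a rough illustration only: the toy corpus, the step names, the LinearSVC classifier, and `cv=2` are assumptions, and on data this small the selected value carries no meaning.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Assumption: a tiny hand-written corpus stands in for the movie reviews.
docs = [
    "a great and moving film",
    "wonderful acting and a great story",
    "truly enjoyable and moving",
    "a wonderful enjoyable story",
    "a boring and terrible film",
    "terrible acting and a dull story",
    "truly boring and dull",
    "an awful terrible story",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

pipe = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("svd", TruncatedSVD(n_components=2, random_state=0)),
    ("clf", LinearSVC()),
])

# Re-tune the classifier's regularization after changing the features.
# The two candidate values are the ones mentioned in the thread.
grid = GridSearchCV(pipe, {"clf__C": [5, 5000]}, cv=2)
grid.fit(docs, labels)
print(grid.best_params_)
```

Other pipeline parameters, such as `svd__n_components`, can be added to the same grid using the `step__parameter` naming convention.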
Hi co-authors,
I got feedback from the editor. He asked if we could do a simple, but not too much, computer vision example. In particular he mentioned sparse coding.
Does anybody have an idea, or even better a backbone of code, to sketch such an example? I don't usually do vision, so I don't know which datasets such an example would work well on. I wouldn't frown upon combining scikit-learn with scikit-image to do some feature extraction here.