Open karlnapf opened 8 years ago
@karlnapf : Seems like the perfect task for me. I will do this. I saw the kNN notebook in the docs. It only does a speed comparison between KNN_BRUTE and KNN_COVER_TREE. I am assuming you also want me to add a speed comparison between KNN_KDTREE and the rest (there are only 3 solvers that I see in KNN)? Is that what is required here?
Also, by the kNN notebook you mean the IPython notebook here, right?
Thanks.
All yes :) Generally do more comparisons, and make the notebook nicer overall. You can browse the web a bit for more inspiration on where KNN is useful and where a fast KNN is useful.
@karlnapf : I did investigate usage of KNN_KDTREE. After some Googling I found that it is faster than the others only on low-dimensional data. In fact, when I ran the tests on the usps dataset, the results were:
Standard KNN took 1.9s
Covertree KNN took 1.0s
KDTree KNN took 3.7s
How do I make changes to the notebook (the multiclass KNN one in the doc folder)? It has a very weird format and I do not know how to add the code to it.
Do you also want me to add a plot for KNN_KDTREE as well, like the rest?
Thanks.
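For anyone who wants to reproduce the kind of comparison reported above without a Shogun install, here is a minimal pure-Python sketch. It is not Shogun's API: `knn_brute`, `build_kdtree`, `knn_kdtree`, `time_solver` and `make_data` are hypothetical helpers written only for this illustration, the data is random rather than usps, and the absolute timings are not comparable with the numbers quoted in this thread. It only shows how a brute-force scan and a k-d tree search can be timed side by side at different dimensionalities.

```python
import heapq
import random
import time


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_brute(points, query, k):
    """k nearest neighbours by scanning every point; returns sorted (dist, index) pairs."""
    return heapq.nsmallest(k, ((sq_dist(p, query), i) for i, p in enumerate(points)))


def build_kdtree(points, indices=None, depth=0):
    """Build a k-d tree as nested tuples (index, axis, left, right), splitting at the median."""
    if indices is None:
        indices = list(range(len(points)))
    if not indices:
        return None
    axis = depth % len(points[0])
    indices.sort(key=lambda i: points[i][axis])
    mid = len(indices) // 2
    return (indices[mid], axis,
            build_kdtree(points, indices[:mid], depth + 1),
            build_kdtree(points, indices[mid + 1:], depth + 1))


def knn_kdtree(points, tree, query, k):
    """k nearest neighbours via the k-d tree; returns sorted (dist, index) pairs."""
    heap = []  # max-heap emulated with negated distances: (-dist, index)

    def visit(node):
        if node is None:
            return
        idx, axis, left, right = node
        d = sq_dist(points[idx], query)
        if len(heap) < k:
            heapq.heappush(heap, (-d, idx))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, idx))
        diff = query[axis] - points[idx][axis]
        near, far = (left, right) if diff < 0 else (right, left)
        visit(near)
        # The far side can only help if the splitting plane is closer than
        # the current k-th best distance (or we do not have k points yet).
        if len(heap) < k or diff * diff < -heap[0][0]:
            visit(far)

    visit(tree)
    return sorted((-nd, i) for nd, i in heap)


def time_solver(solver, queries, repeats=3):
    """Best wall-clock time in seconds over `repeats` passes of `solver` over all queries."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for q in queries:
            solver(q)
        best = min(best, time.perf_counter() - start)
    return best


def make_data(n, dim, seed=0):
    """n random points in the unit cube of dimension `dim` (stand-in data, not usps)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(dim)] for _ in range(n)]


if __name__ == "__main__":
    for dim in (2, 32):
        train = make_data(1000, dim)
        queries = make_data(20, dim, seed=1)
        tree = build_kdtree(train)
        t_brute = time_solver(lambda q: knn_brute(train, q, 3), queries)
        t_tree = time_solver(lambda q: knn_kdtree(train, tree, q, 3), queries)
        print(f"dim={dim:3d}  brute={t_brute:.4f}s  kdtree={t_tree:.4f}s")
```

The pruning test in `knn_kdtree` is the part that tends to lose its effect as the dimension grows: with many axes the splitting plane is rarely farther away than the current k-th neighbour, so most of the tree gets visited anyway. That is one plausible reading of why the KD-tree result above is the slowest of the three, though the sketch itself does not prove anything about Shogun's implementation.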
You can open the notebook directly in your browser. Google for ipython notebook and how to use it. Then you send a standard patch via a GitHub pull request. Please clear all output of the notebook before sending it. And also please provide an html link to a fully rendered version (say on gist).
@karlnapf : Here is the gist. It will open as an ipython notebook. I have added some text and the code for the time comparison for KDTREE. Let me know what else you want.
P.S. The changes I made are in the Accelerating KNN subsection. Also, you would want the PR to contain the ipython (.ipynb) file, right?
Can we move the discussion about your changes into a PR?
Sure, I will submit the ipynb file and we can edit it as we go. Assuming you want me to commit the ipynb file, right?
Yes
@karlnapf : It says conflicting file. I replaced the KNN.ipynb file in /doc/ipython-notebooks/multiclass. I assumed that was the right thing to do.
Google on how to submit pull requests on github
@karlnapf : I submitted a new PR (the old one was getting too cumbersome).
BTW you can always update old PRs, no need to open new ones all the time
@karlnapf @sudk1896 , I would like to keep working on this issue, since I have just been working on the KNN refactoring job. Is that ok?
@MikeLing: Go ahead. I'm not working on it anymore.
Hi @karlnapf , I created a gist [1] for this issue, please tell me if there is anything I need to address. BTW, I found the KD-Tree is slower than both cover tree and plain knn. Is it ok to put that into the "Accelerating KNN" section?
[1] https://gist.github.com/MikeLing/6c08b1d5eaebc385c2a63fa1314b13b1
Hi @MikeLing It is hard to see from a gist what you changed/added. Can you send a PR, it will show me the diff. For that, pls remove the output of the notebook. Then in addition post a gist like the one above, so I can see how it looks. Then we can discuss in the PR
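Since removing the notebook output before a PR comes up more than once in this thread, here is a small sketch of how it can be done from a script instead of the browser menu. It assumes the notebook is stored in the nbformat 4 JSON layout (a top-level "cells" list); older notebooks that nest cells under "worksheets" are not handled. The function names `clear_outputs` and `clear_outputs_in_file` are made up for this example.

```python
import json


def clear_outputs(notebook):
    """Strip outputs and execution counts from all code cells of a notebook dict.

    Assumes the nbformat 4 layout with a top-level "cells" list.
    The dict is modified in place and also returned for convenience.
    """
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_outputs_in_file(path):
    """Read an .ipynb file, clear its outputs, and write it back to the same path."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    clear_outputs(notebook)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(notebook, f, indent=1, ensure_ascii=False)
        f.write("\n")
```

Calling `clear_outputs_in_file` on the notebook path before committing should keep the diff limited to the cells that actually changed, which is what makes the PR reviewable.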
@karlnapf here is the PR https://github.com/shogun-toolbox/shogun/pull/3620, please give me some feedback, thank you! :)
Hi, I am interested in working on this issue as an entrance task for GSoC 19. Do I have a green light?
@glitch401 you can work on any issue of your preference, no need to ask for our permission. The best course of action is that you send in a PR as soon as possible. It doesn't need to be the final solution, and you can now even send in a draft PR.
Hi! @vigsterkr , sorry for the delay in communication. I was going through the previous comments on this issue.
Time and accuracy comparison for KNN_BRUTE, KNN_COVER_TREE and KNN_KDTREE has already been worked on and has been patched in PR #3620.
So what should I work on in this issue?
Put in speed comparisons with all the different solvers that Shogun has. Nice and friendly easy entrance task