Closed nperraud closed 5 years ago
@mdeff, could you check that I did everything correctly and accept the pull request? Thanks
Thanks for your contribution @nperraud.
Two remarks:
learning.py
.N=2000
. If they find it slow, they should enable FLANN, knowing the downsides.Thanks for making unit tests! :)
Thank you for this PR @nperraud , buit would it be possible to interface pyGSP with broader machine learning modules such as scikit-learn or theano? I mean, would it require a lot of changes in pyGSP to do that, what do you guys think?
What do you want to do? Most of the data in the pygsp is numpy arrays, which you can pass around to other libraries. For one, in my research I use TensorFlow/Theano/pyTorch to learn graph filters, which I then plug back in the pygsp for visualization and analysis.
@lrq3000 I just discovered you were an author of tqdm. Thanks for this, I use it all the time! :)
Interfacing with scikit-learn would probably be possible. One way for example is that the pygsp function could return a scikit-learn object.
I am more concerned about the algorithm itself since you cannot really implement:
object.fit(Xtrain)
object.predict(Xtest)
You have to do something like:
object.predict(Xtrain, Xtest)
Alternatively, the fit function could only store the training dataset and all computation would be done in the predict function.
Using a kernel (and hence a different algorithm), would solve this problem but would be way slower in general and would require significantly more coding. What do you think?
@mdeff, so what is the way to go here?
I was merging SensorLarge
and Sensor
, which really are the same graph model. Then we should be able to merge.
I think it is time to finish this pull request. To make it simple, I will not make the .fit
, .predict
interface. We can always do that later on.
I will try to fix the conflicts an tests soon.
All tests are probably passing but I still need to add the large sensor graph.
@mdeff, Everything should be ready to merge. I did not merge sensor and sensorlarge because they still have different functionalities. Instead I renamed sensor as NNSensor. If you tell me, I will just remove the sensor and rename NNSensor as sensor, but I wait for your approval. I happy because I increased the coverage...
@mdeff, could you tell me what is the problem with the build?
@nperraud I had this test failure as well. I think it is due to too many open matplotlib figures. No idea why that is a problem with python 3.6 only. I fixed it (merely a hack) in fe31116, but it reappeared here. Are you creating new figures?
@mdeff Yes, I am creating new figures in the doc of the learning module. Should we merge?
Now that I've found the build issue I'll fix it in master before merging, ;)
Is there no way to merge the two sensor graphs?
Honestly, if it was only up to me, I would just drop the old sensor graph and keep only my new implementation. But that would mean that the sensor function would behave differently with the new version. I do not believe that the options available in the sensor graph are that important.
The build issue was fixed in baf342a6aa31b54b544db87e30f518303a53f5b3 (#38).
I much prefer your implementation and agree to drop Sensor
and rename NNSensor
to Sensor
. The missing options regular
can be implemented in NNGraph
, and connected
in Sensor
, if somebody needs them later.
My only worry is that the weight distribution is quite different in the two implementations. Is that due to the scaling of the Gaussian, or something else?
Any reason for k=6
by default? NNGraph
sets k=10
.
Indeed the weight distribution is quite different. This is why I did not want to erase the old graph without proper approval. It is due to the number of neighbors, the fact that the new one is not a radius graph anymore and to the weighting of the Gaussian. I cannot say if one is better than the other. It will depend on the application.
k=6
was selected because I thought that a grid had 4 neighbors, so the number should be around that. Then I plotted a few graphs and 6 was kind of more beautiful IMHO. :-)
10 is a bit too connected, I think... But again, it depends on the application.
Finally merged! Sorry for the delay... Many thanks for your work @nperraud. Especially for your efforts in unit tests and documentation. :)
I've done the following before merging:
learning
module and its tests.I think that learning.py
and optimization.py
should be merged. Maybe optimization.py
is such in a poor state that it can be removed entirely...