choasma / HSIC-bottleneck

The HSIC Bottleneck: Deep Learning without Back-Propagation
https://arxiv.org/abs/1908.01580
MIT License

xvdp dev #5

Closed xvdp closed 4 years ago

xvdp commented 4 years ago

Hi Kurt, I used three functions from hsic.py and noticed a few things:

  1. You can allocate much less memory by using in-place functions. One shouldn't do that in the model, but in the loss function it is fine, because the loss is the source of the gradient and it simply copies the value to the grad (see the first sketch after this list).
  2. For some reason your kernelmat casts a new tensor inside the exp(-d/v) term. I removed that, so now if X's device is cuda, the device doesn't die. Also, recasting a FloatTensor introduces floating-point errors; they are not significant in this context, but I don't know how they would propagate (see the distmat/kernelmat sketch further below).
  3. I added a linting mute so my vscode doesn't flag torch members as errors.
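
A minimal sketch of what point 1 means, assuming the usual biased HSIC estimator; the names `center_` and `hsic_term` and the kernel widths are illustrative, not the repo's actual code. The in-place centering is applied only to the label kernel, which is built under `no_grad`: autograd raises an error if an in-place op overwrites a value it still needs for backward, so in-place ops in the loss are safe only where that is not the case.

```python
import torch

def center_(K):
    """Double-center K in place: K <- K - rowmean - colmean + mean.

    Saves the extra m x m temporaries of an out-of-place H @ K @ H, but is
    only safe on tensors whose pre-modification values autograd will never
    need (here, the label kernel built under no_grad)."""
    r = K.mean(dim=0, keepdim=True)
    c = K.mean(dim=1, keepdim=True)
    t = K.mean()
    return K.sub_(r).sub_(c).add_(t)

def hsic_term(z, y, sigma_z=5.0, sigma_y=5.0):
    """Illustrative HSIC estimate tr(Kz H Ky H) / (m - 1)^2 for a batch of
    activations z and one-hot float labels y (names and kernel widths are
    assumptions, not the repo's exact code)."""
    m = z.shape[0]
    Kz = torch.exp(-torch.cdist(z, z).pow(2) / sigma_z)   # stays in the graph
    with torch.no_grad():                                  # labels carry no grad
        Ky = center_(torch.exp(-torch.cdist(y, y).pow(2) / sigma_y))
    # tr(Kz @ (H Ky H)) equals tr(Kz H Ky H), so only Ky needs centering
    return torch.trace(Kz @ Ky) / float((m - 1) ** 2)
```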

If you want to look at the reasoning, tests, and timing, I have a Jupyter notebook in my dev branch (https://github.com/xvdp/HSIC-bottleneck/blob/xvdp_dev/jupyter/HSIC_Kernel.ipynb) where I tested the changes. I only changed the three functions I was using: distmat(), kernelmat(), and hsic_normalized_cca().
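
To make point 2 concrete, here is a hedged sketch of what distmat()/kernelmat() look like when no fresh tensor is cast inside the exponential: constants stay Python scalars, so every intermediate inherits X.device and X.dtype. The bodies are illustrative, not copied from the branch; the pylint directive is just one common way to do the linting mute from point 3.

```python
import torch

# pylint: disable=no-member  # one common mute for false positives on torch members (point 3)

def distmat(X):
    """Pairwise squared Euclidean distances between the rows of X.
    All intermediates inherit X.device and X.dtype, so nothing is pulled
    back to the CPU or recast to a new FloatTensor."""
    sq = X.pow(2).sum(dim=1, keepdim=True)   # (m, 1)
    D = sq + sq.t() - 2.0 * (X @ X.t())      # broadcasts to (m, m)
    return D.clamp_(min=0.0)                 # in-place is fine: D is a fresh buffer

def kernelmat(X, sigma):
    """Gaussian kernel exp(-d / sigma). Dividing by a plain Python float,
    rather than wrapping sigma in a newly cast tensor, keeps the result on
    X's device and avoids the extra floating-point round trip."""
    return torch.exp(-distmat(X) / float(sigma))
```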

xvdp commented 4 years ago

hm, I did the pull request from the wrong branch, which adds .gitignore and the jupyter folder. If you don't want that, reject it and I'll redo the pull request correctly, with only hsic.py.