keroro824 / HashingDeepLearning

Codebase for "SLIDE : In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems"
MIT License
1.07k stars 169 forks source link

DensifiedWtaHash and LSH questions #45

Closed rajesh-krishnan closed 10 months ago

rajesh-krishnan commented 11 months ago

Dear authors, thanks for this wonderful work. I think the following may be bugs in the code (or in my understanding) and I will appreciate a response.

* LSH.cpp:82 => why log(binsize) and not log2(binsize). It looks like we want K * floor(log2(binsize)) == RangePow 
   -- as the comment on DensifiedWtaHash.cpp:60 seems to indicate
   -- this (i.e., binary log) seems to be the intent in the shift logic in line 82 as well  
* DensifiedWtaHash:97-99,150-152 => should it be _numhashes or _numhashes -1? 
   -- Lines 101 and 154 would cause a memory violation.
*  DensifiedWtaHash:102 => why 100 and is it safe to break as `next` and `hashArray[i]` would be INT_MIN 
   -- which causes a serious problem later in LSH.cpp:82, where with zero shift 0x80000000 > (1<<RangePow) - 1
   --  a few steps later this can cause a segmentation violation in LSH.cpp:130
         o (this issue is easier to see in the OptAdd version in LNS which removes the needless saving of indices)