haifengl / smile

Statistical Machine Intelligence & Learning Engine
https://haifengl.github.io

Takes more memory for LSH model in NearestNeighborSearch #745

Closed: manju22412 closed this issue 10 months ago

manju22412 commented 11 months ago

I am using LSH nearest-neighbor search. The model takes around 86 MB to store after training on 20,000 records. Is there any way to reduce the size?

Could anyone please reply to the issue above?

haifengl commented 11 months ago

Multi-probe LSH uses more memory by design. If memory is a concern, try plain LSH.

manju22412 commented 11 months ago

I ran into this with plain LSH only. One more thing: I sometimes get "Input vector sizes are different" when I run inference with the model. Could you explain why this happens?

haifengl commented 10 months ago

The vectors you pass at inference time have a different size from the vectors in your training data.
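
A minimal sketch of how the mismatch could be caught before querying, assuming the index was built from dense `double[]` rows that all share one length. The class `DimensionCheck` and the method `fitToDimension` are hypothetical helpers written for this illustration, not part of Smile. Zero-padding a shorter vector is only reasonable if the missing trailing features really are zero, as with sparse data.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Smile: aligns a query vector with the
// dimensionality of the training data before it is passed to a search call.
final class DimensionCheck {

    /**
     * Returns a copy of v with exactly dim entries.
     * Shorter vectors are zero-padded; longer vectors are rejected,
     * because truncating them would silently drop features.
     */
    static double[] fitToDimension(double[] v, int dim) {
        if (v.length > dim) {
            throw new IllegalArgumentException(
                "Query has " + v.length + " features but the index was built with " + dim);
        }
        return Arrays.copyOf(v, dim); // copyOf pads the tail with 0.0
    }

    public static void main(String[] args) {
        double[][] train = {{1.0, 0.0, 2.0}, {0.5, 1.5, 0.0}};
        int dim = train[0].length;      // dimensionality used at training time

        double[] query = {3.0, 4.0};    // one feature short
        double[] fixed = fitToDimension(query, dim);
        System.out.println(Arrays.toString(fixed)); // prints [3.0, 4.0, 0.0]
    }
}
```

Logging `query.length` next to the training dimensionality for each failing case would also show whether the failures line up with shorter or longer inputs.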

manju22412 commented 10 months ago

I used the libsvm format for both training and inference. I get output for only a few cases; the rest fail with "Input vector sizes are different". When I pass a value other than 1017881 on the line below, the same error is displayed.

LSH<double[]> lsh = new LSH<>(x, x, 4.0, 1017881);
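
One possible cause, stated here as an assumption rather than a diagnosis: libsvm is a sparse format, so each line lists only its non-zero features, and a row converted to a dense array on its own can end up shorter than the training rows. The sketch below parses a libsvm line into a dense vector of a fixed length `dim`. The class `LibsvmDense` and the method `parseFeatures` are hypothetical names for this illustration and do not come from Smile; the parser ignores the label and does not handle comments.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Smile: turns one libsvm line into a dense
// vector of a fixed length so that training and query rows always match.
final class LibsvmDense {

    /**
     * Parses "label idx:val idx:val ..." (indices are 1-based) into a dense
     * array of length dim. Features not listed on the line stay 0.0.
     */
    static double[] parseFeatures(String line, int dim) {
        double[] x = new double[dim];
        String[] tokens = line.trim().split("\\s+");
        for (int i = 1; i < tokens.length; i++) {       // token 0 is the label
            String[] kv = tokens[i].split(":");
            int index = Integer.parseInt(kv[0]) - 1;    // convert to 0-based
            if (index < 0 || index >= dim) {
                throw new IllegalArgumentException(
                    "Feature index " + (index + 1) + " is outside 1.." + dim);
            }
            x[index] = Double.parseDouble(kv[1]);
        }
        return x;
    }

    public static void main(String[] args) {
        int dim = 5; // assumed to be the feature count used at training time
        double[] x = parseFeatures("1 1:0.5 4:2.0", dim);
        System.out.println(Arrays.toString(x)); // prints [0.5, 0.0, 0.0, 2.0, 0.0]
    }
}
```

Using the same `dim` for the training file and for every query keeps the two sides consistent regardless of which features happen to be non-zero in a given row.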

haifengl commented 10 months ago

The parameter 1017881 has nothing to do with your data vector size.

manju22412 commented 10 months ago

But it is still an issue for me.