haifengl / smile

Statistical Machine Intelligence & Learning Engine
https://haifengl.github.io

Takes more memory for LSH model in NearestNeighborSearch #745

Closed manju22412 closed 1 year ago

manju22412 commented 1 year ago

I am using LSH nearest neighbor search. The stored model takes around 86 MB after training on 20,000 records. Is there any way to reduce the size?

Could anyone please reply to the question above?

haifengl commented 1 year ago

Multi-probe LSH uses more memory by design. If memory is a concern, try plain LSH.

manju22412 commented 1 year ago

I got this issue while using plain LSH only. One more thing: inference sometimes fails with "Input vector sizes are different". Could you explain why this happens?

haifengl commented 1 year ago

Your inference vectors have a different size from your training vectors.
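
The thread does not show how the data are loaded, so the following is only a sketch of one way such a mismatch can arise and be avoided. libsvm files are sparse, so if each row is expanded to a dense `double[]` whose length depends on the largest feature index in that row, different rows end up with different lengths. The helper below is hypothetical (`LibsvmDense`, `toDense`, and `requireSameLength` are illustrative names, not part of Smile) and pads every row to one fixed dimension `dim`, which is assumed to be the dimension used at training time.

```java
/** Hypothetical helper for illustration; not part of Smile. */
final class LibsvmDense {
    private LibsvmDense() {}

    /**
     * Parses one libsvm line ("label idx:val idx:val ...") into a dense
     * vector with exactly dim entries. Feature indices are 1-based.
     */
    static double[] toDense(String line, int dim) {
        double[] v = new double[dim];                // absent features stay 0.0
        String[] tokens = line.trim().split("\\s+");
        for (int i = 1; i < tokens.length; i++) {    // tokens[0] is the label
            int colon = tokens[i].indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Malformed token: " + tokens[i]);
            }
            int index = Integer.parseInt(tokens[i].substring(0, colon));
            double value = Double.parseDouble(tokens[i].substring(colon + 1));
            if (index < 1 || index > dim) {
                throw new IllegalArgumentException(
                    "Feature index " + index + " is outside 1.." + dim);
            }
            v[index - 1] = value;
        }
        return v;
    }

    /** Fails early, with a clear message, if a query does not match the training dimension. */
    static void requireSameLength(double[][] train, double[] query) {
        if (train.length == 0) {
            throw new IllegalArgumentException("Training set is empty");
        }
        if (query.length != train[0].length) {
            throw new IllegalArgumentException(
                "Query has " + query.length + " entries, training vectors have "
                + train[0].length);
        }
    }
}
```

With something like this, both the training rows and each query would be built with the same `dim`, for example `LibsvmDense.toDense(line, dim)`, and `requireSameLength(x, q)` can be called before the nearest neighbor query to catch any remaining mismatch.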

manju22412 commented 1 year ago

I used the libsvm format for both training and inference. I get output for only a few cases; the rest fail with the error "Input vector sizes are different". When I pass a value other than 1017881 on the line below, the same error is displayed.

LSH<double[]> lsh = new LSH<>(x, x, 4.0, 1017881);

haifengl commented 1 year ago

That parameter (1017881) has nothing to do with your data vector size.

manju22412 commented 1 year ago

But it is an issue now.