Closed juliangaal closed 5 years ago
Hi, thanks for your interest !
Looking at the error it seems that you don't have enough data points in your dataset.
Indeed, I managed to reproduce your error when using fewer than 256 points.
Have a look at the faiss PCA code: if `n`, the number of points, is greater than or equal to the point dimension `d_in`, then `PCAMat` has dimension `d_in * d_in` and the condition is true. However, if `n < d_in` (which I assume is your case), then `n` needs to be greater than `d_out` for the condition to hold. In a nutshell, it will throw the error if `n < d_out`; in your case it seems that your dataset size is less than 256.
It is a pure faiss error, so if I am not able to solve your issue you might post it on the faiss repository directly.
Hope it helps
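The constraint above can be checked with plain NumPy, without faiss: the covariance of `n` points has rank at most `min(n, d_in)`, so PCA cannot produce `d_out` components from fewer than `d_out` samples. A minimal sketch (the shapes below are illustrative, not the poster's actual data):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_in, d_out = 100, 512, 256  # fewer points than requested output dims

x = rng.standard_normal((n, d_in)).astype("float32")
# The SVD of the centered data yields at most min(n, d_in) components,
# so reducing 100 points to d_out = 256 dimensions is impossible.
u, s, vt = np.linalg.svd(x - x.mean(axis=0), full_matrices=False)
print(vt.shape[0])           # 100 usable components
print(vt.shape[0] >= d_out)  # False -> faiss would raise here
```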
Ah, great to hear, thanks a lot. Yes, my first test set is VERY small to just see if the whole pipeline works. Cheers
Hey, I had the same problem with a dataset of 100 pictures. In the code I am using, the function sets the `pca` variable to 256. You can drop this to, let's say, 50, and you will solve your problem as I did!
```python
def preprocess_features(npdata, pca=256):  # change this to 50
    mat = faiss.PCAMatrix(ndim, pca, eigen_power=-0.5)
    mat.train(npdata)
```
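Instead of hard-coding a smaller value, you could also cap the requested dimension at the dataset size. A sketch (`safe_pca_dim` is a hypothetical helper, not part of the repository):

```python
import numpy as np

def safe_pca_dim(npdata, requested=256):
    # Hypothetical guard: PCA cannot produce more components than there
    # are samples, so cap the output dimension at n.
    return min(requested, npdata.shape[0])

features = np.zeros((100, 4096), dtype="float32")  # e.g. 100 pictures
print(safe_pca_dim(features))  # 100
```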
You must make sure the size `n` of your dataset satisfies `n % batch_size == 0` or `n % batch_size >= d_out`.
I'm running into this error with my own dataset:
My environment setup matches your defined dependencies (except cuda10, which may become an issue...?).
Parameters I tested, which resulted in the same error:
Thanks for your help!