How to train preproc on GPU?

Arclabs001 commented 7 years ago

Hi, I have randomly generated a 10M x 400D array to index and search. I train the preproc with OPQ20_80, using train_preprocessor() exactly as in bench_gpu_1bn.py. The first time I trained it, it took only 602 s. However, when I trained it again, it ran for more than an hour without finishing, so I stopped it. I don't know how long I would have had to wait.

So, can I train it on GPU to accelerate the process?

Another issue: when I create an IndexIVFPQ index without preproc (i.e., 400D vectors and PQ20), the program returns an error like this:

RuntimeError: Error in void faiss::gpu::GpuIndexIVFPQ::verifySettings() const at GpuIndexIVFPQ.cu:469: Error: 'ivfpqConfig.usePrecomputedTables || IVFPQ::isSupportedNoPrecomputedSubDimSize( this->d / subQuantizers_)' failed: Number of dimensions per sub-quantizer (20) is unsupported with precomputed codes

Is there anything wrong with the dimensions? As 400 % 20 should be zero.

My runtime environment is K40*4.

Thanks.

The code is here: Code for Problem 1, Code for Problem 2

wickedfoo commented 7 years ago

The PCA and OPQ vector preprocessors are only implemented on the CPU at present. However, in that script they should only be training on 1000000 vectors, so it shouldn't take too long. Are you compiling with optimizations? How many CPU cores are on your machine? Also, the 1000000 was chosen for the original 1 billion vectors; with 10M vectors you could try decreasing the preprocessing training size to 100000 or so.
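As a minimal sketch of the "decrease the training size" suggestion, the helper below picks a random subset of row indices to train the preprocessor on. The function name `training_subset_indices`, the seed, and the choice of a random subset are assumptions made for illustration; this is not code from bench_gpu_1bn.py.

```python
import random


def training_subset_indices(n_total, n_train, seed=0):
    """Return sorted row indices of a random training subset.

    Hypothetical helper: picks n_train distinct rows out of n_total.
    If n_train >= n_total, every row is used.
    """
    if n_train >= n_total:
        return list(range(n_total))
    rng = random.Random(seed)  # fixed seed so repeated runs pick the same rows
    return sorted(rng.sample(range(n_total), n_train))


# Example: 100000 training rows out of a 10M-vector database.
idx = training_subset_indices(10_000_000, 100_000)
print(len(idx))
```

With a NumPy array `xb` holding the database vectors, `xb[idx]` would then be the array passed to the preprocessor's training call in place of the larger slice.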

As for the second error, the comment in the assertion is wrong; it should be "unsupported with not using precomputed codes". This particular sub-quantizer size (20 dimensions) hasn't been instantiated in the code, and I haven't yet written a generic-sized (but slower) fallback.

To handle 20 dims per code, add a case 20: here:

https://github.com/facebookresearch/faiss/blob/8e3dc6f2b0b736d043370eb1ee23892bcf2838b9/gpu/impl/PQCodeDistances.cu#L497

and a case 20: here

https://github.com/facebookresearch/faiss/blob/8e3dc6f2b0b736d043370eb1ee23892bcf2838b9/gpu/impl/PQScanMultiPassNoPrecomputed.cu#L39

Then recompile, and I think it should work. I'll fix the typo and add the 20 dim case on my end.
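To make the arithmetic behind the assertion concrete, here is a small sketch that computes the number of dimensions per sub-quantizer (the `this->d / subQuantizers_` term in the error). The helper name `dims_per_subquantizer` is hypothetical. It only shows that 400 / 20 divides evenly, so the failure comes from the resulting size of 20 not being instantiated, not from divisibility.

```python
def dims_per_subquantizer(d, m):
    """Return the number of dimensions handled by each PQ sub-quantizer.

    Hypothetical helper: d is the vector dimension seen by the PQ stage,
    m is the number of sub-quantizers. d must be a multiple of m.
    """
    if m <= 0:
        raise ValueError("number of sub-quantizers must be positive")
    if d % m != 0:
        raise ValueError(f"d={d} is not a multiple of m={m}")
    return d // m


# 400D vectors with PQ20 and no preprocessing: 20 dims per sub-quantizer,
# which is the size reported as unsupported in the error above.
print(dims_per_subquantizer(400, 20))

# With OPQ20_80 the PQ stage sees 80D vectors, so each sub-quantizer gets 4 dims.
print(dims_per_subquantizer(80, 20))
```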

Arclabs001 commented 7 years ago

Thanks. I have now reduced the dimension to 256 and it works fine. Thanks for your help.

facebookresearch / faiss

How to train preproc on GPU? #172