FelixGoetze closed this pull request 3 years ago.
Thank you for your contribution! Doing the multiplication on the GPU was something I had wanted to do, but I got distracted by other things... I didn't know about the inefficiency of using float16 on the CPU!
I'll merge the PR and then make some small changes to the notebook (for example, keeping the search output).
This changes the matrix multiplication in the `find_best_matches` function to use PyTorch. With an available CUDA GPU, it can run more than 100 times faster. When no GPU is available, the float16 NumPy arrays are converted to float32 tensors, which also runs faster due to better hardware support for float32 on the CPU (see also). Below are the runtime performances for the following command:
- PyTorch with float16 on Colab with GPU:
- PyTorch with float32 on Colab with CPU:
- Previous NumPy implementation using float16:
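For context, here is a minimal sketch of what the PyTorch-based multiplication could look like. The argument names and the `count` parameter are illustrative assumptions; the actual `find_best_matches` function in the notebook may be structured differently:

```python
import numpy as np
import torch

def find_best_matches(text_features, photo_features, count=5):
    # Hypothetical signature; the real function in the repo may differ.
    # Use float16 on the GPU if one is available; otherwise fall back
    # to float32, which has better hardware support on the CPU.
    if torch.cuda.is_available():
        device, dtype = "cuda", torch.float16
    else:
        device, dtype = "cpu", torch.float32

    text = torch.from_numpy(text_features).to(device=device, dtype=dtype)
    photos = torch.from_numpy(photo_features).to(device=device, dtype=dtype)

    # Cosine similarity via a single matrix multiplication
    # (assumes the features are already L2-normalized, as CLIP's are).
    similarities = (photos @ text.T).squeeze(1)

    # Indices of the best-matching photos, highest similarity first.
    best = torch.argsort(similarities, descending=True)
    return best[:count].cpu().numpy()
```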