facebookresearch / faiss

A library for efficient similarity search and clustering of dense vectors.
https://faiss.ai
MIT License

Implement timeout for slow functions #3351

Closed: mdouze closed this issue 5 months ago

mdouze commented 8 months ago

In Faiss, slow functions periodically call InterruptCallback::check():

https://github.com/facebookresearch/faiss/blob/main/faiss/impl/AuxIndexStructures.h#L152

The callback is arbitrary. It does nothing by default. In Python there is an InterruptCallback that checks whether the user pressed Ctrl-C.

It would be useful to implement an InterruptCallback that breaks out of a function that runs for too long, e.g. a clustering.
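To make the polling pattern concrete, here is a minimal stand-alone sketch. It does not use the Faiss headers: MockInterruptCallback, CountingCallback and slow_function are hypothetical names invented for illustration, and throwing std::runtime_error from check() is an assumption of this mock, not a statement about what Faiss does.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for the Faiss interface (assumption: not the real header).
struct MockInterruptCallback {
    virtual bool want_interrupt() = 0;
    virtual ~MockInterruptCallback() = default;

    // Currently installed callback; empty means "never interrupt".
    static std::unique_ptr<MockInterruptCallback> instance;

    // Slow functions call this periodically.
    static void check() {
        if (instance && instance->want_interrupt()) {
            throw std::runtime_error("computation interrupted");
        }
    }
};
std::unique_ptr<MockInterruptCallback> MockInterruptCallback::instance;

// Example callback: requests an interruption once `remaining` checks have passed.
struct CountingCallback : MockInterruptCallback {
    int remaining;
    explicit CountingCallback(int n) : remaining(n) {}
    bool want_interrupt() override { return --remaining < 0; }
};

// A "slow" function that polls the callback once per iteration.
// Returns the number of iterations completed before any interruption.
int slow_function(int n_iter) {
    assert(n_iter >= 0);
    int done = 0;
    try {
        for (int i = 0; i < n_iter; i++) {
            MockInterruptCallback::check();
            done++;
        }
    } catch (const std::runtime_error&) {
        // Interrupted: stop early and report the progress made so far.
    }
    return done;
}
```

With no callback installed the loop runs to completion; once a callback is installed, the same loop stops as soon as want_interrupt() returns true. A timeout is then just another want_interrupt() implementation.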

mdouze commented 7 months ago

  1. C++ only first
  2. Support timeout from Python

mdouze commented 7 months ago

For the C++ version, this can be done with just one additional test. It requires subclassing InterruptCallback and implementing want_interrupt so that it returns true if more than x seconds have elapsed since the counter was reset.

Something like:

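#include <chrono>
#include <faiss/impl/AuxIndexStructures.h>

struct TimeoutCallback : faiss::InterruptCallback {
    double t0 = 0;      // time of the last reset, in seconds
    double timeout = 0; // allowed duration in seconds; 0 disables the check

    // monotonic clock reading, in seconds
    static double get_time() {
        using namespace std::chrono;
        return duration<double>(steady_clock::now().time_since_epoch()).count();
    }

    // restart the counter and allow t more seconds
    void set_timeout(double t) {
        t0 = get_time();
        timeout = t;
    }

    bool want_interrupt() override {
        if (timeout == 0) {
            return false;
        }
        if (get_time() - t0 > timeout) {
            timeout = 0; // fire only once per set_timeout call
            return true;
        }
        return false;
    }
};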

This can be tested by running a very long k-means clustering (see the Clustering object).
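The shape of such a test could look like the following stand-alone sketch. It deliberately avoids the Faiss headers so that it compiles on its own: CallbackSketch, TimeoutSketch and fake_clustering are hypothetical names, and fake_clustering is only a sleep loop standing in for a long k-means run, not the Clustering object itself.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for the callback interface (assumption: not the Faiss header).
struct CallbackSketch {
    virtual bool want_interrupt() = 0;
    virtual ~CallbackSketch() = default;
};

// Timeout callback following the idea above, using a monotonic clock.
struct TimeoutSketch : CallbackSketch {
    using Clock = std::chrono::steady_clock;
    Clock::time_point t0;
    double timeout = 0; // seconds; 0 disables the check

    void set_timeout(double t) {
        t0 = Clock::now();
        timeout = t;
    }

    bool want_interrupt() override {
        if (timeout == 0) {
            return false;
        }
        std::chrono::duration<double> elapsed = Clock::now() - t0;
        if (elapsed.count() > timeout) {
            timeout = 0; // fire only once per set_timeout call
            return true;
        }
        return false;
    }
};

// Stand-in for a long clustering: each "iteration" polls the callback,
// then sleeps 5 ms. Returns the number of iterations completed.
int fake_clustering(CallbackSketch& cb, int n_iter) {
    assert(n_iter >= 0);
    int done = 0;
    for (int i = 0; i < n_iter; i++) {
        if (cb.want_interrupt()) {
            break;
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(5));
        done++;
    }
    return done;
}
```

A test would then set a short timeout, start a job that would otherwise take far longer, and check that it stopped early while still making some progress. The same checks should carry over to a real k-means run, with generous bounds because the timing depends on the machine.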

keeth commented 2 weeks ago

looks like this is a one-per-process global timeout, not compatible with multiple threads calling into FAISS, is that right?