Closed duhaime closed 6 years ago
This feature was evidently reasonable enough to already be implemented! The elasticsearch driver's .insert_single_record() accepts a refresh_after argument -- setting it to True accomplishes the intended behavior. Thanks again for this great work!
This morning I tried inserting some images into an ElasticSearch database then querying for the inserted images. I was surprised to see that every single image yielded no matches, as I expected queries to at least return the trivial match where an image matches itself:
Eventually I realized that my queries were returning no results because the query seemed to be executing before the image was indexed. To test this hypothesis, I slept a bit between insertion and query, and got the matches I expected.
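The effect described here can be illustrated with a toy model. This is only a sketch of the general "written but not yet searchable" behavior, not ElasticSearch's or image-match's actual implementation; the ToyIndex class and its method names are hypothetical, and only the refresh_after parameter name is taken from this thread.

```python
class ToyIndex:
    """Toy stand-in for a near-real-time search index (hypothetical, not ElasticSearch)."""

    def __init__(self):
        self._pending = []      # records written but not yet visible to searches
        self._searchable = []   # records visible to searches

    def insert(self, record, refresh_after=False):
        # A write lands in the pending buffer first.
        self._pending.append(record)
        if refresh_after:
            self.refresh()

    def refresh(self):
        # A refresh moves every pending record into the searchable set.
        self._searchable.extend(self._pending)
        self._pending.clear()

    def search(self, record):
        # Searches only ever see refreshed records.
        return [r for r in self._searchable if r == record]


index = ToyIndex()
index.insert("image-a")
print(index.search("image-a"))   # nothing yet: inserted but not refreshed
index.refresh()
print(index.search("image-a"))   # now the record matches itself
```

In this model, sleeping between insert and query only helps because something else triggers the refresh in the meantime; passing refresh_after=True makes the record visible before insert returns.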
This behavior surprised me, probably because I'm new to ElasticSearch. To help others who expect synchronous behavior, would there be any interest in adding an optional parameter to the ElasticSearch driver's insert_single_record method that allows users to call self.es.refresh() to make the newly inserted record searchable? Evidently calling the refresh method makes the records inserted into an ES index searchable (ES team member Nik Everett says this method is automatically called once a second, but that's not fast enough for a simple loop like the one above).
Right now I'm having each process on my host insert records into a distinct index, then enter a while loop that sleeps until the number of docs in the index equals the number that have been inserted. It would be great if this kind of synchronous behavior could be a part of image-match's ElasticSearch wrapper. I'm happy to submit a PR if that sounds like a reasonable feature!
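The sleep-until-the-count-matches workaround could look roughly like the following. This is a minimal sketch, not the code actually used in this issue: wait_for_doc_count, count_docs, poll_interval and timeout are hypothetical names, and the timeout is an addition so the loop cannot spin forever if a record never arrives.

```python
import time


def wait_for_doc_count(count_docs, expected, poll_interval=0.1, timeout=30.0):
    """Poll until count_docs() reports at least `expected` documents.

    count_docs is any zero-argument callable returning the number of
    searchable documents in the index (hypothetical hook; supply your own).
    Returns True once the count is reached, False if `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        if count_docs() >= expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

With the official Python client, count_docs might be something like lambda: es.count(index=index_name)["count"], assuming es is an Elasticsearch client instance and index_name is the per-process index. Passing refresh_after=True on insert, as noted in the closing comment, avoids the need for this loop altogether.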