rnewson / couchdb-lucene

Enables full-text searching of CouchDB documents using Lucene
Apache License 2.0

java.io.IOException: Search timed out. #278

Open vintzl opened 5 years ago

vintzl commented 5 years ago

Why do I get this:

java.io.IOException: Search timed out.
    at com.github.rnewson.couchdb.lucene.DatabaseIndexer$IndexState.blockForLatest(DatabaseIndexer.java:171)
    at com.github.rnewson.couchdb.lucene.DatabaseIndexer$IndexState.borrowReader(DatabaseIndexer.java:87)
    at com.github.rnewson.couchdb.lucene.DatabaseIndexer$IndexState.borrowSearcher(DatabaseIndexer.java:107)
    at com.github.rnewson.couchdb.lucene.DatabaseIndexer.search(DatabaseIndexer.java:466)
    at com.github.rnewson.couchdb.lucene.LuceneServlet.doGetInternal(LuceneServlet.java:193)
    at com.github.rnewson.couchdb.lucene.LuceneServlet.doGet(LuceneServlet.java:171)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:584)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1182)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
    at org.eclipse.jetty.server.Server.handle(Server.java:539)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:333)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
    at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
    at java.base/java.lang.Thread.run(Thread.java:834)

with this index (formatted here for readability):

function(doc) {
    var ret = new Document();

    function idx(obj) {
        for (var key in obj) {
            switch (typeof obj[key]) {
                case 'object':
                    idx(obj[key]);
                    break;
                case 'function':
                    break;

                default:
                    ret.add(obj[key], { "store": "no" });
                    if (typeof obj[key] == "string") {
                        ret.add(obj[key], { "field": key, "type": "STRING", "store": "no" });
                    }
                    else if (typeof obj[key] == "number") {
                        ret.add(obj[key], { "field": key, "type": "INT", "store": "no" });
                    }

                    else {
                        ret.add(obj[key], { "field": key, "store": "no" });
                    }
                    break;
            }
        }
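    }

    idx(doc);    // index every property of the document, recursively
    return ret;  // the index function must return the Document it built
}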
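A minimal sketch for checking locally what an index function of this shape emits, which can help rule out an empty index as the cause of empty results. The `Document` class below is a hypothetical stand-in that only records `add` calls; it is not the object couchdb-lucene provides, and `sampleIndex` is a shortened illustration rather than the function above.

```javascript
// Hypothetical stand-in for the Document object available to index
// functions; it records each add() call so the output can be inspected.
class Document {
    constructor() {
        this.fields = [];
    }
    add(value, options = {}) {
        this.fields.push({ value: value, options: options });
    }
}

// Shortened illustration of a recursive index function (an assumption,
// not the original): every primitive value is added under its own key.
function sampleIndex(doc) {
    var ret = new Document();

    function idx(obj) {
        for (var key in obj) {
            var value = obj[key];
            if (typeof value === "object") {
                idx(value); // recurse into nested objects and arrays
            } else if (typeof value !== "function") {
                ret.add(value, { field: key, store: "no" });
            }
        }
    }

    idx(doc);
    return ret;
}

const indexed = sampleIndex({ _id: "a", title: "hello", meta: { pages: 3 } });
console.log(indexed.fields.map(function (f) { return f.options.field; }));
```

If the recorded list is empty for a document that clearly has fields, the index function itself is the first thing to look at.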

And if I try with &stale=ok, the result is empty:

{"q":"*:*","fetch_duration":0,"total_rows":0,"limit":25,"search_duration":0,"etag":"2760286b8c10","skip":0,"rows":[]}

AdrianTute commented 3 years ago

I also identified this behavior while testing with

This IOException (that is, the search timeout) has occurred since I began using default_security=everyone in CouchDB 3, which made my databases public (no _admin role for admins and members anymore). I can reproduce the error simply by changing the permissions of a database that I want to query afterwards. Example: set the admin roles to ["_admin"] and then back to [].
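The reproduction step could be scripted roughly as follows. This is a sketch under assumptions: it uses CouchDB's `PUT /{db}/_security` endpoint, the function names are hypothetical, the base URL and database name are placeholders, and authentication is left out.

```javascript
// Build the request for CouchDB's PUT /{db}/_security endpoint.
// The security object has "admins" and "members", each with names and roles.
function buildSecurityRequest(baseUrl, db, adminRoles) {
    const security = {
        admins: { names: [], roles: adminRoles },
        members: { names: [], roles: [] },
    };
    return {
        url: baseUrl + "/" + encodeURIComponent(db) + "/_security",
        options: {
            method: "PUT",
            headers: { "Content-Type": "application/json" },
            body: JSON.stringify(security),
        },
    };
}

// Set the admin roles to ["_admin"] and then back to [].
// fetchImpl defaults to the global fetch (available in recent Node versions).
async function toggleAdminRoles(baseUrl, db, fetchImpl = fetch) {
    for (const roles of [["_admin"], []]) {
        const request = buildSecurityRequest(baseUrl, db, roles);
        const response = await fetchImpl(request.url, request.options);
        if (!response.ok) {
            throw new Error("PUT " + request.url + " failed: " + response.status);
        }
    }
}

// Example (placeholder URL and database name, admin credentials required):
// toggleAdminRoles("http://localhost:5984", "mydb");
```

After running the toggle, a search without stale=ok against that database is the query that would be expected to time out.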

After some debugging of the CouchDB-Lucene code, it became apparent that the root of this timeout seems to be CouchDB's change sequences. In DatabaseIndexer, the private method blockForLatest (which runs only if stale != "ok") compares database.getLastSequence() with pending_seq. The method loops in the while (pending_seq.isEarlierThan(latest)) statement until it times out.

Long story short: these values differ. The change sequences of CouchDB behave incorrectly, so pending_seq never catches up and every subsequent search query times out.
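The behavior described above can be modelled with a small simulation. This is not the couchdb-lucene source: sequences are plain integers here (real CouchDB sequences are opaque strings), the clock is faked to keep the run deterministic, and all names other than blockForLatest are hypothetical.

```javascript
// Fake clock so the example runs instantly and deterministically.
function makeFakeClock() {
    let t = 0;
    return {
        now: function () { return t; },
        sleep: function (ms) { t += ms; },
    };
}

// Wait until the indexer's pending sequence reaches the database's
// latest sequence, or give up once the timeout has elapsed.
function blockForLatest(index, timeoutMs, clock) {
    const latest = index.latestSeq();
    const deadline = clock.now() + timeoutMs;
    while (index.pendingSeq() < latest) { // "pending is earlier than latest"
        if (clock.now() >= deadline) {
            throw new Error("Search timed out.");
        }
        clock.sleep(10); // give the indexer time to catch up
    }
}

// With stale=ok the wait is skipped and the current index is searched as is.
function search(index, stale, timeoutMs, clock) {
    if (stale !== "ok") {
        blockForLatest(index, timeoutMs, clock);
    }
    return { total_rows: index.rows.length, rows: index.rows };
}

// An index whose pending sequence never reaches the database's sequence.
const stuckIndex = {
    latestSeq: function () { return 5; },
    pendingSeq: function () { return 3; },
    rows: [],
};

let outcome;
try {
    search(stuckIndex, undefined, 100, makeFakeClock());
    outcome = "returned";
} catch (e) {
    outcome = e.message;
}
console.log(outcome);
console.log(search(stuckIndex, "ok", 100, makeFakeClock()).total_rows);
```

In this model the first call ends with the timeout error, while the stale=ok call returns immediately but with whatever the index currently holds, which matches the empty result reported earlier in the thread.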

I currently have a quick fix for that (and I am not happy with it):

Maybe that helps you at least, @vintzl. The reason it worked with "stale=ok" is that the search does not enter the while loop. However, the change sequences are still broken, so you don't get a result set. I am hoping for CouchDB 4 and its new security concept; v3 seems to be an intermediate step that introduces new security features.

Maybe you can find a solution for that behavior. I cannot say if it's a general issue of CouchDB or CouchDB-Lucene.