basho / riak

Riak is a decentralized datastore from Basho Technologies.
http://docs.basho.com
Apache License 2.0
3.95k stars 537 forks source link

Riak KV. Infinite index cycle for string fields are above 32k. [JIRA: RIAK-3278] #895

Open avsevolodov opened 7 years ago

avsevolodov commented 7 years ago

Riak KV v 2.2, ubuntu 14, cluster size - 3, enable search = on. riak.conf is default, but:

I have a custom index scheme with dynamic configuration for the field IndexingFields_map.alltext_s_set:

Any document has an error - size of value in this field greate 32k limit. Riak put operation is ok (update map) for this document. But Riak can't index this value in Solr, and every few seconds retry index it in solr. This is infinite cycle, without any other modifications for this document.

2017-02-20 08:10:48,757 [ERROR] @SolrException.java:142 null:org.apache.solr.common.SolrException: Exception writing document id 1mapsXXXXX*23 to the index; possible analysis error. at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:169) at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69) at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51) at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:952) at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:692) at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:141) at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:106) at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:68) at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:99) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1976) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:777) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:368) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489) at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240) at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.IllegalArgumentException: Document contains at least one immense term in field="IndexingFields_map.alltext_s_set" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[60, 116, 97, 98, 108, 101, 62, 32, 60, 116, 114, 62, 32, 60, 116, 100, 32, 97, 108, 105, 103, 110, 61, 34, 108, 101, 102, 116, 34, 62]...', original message: bytes can be at most 32766 in length; got 58635 at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:687) at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:359) at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:318) at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:241) at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:465) at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1526) at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:240) at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:164) ... 39 more Caused by: org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException: bytes can be at most 32766 in length; got 58635 at org.apache.lucene.util.BytesRefHash.add(BytesRefHash.java:284) at org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:151) at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:663) ... 46 more