Azure / azure-cosmosdb-java

Java Async SDK for SQL API of Azure Cosmos DB
MIT License
54 stars 61 forks source link

Exception in client when cosmosDb response headerSize > 8192 #24

Closed owennewo closed 6 years ago

owennewo commented 6 years ago

By default the io.netty.handler.codec.http.HttpObjectDecoder that you use has a maxHeaderSize of 8092. Headers in response from cosmosDb REST API (e.g. querying documents) can be far bigger than this. Particularly if the response has a continuation header.

In my case a simple select * from events e where e.type='LOGIN' was enough to hit this limit and therefore get the following error.

2200470 [rxnetty-nio-eventloop-3-2] DEBUG com.microsoft.azure.cosmosdb.rx.internal.query.Paginator  - Querying/Reading class com.microsoft.azure.cosmosdb.Documents
2200999 [rxnetty-nio-eventloop-3-2] DEBUG com.microsoft.azure.cosmosdb.rx.internal.ResourceThrottleRetryPolicy  - Operation will NOT be retried. Current attempt 0, Exception: {} 
io.netty.handler.codec.TooLongFrameException: HTTP header is larger than 8192 bytes.
    at io.netty.handler.codec.http.HttpObjectDecoder$HeaderParser.newException(HttpObjectDecoder.java:839)
    at io.netty.handler.codec.http.HttpObjectDecoder$HeaderParser.process(HttpObjectDecoder.java:831)
    at io.netty.buffer.AbstractByteBuf.forEachByteAsc0(AbstractByteBuf.java:1272)
    at io.netty.buffer.AbstractByteBuf.forEachByte(AbstractByteBuf.java:1252)
    at io.netty.handler.codec.http.HttpObjectDecoder$HeaderParser.parse(HttpObjectDecoder.java:803)
    at io.netty.handler.codec.http.HttpObjectDecoder.readHeaders(HttpObjectDecoder.java:603)
    at io.netty.handler.codec.http.HttpObjectDecoder.decode(HttpObjectDecoder.java:227)
    at io.netty.handler.codec.http.HttpClientCodec$Decoder.decode(HttpClientCodec.java:202)
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489)

The cosmosdb code is calling no-arg constructor on HttpClientPipelineConfigurator() which uses MAX_HEADER_SIZE_DEFAULT=8192

I think your RxDocumentClientBuilder class needs to have its CompositeHttpClientBuilder tweaked to somehow put a larger configuration. The following is untested but might be on the right track:

PipelineConfigurator clientPipelineConfigurator = new PipelineConfiguratorComposite<HttpClientResponse<ByteBuf>, HttpClientRequest<ByteBuf>>(new HttpClientPipelineConfigurator<ByteBuf, ByteBuf>(4096, 32768, true ), new HttpObjectAggregationConfigurator());

      CompositeHttpClientBuilder<ByteBuf, ByteBuf> builder = new CompositeHttpClientBuilder<ByteBuf, ByteBuf>()
              .withSslEngineFactory(new DefaultSSLEngineFactory())
              .withMaxConnections(connectionPolicy.getMaxPoolSize())
              .withIdleConnectionsTimeoutMillis(this.connectionPolicy.getIdleConnectionTimeoutInMillis())
              .pipelineConfigurator(clientPipelineConfigurator);
moderakh commented 6 years ago

Hi @owennewo thank you for reporting this here. I wonder if you captured the response header and what hits the header limit in your case. Is it the continuation token in header response which hits the limit?

owennewo commented 6 years ago

yes - its the continuation header response.

The query is hitting a partitioned cosmosdb. I'm not providing a partition key, instead I've set enableCrossPartitionQuery=true. I'm guessing these types of queries can have large continuation tokens!

moderakh commented 6 years ago

@owennewo the fix is as you suggested, we should set higher header size limit.

Are you interested in sending a Pull Request?

We will merge your contribution to our source code if you do so. :-)

owennewo commented 6 years ago

@moderakh - there is a pull request but its failed due to travice / CI

Caused by: java.lang.IllegalArgumentException: Empty key [ERROR] at javax.crypto.spec.SecretKeySpec.<init>(SecretKeySpec.java:96)

I've manually tested the code.

moderakh commented 6 years ago

@owennewo please ignore the failure that's fine. The reason the travis build is failing for you is for public PRs Travis (due to security reasons) doesn't resolve secret key for integration tests. We will look into that later.

Thanks for reporting the header issue, and your contribution 👍