kestra-io / kestra

:zap: Workflow Automation Platform. Orchestrate & Schedule code in any language, run anywhere, 500+ plugins. Alternative to Zapier, Rundeck, Camunda, Airflow...
https://kestra.io
Apache License 2.0

Large http head request failed #4658

Open tchiotludo opened 2 years ago

tchiotludo commented 2 years ago

Related to https://github.com/micronaut-projects/micronaut-core/issues/6927?

      - id: getHeader
        type: io.kestra.plugin.fs.http.Request
        method: HEAD
        uri: "https://bdnb-data.s3.fr-par.scw.cloud/bnb_export_metropole_sql_dump.tar.gz"

vasanthegde commented 2 years ago

We are facing the same issue. We are making a HEAD call that returns a Content-Length greater than 2 GB, and we are seeing a ContentLengthExceededException. At the very least, this exception does not make sense for a HEAD call, which has no response body; alternatively, max-content-length should be of type Long.
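
To illustrate the type concern: a Content-Length above 2 GB does not fit in a Java `int`, whose maximum is 2147483647. The sketch below is only an illustration with made-up class and method names (it is not the Micronaut or Netty code) and assumes a 3 GiB header value as an example; it shows that such a value parses as a `long` but not as an `int`.

```java
public class ContentLengthDemo {
    // Returns true if the header value can be represented as a Java int.
    public static boolean fitsInInt(String contentLength) {
        try {
            Integer.parseInt(contentLength);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String header = "3221225472"; // 3 GiB, an illustrative value (assumption)
        System.out.println(Integer.MAX_VALUE);      // 2147483647
        System.out.println(fitsInInt(header));      // false
        System.out.println(Long.parseLong(header)); // 3221225472
    }
}
```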

tchiotludo commented 2 years ago

@vasanthegde I have a feeling that we need to switch to another HTTP client that doesn't rely on Netty 😑 I need to dig in order to find a good one... Netty is everywhere 😑
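
As one possible direction, and only as a sketch: the JDK's built-in `java.net.http.HttpClient` (Java 11+) does not depend on Netty and exposes header values as `long`. The class and method names below are hypothetical, and this is not a proposal for how the plugin should be implemented, just a minimal example of a HEAD request that reads Content-Length without aggregating a body.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpHeaders;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.OptionalLong;

public class HeadRequestSketch {
    // Builds a HEAD request; method("HEAD", noBody()) works on Java 11+.
    public static HttpRequest buildHeadRequest(URI uri) {
        return HttpRequest.newBuilder(uri)
            .method("HEAD", HttpRequest.BodyPublishers.noBody())
            .build();
    }

    // Reads Content-Length as a long, so values above 2 GB are not a problem.
    public static OptionalLong contentLength(HttpHeaders headers) {
        return headers.firstValueAsLong("Content-Length");
    }

    // Sends the HEAD request and discards the (empty) body.
    public static OptionalLong headContentLength(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<Void> response =
            client.send(buildHeadRequest(uri), HttpResponse.BodyHandlers.discarding());
        return contentLength(response.headers());
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: HeadRequestSketch <url>");
            return;
        }
        System.out.println(headContentLength(URI.create(args[0])));
    }
}
```

Running it with the URL from the flow above as the first argument would print the reported Content-Length, if the server sends one.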

uncledata commented 1 year ago

I think I have a similar issue:

id: yellow_taxi_flow
namespace: dev

inputs:
  - name: monthYear
    type: STRING
    defaults: 2023-01

tasks:

  - id: download_api_data
    type: io.kestra.plugin.fs.http.Request
    uri: https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_{{ inputs.monthYear }}.parquet
    method: GET

  - id: shove_it_to_s3
    type: io.kestra.plugin.aws.s3.Upload
    from: "{{outputs.download_api_data.uri}}"
    key: "yellow_taxi/raw/{{ inputs.monthYear }}/yellow_taxi_{{ inputs.monthYear }}.parquet"
    bucket: "tomas-data-lake"
    region: eu-central-1
    accessKeyId: "{{envs.aws_access_key_id}}"
    secretKeyId: "{{envs.aws_secret_access_key}}"

triggers:
  - id: schedule
    type: io.kestra.core.models.triggers.types.Schedule
    cron: "0 8 1 * *"
    backfill:
      start: 2023-01-01T00:00:00Z

So I'm quickly running it for the one default month. The file is around 45 MB and I get this neat little error:

2023-07-12T18:43:54.405Z ERROR The received length exceeds the maximum allowed content length [10485760]
2023-07-12T18:43:54.405Z TRACE io.micronaut.http.client.exceptions.ContentLengthExceededException: The received length exceeds the maximum allowed content length [10485760]
    at io.micronaut.http.client.netty.DefaultHttpClient$BaseHttpResponseHandler.exceptionCaught(DefaultHttpClient.java:2038)
    at io.micronaut.http.client.netty.DefaultHttpClient$FullHttpResponseHandler.exceptionCaught(DefaultHttpClient.java:2292)
    at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:346)
    at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:325)
    at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:317)
    at io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:143)
    at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:346)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:447)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)
    at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318)
    at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1383)
    at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1246)
    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1295)
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529)
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Unknown Source)
    Suppressed: java.lang.Exception: #block terminated with an error
        at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:99)
        at reactor.core.publisher.Flux.blockFirst(Flux.java:2701)
        at io.micronaut.http.client.netty.DefaultHttpClient$1.exchange(DefaultHttpClient.java:499)
        at io.kestra.plugin.fs.http.Request.run(Request.java:92)
        at io.kestra.plugin.fs.http.Request.run(Request.java:23)
        at io.kestra.core.runners.Worker$WorkerThread.run(Worker.java:635)
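
For reference, the limit in the message, 10485760 bytes, is exactly 10 MiB, so a file of roughly 45 MB is well over it. The check below is only a hypothetical sketch of that comparison (the names are invented here, and it is not Micronaut's actual code); the 45 MB figure is treated as 45 MiB purely for illustration.

```java
public class LimitCheckSketch {
    // Value taken from the error message above.
    static final long MAX_CONTENT_LENGTH = 10_485_760L;

    // True when the received size is above the configured maximum.
    public static boolean exceedsLimit(long receivedBytes) {
        return receivedBytes > MAX_CONTENT_LENGTH;
    }

    public static void main(String[] args) {
        long approxFileSize = 45L * 1024 * 1024; // "around 45 MB" (assumption: MiB)
        System.out.println(MAX_CONTENT_LENGTH == 10L * 1024 * 1024); // true
        System.out.println(exceedsLimit(approxFileSize));            // true
    }
}
```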

Link to slack convo: https://kestra-io.slack.com/archives/C03FQKXRK3K/p1689187621098959

Ben8t commented 1 month ago

I'm still able to reproduce

id: myflowtesqta
namespace: company.team
tasks:

  - id: getHeader
    type: io.kestra.plugin.core.http.Request
    method: HEAD
    uri: "https://www.data.gouv.fr/fr/datasets/r/ad4bb2f6-0f40-46d2-a636-8d2604532f74"

@loicmathieu didn't we change something on HTTP this summer that could help solve this issue?

loicmathieu commented 1 month ago

No, under the hood we still use the Micronaut HTTP client, which uses the Netty HTTP client, which doesn't allow HEAD requests with a Content-Length of more than 2 GB.