kestra-io / kestra

Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
https://kestra.io

Webserver goes down on huge log #826

Open aurelienWls opened 1 year ago

aurelienWls commented 1 year ago

Expected Behavior

When a flow produces a lot of logs, the webserver should handle them gracefully.

Actual Behaviour

When a flow produces a lot of logs, the webserver sometimes goes down. Here is the stack trace:

Error occurred writing stream response: OpenSearch exception [type=circuit_breaking_exception, reason=[parent] Data too large, data for [<http_request>] would be [3035037630/2.8gb], which is larger than the limit of [3026295193/2.8gb], real usage: [3035037320/2.8gb], new bytes reserved: [310/310b], usages [request=0/0b, fielddata=239198/233.5kb, in_flight_requests=310/310b, accounting=5534678/5.2mb]]
org.opensearch.OpenSearchStatusException: OpenSearch exception [type=circuit_breaking_exception, reason=[parent] Data too large, data for [<http_request>] would be [3035037630/2.8gb], which is larger than the limit of [3026295193/2.8gb], real usage: [3035037320/2.8gb], new bytes reserved: [310/310b], usages [request=0/0b, fielddata=239198/233.5kb, in_flight_requests=310/310b, accounting=5534678/5.2mb]]
  at org.opensearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:207)
  at org.opensearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:2075)
  at org.opensearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:2052)
  at org.opensearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1771)
  at org.opensearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1724)
  at org.opensearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1692)
  at org.opensearch.client.RestHighLevelClient.scroll(RestHighLevelClient.java:1203)
  at io.kestra.repository.elasticsearch.AbstractElasticSearchRepository.scroll(AbstractElasticSearchRepository.java:443)
  at io.kestra.repository.elasticsearch.AbstractElasticSearchRepository.scroll(AbstractElasticSearchRepository.java:422)
  at io.kestra.repository.elasticsearch.ElasticSearchLogRepository.findByExecutionId(ElasticSearchLogRepository.java:100)
  at io.kestra.webserver.controllers.LogController.lambda$follow$3(LogController.java:93)
  at io.reactivex.internal.operators.flowable.FlowableCreate.subscribeActual(FlowableCreate.java:71)
  at io.reactivex.Flowable.subscribe(Flowable.java:14935)
  at io.reactivex.Flowable.subscribe(Flowable.java:14882)
  at io.micronaut.rxjava2.instrument.RxInstrumentedFlowable.subscribeActual(RxInstrumentedFlowable.java:57)
  at io.reactivex.Flowable.subscribe(Flowable.java:14935)
  at io.reactivex.internal.operators.flowable.FlowableDoOnLifecycle.subscribeActual(FlowableDoOnLifecycle.java:38)
  at io.reactivex.Flowable.subscribe(Flowable.java:14935)
  at io.reactivex.internal.operators.flowable.FlowableDoOnEach.subscribeActual(FlowableDoOnEach.java:50)
  at io.reactivex.Flowable.subscribe(Flowable.java:14935)
  at io.reactivex.Flowable.subscribe(Flowable.java:14885)
  at reactor.core.publisher.FluxSource.subscribe(FluxSource.java:67)
  at reactor.core.publisher.InternalFluxOperator.subscribe(InternalFluxOperator.java:62)
  at reactor.core.publisher.FluxSubscribeOn$SubscribeOnSubscriber.run(FluxSubscribeOn.java:194)
  at io.micronaut.reactive.re...
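
For context, the stack trace shows OpenSearch's parent circuit breaker rejecting the scroll request issued from LogController.follow: the tracked memory would exceed the configured limit (2.8 GB here). As a stopgap only, and purely as an assumption of a workaround rather than a fix, the breaker limit can be raised in opensearch.yml (or the JVM heap increased), which just moves the ceiling:

  # opensearch.yml -- workaround sketch, not a fix: it only raises the ceiling.
  # indices.breaker.total.limit defaults to 95% of the JVM heap when the
  # real-memory circuit breaker is enabled; raising it (or giving OpenSearch
  # more heap) gives the log scroll request more headroom.
  indices.breaker.total.limit: 98%

The real fix presumably has to happen on the Kestra side, e.g. paging or truncating logs before streaming them to the webserver.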

Steps To Reproduce

Create a flow that produces huge logs. It starts to fail with logs of around 10 MB to 15 MB (see the sketch below).
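
For illustration, a minimal flow along these lines should trigger it, assuming the Bash script task (io.kestra.core.tasks.scripts.Bash) is available in this version; the flow id, namespace and task id are placeholders:

  # Hypothetical flow that floods the execution with roughly 14-15 MB of log output.
  id: huge-log
  namespace: io.kestra.tests

  tasks:
    - id: spam-logs
      type: io.kestra.core.tasks.scripts.Bash
      commands:
        # ~150,000 lines of ~90 bytes each, i.e. around 14 MB of captured log output
        - 'for i in $(seq 1 150000); do echo "line $i $(printf "x%.0s" {1..80})"; done'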

Environment Information

Example flow

No response

brian-mulier-p commented 1 year ago

I guess this is the same issue we discussed, @loicmathieu; the idea would be to truncate logs if they are too large. @aurelienWls, does it cause any trouble without going to the Logs tab, or does the issue occur only when opening this tab?