EsperoTech / yaade

Yaade is an open-source, self-hosted, collaborative API development environment.
MIT License
1.59k stars, 70 forks

Issues while fetching the collections with 100s of requests #200

Closed: 88K closed this issue 1 month ago

88K commented 1 month ago

Hello @jonrosner

We have recently been facing the following issues with Yaade.

  1. There are hundreds of requests on our Yaade instance. After logging in as the user "admin", it takes several minutes to show all the collections (their requests never get displayed on the screen), and the tab needs to be refreshed multiple times.

  2. The same behaviour is seen for other users who have multiple collections with hundreds of requests. After login, the list of collections never appears without a refresh, and after several refreshes of the tab we only see the requests of one particular collection (all other collections don't display any requests).

This has happened across multiple users; users with fewer collections and requests are not facing any such issues.

Note: we have allocated enough resources to the host machine where this Yaade container is deployed.

Errors/Warnings captured from the container:

Oct 08, 2024 2:05:53 PM io.vertx.core.impl.BlockedThreadChecker
WARNING: Thread Thread[vert.x-eventloop-thread-0,5,main] has been blocked for 2306 ms, time limit is 2000 ms
Oct 08, 2024 2:05:54 PM io.vertx.core.impl.BlockedThreadChecker
WARNING: Thread Thread[vert.x-eventloop-thread-0,5,main] has been blocked for 3305 ms, time limit is 2000 ms
Oct 08, 2024 2:05:55 PM io.vertx.core.impl.BlockedThreadChecker
WARNING: Thread Thread[vert.x-eventloop-thread-0,5,main] has been blocked for 4318 ms, time limit is 2000 ms
Oct 08, 2024 2:05:56 PM io.vertx.core.impl.BlockedThreadChecker
WARNING: Thread Thread[vert.x-eventloop-thread-0,5,main] has been blocked for 5318 ms, time limit is 2000 ms
io.vertx.core.VertxException: Thread blocked

Oct 08, 2024 2:05:58 PM io.vertx.core.impl.BlockedThreadChecker
WARNING: Thread Thread[vert.x-eventloop-thread-0,5,main] has been blocked for 7318 ms, time limit is 2000 ms
io.vertx.core.VertxException: Thread blocked
        at java.base@11.0.24/java.lang.StringCoding.encodeUTF8_UTF16(StringCoding.java:975)
        at java.base@11.0.24/java.lang.StringCoding.encodeUTF8(StringCoding.java:907)
        at java.base@11.0.24/java.lang.StringCoding.encode(StringCoding.java:449)
        at java.base@11.0.24/java.lang.String.getBytes(String.java:964)
        at app//io.vertx.core.buffer.impl.BufferImpl.<init>(BufferImpl.java:84)
        at app//io.vertx.core.buffer.impl.BufferImpl.<init>(BufferImpl.java:88)
        at app//io.vertx.core.buffer.impl.BufferImpl.buffer(BufferImpl.java:50)
        at app//io.vertx.core.buffer.Buffer.buffer(Buffer.java:72)
        at app//io.vertx.core.http.impl.Http1xServerResponse.end(Http1xServerResponse.java:389)
        at app//io.vertx.ext.web.RoutingContext.end(RoutingContext.java:911)
        at app//com.espero.yaade.server.routes.CollectionRoute.getAllCollections(CollectionRoute.kt:31)
        at app//com.espero.yaade.server.Server$restartServer$8.invoke(Server.kt:95)
        at app//com.espero.yaade.server.Server$restartServer$8.invoke(Server.kt:95)
        at app//com.espero.yaade.server.utils.VertxUtilsKt$authorizedCoroutineHandler$1$1.invokeSuspend(VertxUtils.kt:55)
        at app//kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
        at app//kotlinx.coroutines.internal.DispatchedContinuationKt.resumeCancellableWith(DispatchedContinuation.kt:367)
        at app//kotlinx.coroutines.intrinsics.CancellableKt.startCoroutineCancellable(Cancellable.kt:30)
        at app//kotlinx.coroutines.intrinsics.CancellableKt.startCoroutineCancellable$default(Cancellable.kt:25)
        at app//kotlinx.coroutines.CoroutineStart.invoke(CoroutineStart.kt:110)
        at app//kotlinx.coroutines.AbstractCoroutine.start(AbstractCoroutine.kt:126)
        at app//kotlinx.coroutines.BuildersKt__Builders_commonKt.launch(Builders.common.kt:56)
        at app//kotlinx.coroutines.BuildersKt.launch(Unknown Source)
        at app//kotlinx.coroutines.BuildersKt__Builders_commonKt.launch$default(Builders.common.kt:47)
        at app//kotlinx.coroutines.BuildersKt.launch$default(Unknown Source)
        at app//com.espero.yaade.server.utils.VertxUtilsKt.authorizedCoroutineHandler$lambda-1(VertxUtils.kt:53)
        at app//com.espero.yaade.server.utils.VertxUtilsKt$$Lambda$174/0x000000080032e040.handle(Unknown Source)
        at app//io.vertx.ext.web.impl.RouteState.handleContext(RouteState.java:1285)
        at app//io.vertx.ext.web.impl.RoutingContextImplBase.iterateNext(RoutingContextImplBase.java:140)
        at app//io.vertx.ext.web.impl.RoutingContextWrapper.next(RoutingContextWrapper.java:200)
        at app//io.vertx.ext.web.validation.impl.ValidationHandlerImpl.lambda$handle$5(ValidationHandlerImpl.java:168)
        at app//io.vertx.ext.web.validation.impl.ValidationHandlerImpl$$Lambda$372/0x00000008004c3040.handle(Unknown Source)
        at app//io.vertx.core.impl.future.SucceededFuture.onComplete(SucceededFuture.java:81)
        at app//io.vertx.ext.web.validation.impl.ValidationHandlerImpl.handle(ValidationHandlerImpl.java:159)
        at app//io.vertx.ext.web.validation.impl.ValidationHandlerImpl.handle(ValidationHandlerImpl.java:18)
        at app//io.vertx.ext.web.impl.RouteState.handleContext(RouteState.java:1285)
        at app//io.vertx.ext.web.impl.RoutingContextImplBase.iterateNext(RoutingContextImplBase.java:140)
        at app//io.vertx.ext.web.impl.RoutingContextWrapper.next(RoutingContextWrapper.java:200)
        at app//io.vertx.ext.web.handler.impl.ResponseContentTypeHandlerImpl.handle(ResponseContentTypeHandlerImpl.java:54)
        at app//io.vertx.ext.web.handler.impl.ResponseContentTypeHandlerImpl.handle(ResponseContentTypeHandlerImpl.java:28)
        at app//io.vertx.ext.web.impl.RouteState.handleContext(RouteState.java:1285)
        at app//io.vertx.ext.web.impl.RoutingContextImplBase.iterateNext(RoutingContextImplBase.java:177)
        at app//io.vertx.ext.web.impl.RoutingContextWrapper.next(RoutingContextWrapper.java:200)
        at app//io.vertx.ext.web.handler.impl.LoggerHandlerImpl.handle(LoggerHandlerImpl.java:189)
        at app//io.vertx.ext.web.handler.impl.LoggerHandlerImpl.handle(LoggerHandlerImpl.java:48)
        at app//com.espero.yaade.server.Server.restartServer$lambda-0(Server.kt:76)
        at app//com.espero.yaade.server.Server$$Lambda$171/0x000000080032f040.handle(Unknown Source)
        at app//io.vertx.ext.web.impl.RouteState.handleContext(RouteState.java:1285)
        at app//io.vertx.ext.web.impl.RoutingContextImplBase.iterateNext(RoutingContextImplBase.java:140)
        at app//io.vertx.ext.web.impl.RoutingContextWrapper.next(RoutingContextWrapper.java:200)
        at app//io.vertx.ext.web.handler.impl.BodyHandlerImpl.handle(BodyHandlerImpl.java:95)
        at app//io.vertx.ext.web.handler.impl.BodyHandlerImpl.handle(BodyHandlerImpl.java:45)
        at app//io.vertx.ext.web.impl.RouteState.handleContext(RouteState.java:1285)
        at app//io.vertx.ext.web.impl.RoutingContextImplBase.iterateNext(RoutingContextImplBase.java:140)
        at app//io.vertx.ext.web.impl.RoutingContextWrapper.next(RoutingContextWrapper.java:200)
        at app//io.vertx.ext.web.handler.impl.SessionHandlerImpl.lambda$handle$7(SessionHandlerImpl.java:316)
        at app//io.vertx.ext.web.handler.impl.SessionHandlerImpl$$Lambda$371/0x00000008004c3c40.handle(Unknown Source)
        at app//io.vertx.core.impl.future.FutureImpl$1.onSuccess(FutureImpl.java:91)
        at app//io.vertx.core.impl.future.FutureBase.emitSuccess(FutureBase.java:66)
        at app//io.vertx.core.impl.future.FutureImpl.addListener(FutureImpl.java:231)
        at app//io.vertx.core.impl.future.FutureImpl.onSuccess(FutureImpl.java:87)
        at app//io.vertx.ext.web.handler.impl.SessionHandlerImpl.handle(SessionHandlerImpl.java:290)
        at app//io.vertx.ext.web.handler.impl.SessionHandlerImpl.handle(SessionHandlerImpl.java:37)
        at app//io.vertx.ext.web.impl.RouteState.handleContext(RouteState.java:1285)
        at app//io.vertx.ext.web.impl.RoutingContextImplBase.iterateNext(RoutingContextImplBase.java:177)
        at app//io.vertx.ext.web.impl.RoutingContextWrapper.next(RoutingContextWrapper.java:200)
        at app//io.vertx.ext.web.impl.RouterImpl.handleContext(RouterImpl.java:250)
        at app//io.vertx.ext.web.impl.RouteImpl$$Lambda$229/0x0000000800357440.handle(Unknown Source)
        at app//io.vertx.ext.web.impl.RouteState.handleContext(RouteState.java:1285)
        at app//io.vertx.ext.web.impl.RoutingContextImplBase.iterateNext(RoutingContextImplBase.java:177)
        at app//io.vertx.ext.web.impl.RoutingContextImpl.next(RoutingContextImpl.java:143)
        at app//io.vertx.ext.web.impl.RouterImpl.handle(RouterImpl.java:68)
        at app//io.vertx.ext.web.impl.RouterImpl.handle(RouterImpl.java:37)
        at app//io.vertx.core.http.impl.Http1xServerRequestHandler.handle(Http1xServerRequestHandler.java:67)
        at app//io.vertx.core.http.impl.Http1xServerRequestHandler.handle(Http1xServerRequestHandler.java:30)
        at app//io.vertx.core.impl.ContextImpl.emit(ContextImpl.java:335)
        at app//io.vertx.core.impl.DuplicatedContext.emit(DuplicatedContext.java:176)
        at app//io.vertx.core.http.impl.Http1xServerConnection.handleMessage(Http1xServerConnection.java:174)
        at app//io.vertx.core.net.impl.ConnectionBase.read(ConnectionBase.java:159)
        at app//io.vertx.core.net.impl.VertxHandler.channelRead(VertxHandler.java:153)
        at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
        at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at app//io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at app//io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:93)
        at app//io.netty.handler.codec.http.websocketx.extensions.WebSocketServerExtensionHandler.onHttpRequestChannelRead(WebSocketServerExtensionHandler.java:160)
        at app//io.netty.handler.codec.http.websocketx.extensions.WebSocketServerExtensionHandler.channelRead(WebSocketServerExtensionHandler.java:83)
        at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
        at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at app//io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at app//io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346)
        at app//io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318)
        at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
        at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at app//io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at app//io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
        at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
        at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at app//io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
        at app//io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
        at app//io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
        at app//io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
        at app//io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
        at app//io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
        at app//io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
        at app//io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at app//io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base@11.0.24/java.lang.Thread.run(Thread.java:829)

jonrosner commented 1 month ago

Thank you for raising this issue. It is hard to debug this externally. Do you have direct access to the container? If so, can you check the size of the Yaade Docker volume, e.g. by running docker system df -v and searching for yaade?

Can you also post the specs of your machine, including the underlying hardware and especially the disk types? If it is a rented virtual server, maybe I can try it myself.

For context: I currently have an instance running with > 5k requests on the smallest Hetzner machine, and it loads in under a second.

88K commented 1 month ago

Yaade docker-volume size is around 325 MB.

Machine specs:

AWS r5.large (2 vCPU and 16 GB RAM)

jonrosner commented 1 month ago

325 MB is pretty big for a Yaade database. My guess would be that this is around 10k requests. Or do your requests have very big payloads or responses? If you are able to load the collections, you could go to the last request created and check its ID. It will roughly correlate with the number of requests.

There are a few things that could go wrong at this size, like the container running out of heap space. Are there any memory limits on the container, or does it have access to the full 16 GB?

88K commented 1 month ago

I appreciate your help.

The most recently created request has ID 2476. Yes, there are many requests with very big payloads and responses.

Just for testing purposes, I have now switched to a c6a.2xlarge (8 vCPU and 16 GB RAM), but there is no significant improvement.

Only the Yaade container is running on the AWS EC2 instance, so the container can use all resources without any restriction.
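
A rough back-of-the-envelope check of the figures reported so far, as a sketch only. It assumes the 325 MB volume is dominated by request data, that the latest request ID approximates the number of requests (deleted requests would make it an overestimate), and that 1 MB is 10^6 bytes. None of these assumptions is confirmed in the thread.

```python
# Rough estimate only; all three constants are figures quoted in this thread.
VOLUME_BYTES = 325 * 1_000_000  # reported volume size (1 MB assumed to be 10^6 bytes)
LATEST_REQUEST_ID = 2476        # reported ID of the most recently created request
GUESSED_REQUESTS = 10_000       # maintainer's guess for a 325 MB database


def average_bytes_per_request(volume_bytes: int, request_count: int) -> float:
    """Average stored size per request, assuming requests dominate the volume."""
    return volume_bytes / request_count


observed = average_bytes_per_request(VOLUME_BYTES, LATEST_REQUEST_ID)
expected = average_bytes_per_request(VOLUME_BYTES, GUESSED_REQUESTS)

print(f"observed: ~{observed / 1000:.0f} kB per request")               # ~131 kB
print(f"implied by the 10k guess: ~{expected / 1000:.1f} kB per request")  # ~32.5 kB
print(f"ratio: ~{observed / expected:.1f}x")                             # ~4.0x
```

Under these assumptions the average request is roughly four times larger than the 10k-request guess would imply, which is consistent with the report of very big payloads and responses.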

88K commented 1 month ago

@jonrosner - Thanks for your time.

As the machine itself is not showing any signs of resource throttling, I will check further under different network conditions, i.e. through the load balancer and over a direct connection to the private IP address, bypassing the load balancer.
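
One way to run the comparison described above is to time the same collections fetch over each network path and record the response size. The following is a minimal sketch, not a tested procedure for Yaade: the helper names (`measure`, `http_get`, `compare`), the URLs, the endpoint path, and the cookie value are all assumptions for illustration and would need to be replaced with values from the actual deployment.

```python
import time
import urllib.request
from typing import Callable, Dict, Tuple


def measure(fetch: Callable[[], bytes]) -> Tuple[float, int]:
    """Run fetch once and return (elapsed seconds, response size in bytes)."""
    start = time.perf_counter()
    body = fetch()
    return time.perf_counter() - start, len(body)


def http_get(url: str, cookie: str) -> bytes:
    """GET url with a session cookie and read the full response body."""
    request = urllib.request.Request(url, headers={"Cookie": cookie})
    with urllib.request.urlopen(request, timeout=300) as response:
        return response.read()


def compare(urls: Dict[str, str], cookie: str) -> None:
    """Print elapsed time and payload size for each labelled URL."""
    for label, url in urls.items():
        elapsed, size = measure(lambda: http_get(url, cookie))
        print(f"{label}: {elapsed:.2f} s, {size / 1_000_000:.1f} MB")


# Hypothetical usage; hosts, port, path and cookie are placeholders:
# compare(
#     {
#         "load balancer": "https://yaade.example.com/api/collection",
#         "direct": "http://10.0.0.5:9339/api/collection",
#     },
#     cookie="<session cookie copied from a logged-in browser>",
# )
```

If both paths take a similar time for a similar payload size, the load balancer is unlikely to be the main factor.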

jonrosner commented 1 month ago

@88K I see that you have closed this issue. Did you find a solution to your problem?