reportportal / agent-java-cucumber2

Cucumber JVM version [2.0.0; 3.0.0) adapter
Apache License 2.0
1 stars 8 forks source link

Launches left IN PROGRESS when using agent-java-cucumber2:5.1.5 #46

Closed amalbaccar-hub closed 4 months ago

amalbaccar-hub commented 7 months ago

I am having this issue: launches keep showing IN PROGRESS state while they should've completed.

image

This issue isn't always reproducible. We are using this agent: agent-java-cucumber2:5.1.5

I noticed this new release which came as a fix for a similar issue to the one I'm reporting (https://github.com/reportportal/agent-js-webdriverio/issues/53 and https://github.com/reportportal/agent-js-webdriverio/issues/47): https://github.com/reportportal/agent-js-webdriverio/releases/tag/v5.1.0

@AlexGalichenko @DzmitryHumianiuk @HardNorth Probably same fix is needed for cucumber agent as well.

AmsterGet commented 7 months ago

Hi @amalbaccar-hub ! As far as I know, this problem can be caused by various reasons and depends on the agent implementation. Let's move this issue to the corresponding repo -> agent-java-cucumber2.

amalbaccar-hub commented 7 months ago

Hi @AmsterGet :) thank you for your response and for transferring the issue.

Yeah you're right still not able to detect the root cause. Hopefully I get help here soon cuz it's a blocking issue for us.

HardNorth commented 7 months ago

@amalbaccar-hub Any exception in the console? Any weird stuff?

amalbaccar-hub commented 7 months ago

@HardNorth I got this error in service-api log: com.epam.ta.reportportal.exception.ReportPortalException: Launch '53055ed4-a2f2-4236-86d0-2026a42b6c00' not found. Did you use correct Launch ID?

image

Could this be the root cause?

Please tell me if you need other logs. Thanks!

HardNorth commented 7 months ago

@amalbaccar-hub According to your stack trace that happens on call to getLaunchByUuid method: https://github.com/reportportal/service-api/blob/master/src/main/java/com/epam/ta/reportportal/ws/controller/LaunchController.java#L201C24-L201C39

Do you call it somewhere in your code? Since Client method named the same as server one.

amalbaccar-hub commented 7 months ago

@HardNorth yes I have a listener/subscriber to TestRunStarted & TestRunFinished cucumber events where I call getLaunchByUuid method as follow (we need it to show RP launch URL ):

LaunchResource launch = Objects.requireNonNull(rpService.getClient()).getLaunchByUuid(launchUuid).blockingGet();

However the original issue (of launch not finishing) happened even before adding this call.

HardNorth commented 7 months ago

@amalbaccar-hub I would recommend you double-check that you don't fail on these hooks. That definitely can be the case. Just disable them completely and observe. Because failures on hooks is not something Cucumber Agent can handle due to implementation issues.

amalbaccar-hub commented 7 months ago

@HardNorth As as said issue still there even after disabling hooks. In fact I commented code and made a test but launch stuck in IN PROGRESS state

image

However you can notice in the screenshot that the previous launch finished properly. it almost took 3 minutes after executing the last step in feature file. it's quite long wait time.

amalbaccar-hub commented 7 months ago

@HardNorth this the stacktrace I got when I do pause debugger (while it's hanging):

image

HardNorth commented 7 months ago

@amalbaccar-hub It's for execution with several forked process. You can actually disable it with rp.client.join=false, see here: https://github.com/reportportal/client-java?tab=readme-ov-file#multi-process-join-parameters

If it's really the source of the problem I would like to see a stacktrace to fix that.

amalbaccar-hub commented 7 months ago

@HardNorth Great thank you so much it was the source of problem. Here's a stacktrace:

2024-02-09 12:29:49.304 ERROR 1 --- [o-8585-exec-241] .t.r.c.e.r.ReportPortalExceptionResolver : ReportPortalExceptionResolver > Launch '361c70ca-e13f-451d-9c0f-f2758852573c' not found. Did you use correct Launch ID?

com.epam.ta.reportportal.exception.ReportPortalException: Launch '361c70ca-e13f-451d-9c0f-f2758852573c' not found. Did you use correct Launch ID?
        at com.epam.ta.reportportal.core.launch.impl.GetLaunchHandlerImpl.lambda$findLaunch$1(GetLaunchHandlerImpl.java:138)
        at java.base/java.util.Optional.orElseThrow(Optional.java:408)
        at com.epam.ta.reportportal.core.launch.impl.GetLaunchHandlerImpl.findLaunch(GetLaunchHandlerImpl.java:138)
        at com.epam.ta.reportportal.core.launch.impl.GetLaunchHandlerImpl.getLaunch(GetLaunchHandlerImpl.java:129)
        at com.epam.ta.reportportal.core.launch.impl.GetLaunchHandlerImpl$$FastClassBySpringCGLIB$$21f6db8c.invoke(<generated>)
        at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
        at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:689)
        at com.epam.ta.reportportal.core.launch.impl.GetLaunchHandlerImpl$$EnhancerBySpringCGLIB$$399587d9.getLaunch(<generated>)
        at com.epam.ta.reportportal.ws.controller.LaunchController.getLaunchByUuid(LaunchController.java:201)
        at com.epam.ta.reportportal.ws.controller.LaunchController$$FastClassBySpringCGLIB$$702dfa6.invoke(<generated>)
        at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
        at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:783)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
        at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:753)
        at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:123)
        at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:388)
        at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:119)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
        at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:753)
        at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:698)
        at com.epam.ta.reportportal.ws.controller.LaunchController$$EnhancerBySpringCGLIB$$c750aa93.getLaunchByUuid(<generated>)
        at jdk.internal.reflect.GeneratedMethodAccessor675.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205)
        at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:150)
        at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:117)
        at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:895)
        at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:808)
        at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1067)
        at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:963)
        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006)
        at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:898)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:626)
        at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:733)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:227)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
        at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
        at org.springframework.web.multipart.support.MultipartFilter.doFilterInternal(MultipartFilter.java:125)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:327)
        at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:115)
        at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:81)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
        at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:121)
        at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:115)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
        at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:126)
        at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:81)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
        at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:105)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
        at org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:149)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
        at org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:63)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
        at org.springframework.security.oauth2.provider.authentication.OAuth2AuthenticationProcessingFilter.doFilter(OAuth2AuthenticationProcessingFilter.java:180)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
        at org.springframework.security.web.authentication.logout.LogoutFilter.doFilter(LogoutFilter.java:103)
        at org.springframework.security.web.authentication.logout.LogoutFilter.doFilter(LogoutFilter.java:89)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
        at org.springframework.security.web.header.HeaderWriterFilter.doHeadersAfter(HeaderWriterFilter.java:90)
        at org.springframework.security.web.header.HeaderWriterFilter.doFilterInternal(HeaderWriterFilter.java:75)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
        at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:110)
        at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:80)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
        at org.springframework.security.web.context.request.async.WebAsyncManagerIntegrationFilter.doFilterInternal(WebAsyncManagerIntegrationFilter.java:55)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
        at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:211)
        at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:183)
        at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:354)
        at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:267)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
        at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
        at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
        at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.doFilterInternal(WebMvcMetricsFilter.java:96)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
        at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:97)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:542)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:143)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:78)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:346)
        at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:374)
        at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
        at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:887)
        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1684)
        at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.base/java.lang.Thread.run(Thread.java:829)
amalbaccar-hub commented 7 months ago

@HardNorth However now when running a whole test suite I got 2 parallel launches

image

Meanwhile you make a fix, is there a way/workaround to make it reporting to only one launch?

HardNorth commented 7 months ago

@amalbaccar-hub Server error won't help me, since it looks like client issue. I thought you caught something on the client side.

I can guess it is a filesystem read/write permission issue, since client uses it by default for synchronising Launch ID. Also there is a known issue that this mechanism does not work on Windows Subsystem for Linux (WSL). For workaround you can try return back rp.client.join=true and set rp.client.join.mode=SOCKET.

DzmitryHumianiuk commented 7 months ago

@HardNorth might be similar issue https://github.com/reportportal/reportportal/discussions/1250#discussioncomment-8366908

amalbaccar-hub commented 7 months ago

@DzmitryHumianiuk Thank you for your response, I've read the discussion you shared. I agree with your analysis and now I'm trying to check if the launch is present in the database.

I attached the service api log, I see some errors caused by RabbitMQ (I think I got these errors when I used the workaround provided by @HardNorth: rp.client.join=true & rp.client.join.mode=SOCKET and when I also tried to use rerun feature ). Please take a look.

service-api-log.txt

amalbaccar-hub commented 7 months ago

@HardNorth Thank you so much for the workaround ! and sorry for the late reply cuz I was busy with other stuff.

This workaround works when I run in debug mode from IntelliJ IDEA. However, when we run using release JAR file in command line the problem re-appears and the launches are not finished.

I tried to collect some threads dump/stacktrace, please take a look! threads_report.txt

I copied stacktrace while launch is stuck in progress state: stacktrace.txt

HardNorth commented 7 months ago

@amalbaccar-hub What kind of system do you use for release run? And can you share command example?

amalbaccar-hub commented 7 months ago

@HardNorth We use Jenkins to generate releases and our application is a Java CLI app running in JAVA 8 and gradle (windows).

java -Drp.endpoint=http://rp.evf.us/ -Drp.api.key=key_value -Drp.project=superadmin_personal -Drp.launch=smoke -Drp.attributes=smoke;ARC-Build:23.01.01;EX8000;ISAT-1.9.80 -Drp.client.join=true -Drp.client.join.mode=SOCKET -jar jevf-1.9.80.jar path/to/test_suite

amalbaccar-hub commented 7 months ago

@DzmitryHumianiuk Still getting this error although the launch with the mentioned ID(uuid) is created in the database: com.epam.ta.reportportal.exception.ReportPortalException: Launch '3f1b90e2-03a7-44d2-b227-0f2ecffbdd15' not found. Did you use correct Launch ID?

Please check screenshot:

image

So your analysis here https://github.com/reportportal/reportportal/discussions/1250#discussioncomment-8366908 seems to not be the case.

Could you please analyse this issue further? This is really blocking us and I need to resolve ASAP. Thanks!

DzmitryHumianiuk commented 7 months ago

@amalbaccar-hub So, in other words, you're using Jenkins pre-steps to create a launch in ReportPortal via an API call, take the received UUID, and share it with the other jobs or the current job, letting the subsequent executions report into the provided UUID of the launch? Did I get that right?

In this case, I'm starting to think about the processing speed. The thing is, creating a launch is a synchronous operation. And if you've received a response from the server with your UUID, it's possible that it hasn't yet been dropped or saved in the database. And the very first request to save a child element throws an error. Although, again, it seems to me that in such a case, the request for writing should go into a queue for retries.

@amalbaccar-hub, try waiting an additional couple of seconds after creating the launch before handing off the UUID to other processes.

I see from the picture that the object exists in the database. And I'm trying to understand why, at the moment of the create request by the child API, it still considers it not to be there.

amalbaccar-hub commented 6 months ago

@DzmitryHumianiuk No we're just using jenkins for new build generation and we are not using API call to integrate with ReportPortal; instead of that we are integrating RP agent inside our test framework.

@DzmitryHumianiuk I wanted to share my new findings/understanding of the issue. In fact I tried to redeploy with a lower RP version (5.6.3, before that I used version 5.7.4). And I got the same behavior but I noticed in the API service log that it's showing Rabbit messages and exchange with API and that FINISH_LAUNCH wasn't logged. However, seems that all child items was processed as I see Rabbit messages indicating FINISH_TEST for all items. Here's the API service log: api-service-log.txt

From test framework side I'm sure that finishRQ was sent to ReportPortal. you can see in the following screenshot from the stack trace that finishRQ was received by RP:

image

Actually we discussed this stacktrace before with @HardNorth and he noted that the issue is due to a limitation for multi-process launch functionality with WSL and he proposed this workaround: rp.client.join=true and set rp.client.join.mode=SOCKET but unfortunately it's not resolving the problem !

@DzmitryHumianiuk Please do you have any other assumptions/proposals how to fix that. As I said it's a blocker for our team and it makes ReportPortal unusable for us and if I don't fix it ASAP I think we will decide not to use it. Thank you in advance!

HardNorth commented 6 months ago

@amalbaccar-hub Unfortunately, if rp.client.join.mode=SOCKET doesn't work for you there is no options left on Agent side to fix that. It's known limitation of WSL, that it's impossible to obtain master lock on a file. That makes impossible to synchronise primary Launch UUID through file system. I don't really understand why socket option doesn't work for you, but I have too little details to make any investigation on that.

To workaround that you need to implement a special running script yourself. As the first step you need to create a Launch and save its UUID. Then pass it as environment variable RP_RERUN_OF or system property rp.rerun.of along with rp.rerun=true to your tests. See our documentation: https://github.com/reportportal/client-java?tab=readme-ov-file#common-parameters

Maybe I should think of writing an article describing the workaround in details.

amalbaccar-hub commented 6 months ago

@HardNorth Thank you for the workaround, I'll try it and let you know. However, I would appreciate if you write an article and share it with us :) Thanks in advance!

amalbaccar-hub commented 6 months ago

@HardNorth @DzmitryHumianiuk I wanted to share these news with you. In fact I tried to scale up API service by duplicating the value of RP_AMQP_QUEUES env var (it was set with default value 10 ) : - RP_AMQP_QUEUES=20 Here's my docker-compose file: docker-compose.zip

And this is what I get: image

As you can see from the screenshot, the first launch was super fast (1 minute). The second launch tooks 5 minutes and for the third attempt I got Wait for Launch start timed out exception. When I checked the DB, I noticed that the launch was not created (only the first two) and getting this exception in service api log : Launch '2ea7a1b3-cad9-4ca5-b965-bb0bdbb81ac4' not found. Did you use correct Launch ID?

image

I guess this is a performance problem and I probably need to scale up more my API service. Am I wrong? I'm using cloudformation stack for deployment and I can share template if needed.

DzmitryHumianiuk commented 6 months ago

@amalbaccar-hub ping me in the image

i'll open a trial account for you on our saas, to see if this is caused by the instance performance.

amalbaccar-hub commented 6 months ago

@DzmitryHumianiuk Okay sure I'll ping you, thank you so much :)

amalbaccar-hub commented 4 months ago

Sorry forgot to comment, issue was resolved for us. It's a configuration problem; both reporters (step & scenario) were used which causes the issue. So closing the issue. Thanks for the support.