vaadin / kubernetes-kit

Other
3 stars 3 forks source link

Vaadin Push fails during session deserialization #111

Open MichaelPluessErni opened 8 months ago

MichaelPluessErni commented 8 months ago

Error:

During session deserialization it can happen, that vaadin pushes fail and trigger a refresh.

Expectation:

Refreshs should not be triggered during session deserialization.

More details:

I suspect that the push process and the deserialization process run in parallel and can interfere.

vaadin_push_fails_during_deserialization_and_causes_refresh

Versions: Flow: 24.3.5 Vaadin: 24.3.4 Spring Boot: 3.2.2 Java: Oracle Corporation 17 OS: amd64 Windows 10 10.0 Browser: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 Live reload: Java unavailable (JRebel): Front end active

mcollovati commented 7 months ago

In this case, it appears that the Flow client, for some reason (page reload? Navigating away?), is closing the push connection (X-Atmosphere-Transport=close), not actually receiving updates, but what's weird is that the 405 response code. Currently, I couldn't find any Flow nor Atmosphere code that is sending a 405. Are there any relevant logs on the server related to the failing request?

MichaelPluessErni commented 7 months ago

@mcollovati Unfortunately, I could not find related messages in the logs.

mcollovati commented 7 months ago

Unfortunately, I'm unable to reproduce the PUSH 405 failure. However, I think the PUSH request failure is a consequence of the page reload, not the cause.

For further investigation, it would be good to have the trace of the previous UIDL request (request + response payloads) and, if Spring Security is configured for the application, the server logs with category org.springframework.security at debug level.

mcollovati commented 7 months ago

@MichaelPluessErni additional question: is this happening in production or development mode?

MichaelPluessErni commented 7 months ago

@mcollovati this happens in production mode.

I'll have a look whether I can procure a logging trace as described in your previous post.

mcollovati commented 6 months ago

Testing locally, I had a confirmation that the POST PUSH request is caused by a page reload, not the opposite. On page reload, Flow client closes the atmosphere connection, but it looks like that currently requests to /VAADIN/push are not considered as internal request and this may cause some Spring Security filter to check a request that should not be blocked.

MichaelPluessErni commented 6 months ago

I've been unable to replicate the 405 Error Code for some time. Instead I've observed different behaviour, that might be related:

When having the website open and restarting the pod, sometimes the website stays at it is (desired) and sometimes it reloads and goes back to the login screen (not desired).

In the case where it reloads, I've seen in the last "app/?v-r=uidl&v-uild=1" Request that two Set-Cookie headers were added. In most cases these headers are not present. Might this be related and cause a reload?

pod_restart_website_reload

In cases where there was no reload, an "app/?v-r..." request was not present.

I hope this helps / gives a clue.

mcollovati commented 6 months ago

It would be good to know the contents of the UIDL request. About the cookies, I think they are correct if the server changed. Could you please also check that the request causing the reload has the clusterKey⁣ cookie correctly set? Without it, the session cannot be restored.

MichaelPluessErni commented 6 months ago

The clusterKey is set correctly.

Contents of the UIDL request:

Request headers:

:authority:
XXXX-3154.XXXX
:method:
POST
:path:
/app/?v-r=uidl&v-uiId=1
:scheme:
https
Accept:
*/*
Accept-Encoding:
gzip, deflate, br, zstd
Accept-Language:
de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7
Content-Length:
342
Content-Type:
application/json; charset=UTF-8
Cookie:
XXXX-config-point-of-sale=2875; clusterKey=4f090fde-90f8-4526-b91b-acc3591a9501; ingress-nginx-LB-cookie=6dfbdf159c4685fa5a1a066096b6f7a3|e307bd8c560d84365a1cbd65ac4ddd3d; JSESSIONID=18CB36FCF481A01E31D721D9F00A003C
Origin:
https://XXXX-gke-XXXX-3154.XXXX
Sec-Ch-Ua:
"Google Chrome";v="123", "Not:A-Brand";v="8", "Chromium";v="123"
Sec-Ch-Ua-Mobile:
?0
Sec-Ch-Ua-Platform:
"Windows"
Sec-Fetch-Dest:
empty
Sec-Fetch-Mode:
cors
Sec-Fetch-Site:
same-origin
User-Agent:
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36

Payload:

{"csrfToken":"73ef5d1f-055e-40f3-8770-1357b60e08d7","rpc":[{"type":"event","node":84,"event":"click","data":{"event.shiftKey":false,"event.metaKey":false,"event.detail":1,"event.ctrlKey":false,"event.clientX":546,"event.clientY":819,"event.altKey":false,"event.button":0,"event.screenY":-534,"event.screenX":1526}}],"syncId":22,"clientId":22}

Interestingly, the reload seems to be triggered by a click on the website. It does not reload of its own beforehand.

mcollovati commented 6 months ago

Could you post also the content of the response?

MichaelPluessErni commented 6 months ago

Example of the response headers of one of these requests where a reload happened:

Cache-Control:
no-cache, no-store, must-revalidate
Content-Encoding:
gzip
Content-Security-Policy:
default-src 'self'; script-src 'self' 'unsafe-inline' 'unsafe-eval' blob:; connect-src 'self' wss://*.coop.ch https://sdk-api.scandit.com; img-src 'self' *.coop.ch data:; media-src 'self' *.coop.ch data:; style-src 'self' 'unsafe-inline'; font-src data:; manifest-src 'self' *.coop.ch;frame-ancestors 'none'; child-src blob:; worker-src 'self' blob:;
Content-Type:
application/json;charset=UTF-8
Date:
Wed, 17 Apr 2024 14:06:13 GMT
Expires:
0
Pragma:
no-cache
Referrer-Policy:
no-referrer
Set-Cookie:
ingress-nginx-LB-cookie=a3a7c471ec4fa61c0c868bf2a187a403|e307bd8c560d84365a1cbd65ac4ddd3d; Expires=Wed, 17-Apr-24 15:06:13 GMT; Max-Age=3600; Path=/; Secure; HttpOnly
Set-Cookie:
JSESSIONID=5816F213C529418F402BAFE9EC6749EE; Path=/; Secure; HttpOnly
Strict-Transport-Security:
max-age=15724800; includeSubDomains
Vary:
origin,access-control-request-method,access-control-request-headers,accept-encoding
X-Content-Type-Options:
nosniff
X-Frame-Options:
SAMEORIGIN

X-Xss-Protection:
0

(Edit: this is not quite correct) There does not seem to be a response body. Instead the browser debugger shows this message:

grafik

This seems to be expected Chrome behaviour: https://superuser.com/questions/1788537/why-do-chromes-developer-tools-fail-to-collect-data-consistently I'll try to retrieve an example where the data is available.

MichaelPluessErni commented 6 months ago

I was able to retrieve an example where the page was reloaded. Here the issue seemed to have been a session loss.

Situation:

grafik

UIDL request:

{
  "csrfToken":"f2d302ca-9e27-407b-af89-676c263f72ef",
  "rpc":[
    {"type":"mSync","node":2193,"feature":1,"property":"value","value":"une description erreur et le chose plus importante: c'est très impossible"},
    {"type":"event","node":2193,"event":"change","data":{}}
  ],
  "syncId":44,
  "clientId":38
}

UIDL response:

for(;;);[{"meta":{"sessionExpired":true}}]

This triggered a push request:

PUSH request headers:

POST /app/VAADIN/push?v-r=push&v-uiId=1&v-pushId=17d14dc8-b141-4757-923b-95ef41cc67d7&X-Atmosphere-Transport=close&X-Atmosphere-tracking-id=82ff298d-1cf4-4aa4-b66c-b3b88c03c142&X-Vaadin-LastSeenServerSyncId=44 HTTP/2
Host: xxxx-gke-xxxx-3154.xxxx-gkej2ee-test3.hs-gc.xxxx.ch
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:124.0) Gecko/20100101 Firefox/124.0
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Content-Type: text/plain;charset=UTF-8
Content-Length: 0
Origin: null
DNT: 1
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: no-cors
Sec-Fetch-Site: same-origin
Connection: keep-alive
Cookie: xxxx-config-point-of-sale=6018; tws_camp=%7B%22medium%22%3A%22organic%22%2C%22source%22%3A%22google%22%7D; JSESSIONID=92C331B874BC506B18FE39D58FBF143C; clusterKey=298cb9f4-c141-42bf-a435-a8c1ebf1d4a1; ingress-nginx-LB-cookie=1e41174373e4e3a366748c43964349aa|e307bd8c560d84365a1cbd65ac4ddd3d
TE: trailers

PUSH response headers:

HTTP/2 302 
date: Wed, 24 Apr 2024 11:29:10 GMT
content-length: 0
location: https://xxxx-gke-xxxx-3154.xxxx-gkej2ee-test3.hs-gc.xxxx.ch/app/login
vary: Origin
vary: Access-Control-Request-Method
vary: Access-Control-Request-Headers
x-content-type-options: nosniff
x-xss-protection: 0
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: 0
strict-transport-security: max-age=15724800; includeSubDomains
x-frame-options: SAMEORIGIN
content-security-policy: default-src 'self'; script-src 'self' 'unsafe-inline' 'unsafe-eval' blob:; connect-src 'self' wss://*.xxxx.ch https://sdk-api.scandit.com; img-src 'self' *.xxxx.ch data:; media-src 'self' *.xxxx.ch data:; style-src 'self' 'unsafe-inline'; font-src data:; manifest-src 'self' *.xxxx.ch;frame-ancestors 'none'; child-src blob:; worker-src 'self' blob:;
referrer-policy: no-referrer
X-Firefox-Spdy: h2

Logs: The serialisation to Redis seems to have worked:

2024-04-24 13:29:14.633
Marking serialization complete for 298cb9f4-c141-42bf-a435-a8c1ebf1d4a1

2024-04-24 13:29:14.633
Session 298cb9f4-c141-42bf-a435-a8c1ebf1d4a1 sent to Redis

2024-04-24 13:29:14.536
Sending session 298cb9f4-c141-42bf-a435-a8c1ebf1d4a1 to Redis

I hope this helps. Let me know if you need anything else.

mcollovati commented 6 months ago

Thanks, @MichaelPluessErni The sessionExpired message is what I expected to see, causing the client to close the push connection.

One thing to notice is that the response to the UIDL request has a set-cookie header for JSESSIONID. But unfortunately, the screenshot does not show all the cookies sent by the client. It would be great to check if the request has a JSESSIONID different from the one in the response.

I'll have to find a way to reproduce the issues constantly and debug the session expiration handlig.

mcollovati commented 4 months ago

This comment on a Flow issue may be related to the session expiration case: https://github.com/vaadin/flow/issues/12640#issuecomment-2151830880

mcollovati commented 14 hours ago

Re-opening because it is not confirmed that the referenced fix resolves this issue