Fix: SSE Watcher should now close properly across multiple reloads

mdwagner commented 1 year ago

Purpose

Fixes #1744

Description

Instead of updating a flag to reload SSE Watchers, it stores each Request Context into a list and pops off each one when needing to reload.

Checklist

[x] - An issue already exists detailing the issue/or feature request that this PR fixes
[x] - All specs are formatted with crystal tool format spec src
[x] - Inline documentation has been added and/or updated
[ ] - Lucky builds on docker with ./script/setup
[ ] - All builds and specs pass on docker with ./script/test

robacarp commented 1 year ago

This seems like a good clean up. Thanks @mdwagner.

The connection pool still grows with each page load but it is cleaned up after a reload event. When navigating around, browsers don't send anything back to the server to notify that the connection is being terminated.

This was the case before, but what I'm seeing here is that about 50% of the time my browser reloads before the server is actually ready to accept requests, which results in a connection error. The cheap and easy solution to this is a sleep of a few hundred millis. The complete solution is probably something more complicated.

robacarp commented 1 year ago

I just ran across this stack trace:

web     | Unhandled exception in spawn: Error while flushing data to the client (HTTP::Server::ClientError)
web     |   from /Users/user/.asdf/installs/crystal/1.5.1/src/http/server/response.cr:299:9 in 'unbuffered_flush'
web     |   from /Users/user/.asdf/installs/crystal/1.5.1/src/io/buffered.cr:240:5 in 'flush'
web     |   from /Users/user/.asdf/installs/crystal/1.5.1/src/http/server/response.cr:118:7 in 'flush'
web     |   from lib/lucky/tasks/watch.cr:91:9 in 'reload'
web     |   from lib/lucky/tasks/watch.cr:235:7 in 'reload_watcher'
web     |   from lib/lucky/tasks/watch.cr:197:9 in 'reload_or_start_watcher'
web     |   from lib/lucky/tasks/watch.cr:183:9 in 'create_app_processes'
web     |   from lib/lucky/tasks/watch.cr:266:17 in 'start_all_processes'
web     |   from lib/lucky/tasks/watch.cr:170:11 in '->'
web     |   from /Users/user/.asdf/installs/crystal/1.5.1/src/fiber.cr:146:11 in 'run'
web     |   from /Users/user/.asdf/installs/crystal/1.5.1/src/fiber.cr:98:34 in '->'
web     | Caused by: Error writing to socket: Broken pipe (IO::Error)
web     |   from /Users/user/.asdf/installs/crystal/1.5.1/src/io/evented.cr:82:13 in 'unbuffered_write'
web     |   from /Users/user/.asdf/installs/crystal/1.5.1/src/io/buffered.cr:239:5 in 'flush'
web     |   from /Users/user/.asdf/installs/crystal/1.5.1/src/http/server/response.cr:296:9 in 'unbuffered_flush'
web     |   from /Users/user/.asdf/installs/crystal/1.5.1/src/io/buffered.cr:240:5 in 'flush'
web     |   from /Users/user/.asdf/installs/crystal/1.5.1/src/http/server/response.cr:118:7 in 'flush'
web     |   from lib/lucky/tasks/watch.cr:91:9 in 'reload'
web     |   from lib/lucky/tasks/watch.cr:235:7 in 'reload_watcher'
web     |   from lib/lucky/tasks/watch.cr:197:9 in 'reload_or_start_watcher'
web     |   from lib/lucky/tasks/watch.cr:183:9 in 'create_app_processes'
web     |   from lib/lucky/tasks/watch.cr:266:17 in 'start_all_processes'
web     |   from lib/lucky/tasks/watch.cr:170:11 in '->'
web     |   from /Users/user/.asdf/installs/crystal/1.5.1/src/fiber.cr:146:11 in 'run'
web     |   from /Users/user/.asdf/installs/crystal/1.5.1/src/fiber.cr:98:34 in '->'

mdwagner commented 1 year ago

I don't think there's much we can about the Broken Pipe error, unless we just swallow the error from the output.

From what I can gather, it's not really an error, more like an informational message about the connection being closed before we can write to it. The same thing can happen on a normal HTTP API request.

mdwagner commented 1 year ago

This seems like a good clean up. Thanks @mdwagner.

The connection pool still grows with each page load but it is cleaned up after a reload event. When navigating around, browsers don't send anything back to the server to notify that the connection is being terminated.

This was the case before, but what I'm seeing here is that about 50% of the time my browser reloads before the server is actually ready to accept requests, which results in a connection error. The cheap and easy solution to this is a sleep of a few hundred millis. The complete solution is probably something more complicated.

So, the only thing I can think of that we could try, is by setting a reconnection time (retry) on the SSE: https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#fields

I'm not sure what else we can do though.

matthewmcgarvey commented 1 year ago

@mdwagner I'm just checking in on this PR. Do you think this is in a "good enough" state? Just trying not to let this PR get too stale just because it doesn't make everything perfect

mdwagner commented 1 year ago

@matthewmcgarvey Yeah, I think it's good enough. If it continues to be a problem in the future, we could just remove SSE support, but for now I think it works.

matthewmcgarvey commented 1 year ago

Cool, feel free to make the draft ready to review @mdwagner

luckyframework / lucky