Closed by guillego 1 year ago
Does the server you're testing against use any websocket extensions, for example per-message-deflate? (Maybe something changed in :zlib?) I haven't been able to reproduce this yet against the slipstream testsuite running on OTP26
Thanks for taking a look at it! I was testing against a fresh phoenix project with just a generated socket and channel, this is the Mix file:
defmodule ChannelServer.MixProject do
  use Mix.Project

  def project do
    [
      app: :channel_server,
      version: "0.1.0",
      elixir: "~> 1.14",
      elixirc_paths: elixirc_paths(Mix.env()),
      start_permanent: Mix.env() == :prod,
      aliases: aliases(),
      deps: deps()
    ]
  end

  # Configuration for the OTP application.
  #
  # Type `mix help compile.app` for more information.
  def application do
    [
      mod: {ChannelServer.Application, []},
      extra_applications: [:logger, :runtime_tools]
    ]
  end

  # Specifies which paths to compile per environment.
  defp elixirc_paths(:test), do: ["lib", "test/support"]
  defp elixirc_paths(_), do: ["lib"]

  # Specifies your project dependencies.
  #
  # Type `mix help deps` for examples and options.
  defp deps do
    [
      {:phoenix, "~> 1.7.6"},
      {:phoenix_html, "~> 3.3"},
      {:phoenix_live_reload, "~> 1.2", only: :dev},
      {:phoenix_live_view, "~> 0.19.0"},
      {:floki, ">= 0.30.0", only: :test},
      {:phoenix_live_dashboard, "~> 0.8.0"},
      {:esbuild, "~> 0.7", runtime: Mix.env() == :dev},
      {:tailwind, "~> 0.2.0", runtime: Mix.env() == :dev},
      {:swoosh, "~> 1.3"},
      {:finch, "~> 0.13"},
      {:telemetry_metrics, "~> 0.6"},
      {:telemetry_poller, "~> 1.0"},
      {:gettext, "~> 0.20"},
      {:jason, "~> 1.2"},
      {:plug_cowboy, "~> 2.5"}
    ]
  end

  # Aliases are shortcuts or tasks specific to the current project.
  # For example, to install project dependencies and perform other setup tasks, run:
  #
  #     $ mix setup
  #
  # See the documentation for `Mix` for more info on aliases.
  defp aliases do
    [
      setup: ["deps.get", "assets.setup", "assets.build"],
      "assets.setup": ["tailwind.install --if-missing", "esbuild.install --if-missing"],
      "assets.build": ["tailwind default", "esbuild default"],
      "assets.deploy": ["tailwind default --minify", "esbuild default --minify", "phx.digest"]
    ]
  end
end
And .tool-versions:
erlang 25.3.2.2
elixir 1.15.0-otp-25
Hi @the-mikedavis! I recorded a little terminal session to demo the problem: https://asciinema.org/a/XoY10nHaTYAI5548QyMrRgRJC
You can also find the phoenix test server that I'm using in this repo.
Let me know if there's anything else I can help with to try to find the issue!
Hmm strangely I can't reproduce with https://github.com/guillego/slipstream_client and https://github.com/guillego/phoenix_channels_server with
erlang 26.0.1
elixir 1.15.0-otp-26
and the same steps as the asciicast. I'll tinker with the projects some more but could you make a packet capture with something like wireshark? If you have wireshark installed you can use tshark -i lo -w capture.pcap
That's so strange! :/ I tried reinstalling erlang/otp 26 but got the same error. Here is the network capture
The error message is kind of breaking my brain actually 😅
That malformed_reserved can only come from this block: https://github.com/elixir-mint/mint_web_socket/blob/f44a35ef7883008d435091c96e06a41d60849601/lib/mint/web_socket/frame.ex#L378-L384 which is guarded on reserved != <<0::size(3)>> but the error message says:
Slipstream.Connection.Impl.decode_message({:error, {:malformed_reserved, <<0::size(3)>>}}, ..)
And looking at the packet capture, the reserved is definitely <<0::size(3)>>
I wonder if there's some bug in OTP for bitstring matching in guards? I know they added a few new bitstring matching instructions in OTP26 and maybe the JIT definitions for those instructions have a bug? Are you running on a machine with an x86 CPU or ARM (for example the Macs with the Apple CPUs)? I'm on x86 but I can try reproducing this on ARM.
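For anyone following along, here is a minimal sketch of the kind of check being discussed: a function clause that matches the three reserved bits of a WebSocket frame header and is guarded on them being non-zero. The module and function names (`ReservedBitsSketch`, `check_reserved/1`) are hypothetical and this is not the mint_web_socket source, only an illustration of the pattern. It is also not guaranteed to trigger the suspected OTP bug, since that may depend on how the surrounding code gets compiled.

```elixir
defmodule ReservedBitsSketch do
  # Hypothetical helper, NOT the mint_web_socket implementation.
  # The first byte of a WebSocket frame is: 1 FIN bit, 3 reserved bits,
  # and a 4-bit opcode. Without extensions the reserved bits must be zero.

  # This clause should only match when the reserved bits are non-zero.
  def check_reserved(
        <<_fin::size(1), reserved::bitstring-size(3), _opcode::size(4), _rest::bitstring>>
      )
      when reserved != <<0::size(3)>> do
    {:error, {:malformed_reserved, reserved}}
  end

  # Reserved bits are all zero: the header is acceptable.
  def check_reserved(
        <<_fin::size(1), _reserved::bitstring-size(3), _opcode::size(4), _rest::bitstring>>
      ) do
    :ok
  end
end
```

On a correctly behaving runtime, `ReservedBitsSketch.check_reserved(<<0x81, 0x00>>)` should take the second clause, because 0x81 is FIN=1, reserved=000, opcode=1. An error tuple that carries `<<0::size(3)>>` would mean the guard passed for a value it explicitly excludes, which is the contradiction in the error message above.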
Ah ok I can reproduce this on a macbook! I'll dig into this a little within OTP and try to fix it or at least make a nice report there.
Amazing! Should have mentioned I'm on Apple silicon
Thanks for investigating and opening the issue in otp @the-mikedavis! Looks like they fixed it 🙌 Do you know how often they do patch releases?
Hmm I'm not sure. I think it will be sooner rather than later since there are a bunch of new fixes on the maint branch. My experience has been a new patch release every few weeks after a major release to iron out bugs like this.
Perfect! Thanks a lot 😄
I tried out the new 26.0.2 patch release and I no longer see the crash :tada: https://github.com/erlang/otp/releases/tag/OTP-26.0.2
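In case it helps anyone confirm which runtime they are actually on, here is a rough sketch that reports the OTP patch version, CPU architecture, and whether the JIT is in use. `RuntimeInfoSketch` is a hypothetical name. It assumes OTP 24 or later (for the `:emu_flavor` item) and that the installation ships an `OTP_VERSION` file under `releases/`; if that file is missing it falls back to the major release number.

```elixir
defmodule RuntimeInfoSketch do
  # Hypothetical helper for reporting the runtime details discussed in this thread.
  def summary do
    # Major release only, e.g. "26".
    release = :erlang.system_info(:otp_release) |> List.to_string()

    # The full patch version (e.g. "26.0.2") is normally stored in this file.
    root = :code.root_dir() |> List.to_string()
    version_file = Path.join([root, "releases", release, "OTP_VERSION"])

    otp_version =
      case File.read(version_file) do
        {:ok, contents} -> String.trim(contents)
        # Assumption: fall back to the major release if the file is absent.
        {:error, _reason} -> release
      end

    %{
      otp_release: release,
      otp_version: otp_version,
      # The target triple, which indicates x86_64 vs aarch64.
      architecture: :erlang.system_info(:system_architecture) |> List.to_string(),
      # :jit or :emu (requires OTP 24+).
      emu_flavor: :erlang.system_info(:emu_flavor)
    }
  end
end
```

Running `RuntimeInfoSketch.summary()` in `iex` before and after upgrading should show whether the patch release was really picked up, for example when a version manager still points at an older install.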
It works perfectly! 🎆 Thanks a lot @the-mikedavis this was suuper helpful! 💛
I have a small toy application to test connectivity from a Slipstream client to a Phoenix channel.
(Using Slipstream 1.1.0)
The app works perfectly in OTP 25:
However, if I switch my environment to OTP 26, I get this crashing error that seems to happen when Slipstream attempts to decode a tcp message:
Not sure if this is a known issue from OTP 26 or if there is some extra configuration that I'm missing. Any pointers?