Closed wtulloch closed 7 months ago
This causes RabbitMQ to fail and exit. If the port number is changed to something like 5673 RabbitMQ will successfully start.
Why does it fail?
Oh you might be running into https://github.com/dotnet/aspire/issues/844. It's likely you're adding endpoints that already exist to a container and it's blowing up in the WithReference call. An exception and callstack would help pinpoint the issue.
So the following is what I am currently trying (note: I have tried multiple permutations, this is the simplest):
var testmq = builder.AddContainer("test-rabbit", "rabbitmq", "3-management")
    .WithAnnotation(new EndpointAnnotation(ProtocolType.Tcp, port: 5672, containerPort: 5672))
    .WithAnnotation(new EndpointAnnotation(ProtocolType.Tcp, port: 15673, containerPort: 15672, uriScheme: "http"))
    .WithEnvironment("RABBITMQ_DEFAULT_USER", "user")
    .WithEnvironment("RABBITMQ_DEFAULT_PASS", "yourguessisasgoodasmine");
After a few seconds RabbitMQ fails. There is a lot going on in the logs, but the following snippet seems relevant:
`{listen_error,
2024-02-28T14:53:06.0998950 2024-02-28 03:53:06.099408+00:00 [error] <0.593.0> {acceptor,{0,0,0,0,0,0,0,0},5672},
2024-02-28T14:53:06.0998965 2024-02-28 03:53:06.099408+00:00 [error] <0.593.0> eaddrinuse}}`
That code works for me. Maybe you have another rabbitmq listening on the same port?
Have triple-checked, no other instances running :-)
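One way to verify the "something else is already listening" theory from the host side is to attempt a TCP connection to the port before starting the AppHost. The following is a minimal Python sketch, not part of the original thread; the helper name `is_listening` is made up for illustration, and it only detects listeners reachable from the host, not conflicts inside the container itself.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 5672 is the AMQP port discussed in this thread.
    print("5672 in use:", is_listening("127.0.0.1", 5672))
```

If this reports the port as in use while no RabbitMQ instance is expected to be running, some other process (or a proxy) is holding it.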
What version of aspire are you using?
8.0 preview-3
Not sure what else it could be here.
Will keep investigating and will let you know if anything comes up
@davidfowl I have done some further work on this. RabbitMQ is fine but the Dapr sidecars are still failing.
I have created a sample project that demonstrates this which can be found here: https://github.com/wtulloch/Dapr-RabbitMq-Example
In your example you are adding another endpoint to the existing rabbitmq container instead of changing the port of the existing endpoint. AddRabbitMQContainer has a parameter which is the host port, did you try setting that?
Yes, tried that and dapr side-cars still failing (have updated the code to reflect that)
I'm confused. I meant this:
builder.AddRabbitMQContainer("messaging", password: "ThisIsAPassword", port: 5672);
Tried that, still no joy. Nor is there a host/container port mapping defined in Docker.
Try updating to preview 4. I have a hunch there are features that will make this scenario work better.
So I have tried preview 4. Below is an example (one of the many permutations I have tried).
var rabbit = builder.AddRabbitMQ("test-rabbit")
    .WithImageTag("3-management")
    .WithEndpoint(hostPort: 5672, isProxied: false)
    .WithEndpoint(scheme: "http", hostPort: 15672, isProxied: false)
    .WithEnvironment("RABBITMQ_DEFAULT_PASS", "ThisIsAPassword")
    .WithEnvironment("RABBITMQ_DEFAULT_USER", "user");
The good news is that the username and password seem to work and the RabbitMQ dashboard is accessible (port 15672).
The not-so-good news is that the Dapr sidecars are still failing.
So, first thought: when I view the endpoints for RabbitMQ I see the following:
The key thing to note is that localhost still has a custom port, and the pubsub configuration defines the hostname as localhost. I tried a number of different approaches to defining RabbitMQ, and in all cases the sidecars failed.
My second thought is that this may be a startup timing issue between RabbitMQ and Dapr.
I will keep playing around with it to see if anything jumps out.
Replace AddRabbitMQ with AddContainer.
Have done this:
builder.AddContainer("test-rabbit", "rabbitmq", "3-management")
    .WithEnvironment("RABBITMQ_DEFAULT_PASS", "ThisIsAPassword")
    .WithEnvironment("RABBITMQ_DEFAULT_USER", "user")
    .WithEndpoint(scheme: "tcp", hostPort: 5672, containerPort: 5672, isProxied: false)
    .WithEndpoint(scheme: "http", hostPort: 15672, containerPort: 15672, isProxied: false);
Port assignments work, and I can access the admin dashboard. Possibly the issue is with the hostname. The RabbitMQ container endpoints are 127.0.0.1:5672 and 127.0.0.1:15672, when I think we need localhost:5672, etc.
Is dapr still not working?
cc @karolz-ms
Short answer, no
Is the error from the dapr sidecar different?
To be honest I have been remiss in tracking them 😀. In the pubsub yaml there is a reconnectWait setting that I have tried, and one of the sidecars stayed up, but that seems to have been a one-off. Repeated tries have not replicated that. Am going to try a couple of other things.
For reference the error message is below:
WARNING: no application command found.
The daprd process exited with error code: exit status 1
Error exiting Dapr: exit status 1
Starting Dapr with id message-publish-service. HTTP Port: 60589. gRPC Port: 60588
time="2024-03-15T20:23:26.3827283+11:00" level=warning msg="mTLS is disabled. Skipping certificate request and tls validation" app_id=message-publish-service instance=TG-laptopStudio1 scope=dapr.runtime.security type=log ver=1.13.0
time="2024-03-15T20:23:26.399308+11:00" level=warning msg="The default value for 'spec.metric.http.increasedCardinality' will change to 'false' in Dapr 1.14" app_id=message-publish-service instance=TG-laptopStudio1 scope=dapr.runtime type=log ver=1.13.0
time="2024-03-15T20:23:26.8590733+11:00" level=error msg="Failed to init component pubsub (pubsub.rabbitmq/v1): [INIT_COMPONENT_FAILURE]: initialization error occurred for pubsub (pubsub.rabbitmq/v1): Exception (501) Reason: \"EOF\"" app_id=message-publish-service instance=TG-laptopStudio1 scope=dapr.runtime.processor type=log ver=1.13.0
time="2024-03-15T20:23:26.8593624+11:00" level=warning msg="Error processing component, daprd will exit gracefully" app_id=message-publish-service instance=TG-laptopStudio1 scope=dapr.runtime.processor type=log ver=1.13.0
time="2024-03-15T20:23:26.912957+11:00" level=fatal msg="Fatal error from runtime: process component pubsub error: [INIT_COMPONENT_FAILURE]: initialization error occurred for pubsub (pubsub.rabbitmq/v1): [INIT_COMPONENT_FAILURE]: initialization error occurred for pubsub (pubsub.rabbitmq/v1): Exception (501) Reason: \"EOF\"" app_id=message-publish-service instance=TG-laptopStudio1 scope=dapr.runtime type=log ver=1.13.0
Could not update sidecar metadata for cliPID: PUT http://127.0.0.1:60589/v1.0/metadata/cliPID giving up after 5 attempt(s): Put "http://127.0.0.1:60589/v1.0/metadata/cliPID": dial tcp 127.0.0.1:60589: connectex: No connection could be made because the target machine actively refused it.
You're up and running! Dapr logs will appear here.
terminated signal received: shutting down
Okay, a couple of interesting side-notes:
In the previous error, the Exception (501) Reason: \"EOF\" appears to be a RabbitMQ error, though checking the RabbitMQ logs I couldn't see anything that might be related to it.
This is the more interesting one: if I copy the dapr run command from the Aspire console and run it on the command line, it runs successfully.
C:\dapr\dapr.exe run --app-id message-receiver-service --log-level warn --resources-path C:\Users\w_tul\source\repos\Dapr-RabbitMq-Example\components\ --app-port 5179 --dapr-grpc-port 51393 --dapr-http-port 51394 --metrics-port 51395 --app-channel-address localhost
@dbreshears I think we should investigate this for p5. I have a hunch this has to do with the DCP proxy and ipv4 and ipv6. We've seen similar issues https://github.com/dotnet/aspire/issues/2855.
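To illustrate why an IPv4/IPv6 mismatch is a plausible suspect (this is only the hunch stated above, not a confirmed cause), the following Python sketch lists the addresses a hostname resolves to. On many machines `localhost` resolves to both `::1` and `127.0.0.1`, so a client connecting to `localhost` and a listener bound only to `127.0.0.1` may not meet. The helper name `resolve_addresses` is made up for illustration.

```python
import socket


def resolve_addresses(host: str, port: int) -> list:
    """Return the distinct IP addresses that host resolves to, in resolver order."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        # sockaddr is (ip, port) for IPv4 and (ip, port, flow, scope) for IPv6.
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


if __name__ == "__main__":
    # The result depends on the local resolver configuration.
    print(resolve_addresses("localhost", 5672))
```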
@wtulloch could you share a repo/solution that demonstrates the problem? That would be super helpful.
@karolz-ms Sure, https://github.com/wtulloch/Dapr-RabbitMq-Example
Moving to P6
I poked at this a bit; near as I can tell, the issue is down to the fact that the dapr RabbitMQ PubSub component doesn't support retry during dapr init. If the RabbitMQ server isn't already up and running by the time we run the dapr sidecar for an app, it'll error out. To make this work, it seems as though the RabbitMQ dapr component would need to be updated to support retry during initialization in order to connect once the PubSub instance becomes available during Aspire startup.
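As a rough illustration of the missing retry, the Python sketch below polls a broker until it answers the AMQP 0-9-1 protocol header with a method frame, which is the point at which a client could complete its handshake. It is a sketch under stated assumptions, not Dapr's or Aspire's implementation: the function names `amqp_ready` and `wait_for_amqp` are hypothetical, and it assumes the broker speaks AMQP 0-9-1 on the given port. A plain TCP connect check may not be enough here, because the `Exception (501) Reason: "EOF"` error is consistent with a connection that is accepted and then closed before the handshake.

```python
import socket
import time

# AMQP 0-9-1 protocol header sent by a client immediately after connecting.
AMQP_HEADER = b"AMQP\x00\x00\x09\x01"


def amqp_ready(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if the peer answers the AMQP header with a method frame."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(AMQP_HEADER)
            first = conn.recv(1)
    except OSError:
        # Connection refused, reset, or timed out: broker not ready.
        return False
    # Frame type 1 is METHOD (the broker's Connection.Start).
    # An empty read means the peer closed the connection (the EOF case).
    return first == b"\x01"


def wait_for_amqp(host: str, port: int, attempts: int = 30, delay: float = 1.0) -> bool:
    """Poll amqp_ready up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if amqp_ready(host, port):
            return True
        time.sleep(delay)
    return False


if __name__ == "__main__":
    print("broker ready:", wait_for_amqp("127.0.0.1", 5672, attempts=5))
```

A wrapper like this could gate the start of a sidecar on the broker being ready, which is roughly what running RabbitMQ separately ahead of time achieves by hand.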
@danegsta That is something that I considered. In the Dapr pubsub.yaml for RabbitMQ there is a reconnectWait which I tried but it didn't seem to make a difference.
I have the same problem. Just for testing, I created another AppHost2 in the same solution and moved RabbitMQ from AppHost1 to AppHost2. Then everything works fine (AppHost2 is started first, and once RabbitMQ is ready I run AppHost1). But if I run RabbitMQ from the same AppHost1 where my main projects are, the Dapr sidecars fail: "Fatal error from runtime: process component pubsub error: [INIT_COMPONENT_FAILURE]: initialization error occurred for pubsub (pubsub.rabbitmq/v1): [INIT_COMPONENT_FAILURE]: initialization error occurred for pubsub (pubsub.rabbitmq/v1): Exception (501) Reason: \"EOF\""
I'd suggest opening an issue on the Dapr components issue tracker to request adding retry/resiliency to the initial RabbitMQ PubSub connection on component init.
Closing this as external.
Currently I have a project using Dapr pub/sub and RabbitMQ as the message provider, with Aspire being used to spin up the various services. When the RabbitMQ container is run separately, this all works fine.
When trying to run RabbitMQ from the Aspire project things don't go so well. The crux of the issue seems to be assigning a host port that is the same as the container port.
For example, this is the annotation that I have tried using:
.WithAnnotation(new EndpointAnnotation(ProtocolType.Tcp, port:5672, containerPort: 5672))
This causes RabbitMQ to fail and exit. If the port number is changed to something like 5673, RabbitMQ will successfully start.
So why would we want to use the same port number for host and container? With RabbitMQ the convention is to expose the container port on the same host port number. For example, if you run the Docker container, the command would look like this:
docker run --hostname test-rabbit --name test-rabbit -p 5672:5672 -p 15672:15672 -e RABBITMQ_DEFAULT_USER=user -e RABBITMQ_DEFAULT_PASS=haveaguess rabbitmq:3-management
This convention is also assumed by Dapr.
Now I may have gotten the wrong end of the stick and may have missed something, but I am curious as to why this isn't possible.