It looks like it is a string or similar that is too large for the encoding used. The value complained about looks like it is the size, so e.g. a str8 or perhaps a vbin8 that is too large. However there are alternate encodings for string and binary types that I would expect to get used. Are you explicitly wrapping any values in your message with a particular type?
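For context, the str8/vbin8 encodings store the payload size in a single unsigned byte, so any size above 255 cannot be represented there. The reported out-of-range error matches what Node itself raises when asked to write such a value (a minimal illustration only, not rhea's actual encoding code):

// Illustrative only: a one-byte length field cannot hold a size above 255.
const buf = Buffer.alloc(300);
buf.writeUInt8(200, 0);   // fine: 200 fits in one byte
buf.writeUInt8(281, 0);   // throws RangeError: The value of "value" is out of
                          // range. It must be >= 0 and <= 255. Received 281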
Are you able to create a simple reproducer? Or perhaps get some protocol traces from a run of the application to see if there is any trend in the data?
I noticed we just send the entire body as a plain JavaScript object. As a workaround I have tried sending the body as a JSON-encoded string to see if that fixes it.
Any update on this?
The problem seems to be solved by serializing the body as JSON before placing a message on a queue (and deserializing it after).
I would recommend adding a notice/warning to the documentation recommending that body content be serialized before it is placed on the bus, or at least describing the limitations of the AMQP serialization formats. (Evidently we did not realize the limitations of the format when reading the documentation of either amqp10 or rhea.) For example, with larger messages, serialization of the body is better done with a dedicated message format like messagepack/JSON/... (see the sketch below).
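A minimal sketch of that workaround, assuming an already-open rhea sender/receiver pair (the names and payload here are illustrative):

// Sender side: serialize the payload to a JSON string instead of handing
// rhea a raw JavaScript object as the body.
const payload = { orderId: 42, items: ["a", "b"] };
sender.send({ body: JSON.stringify(payload) });

// Receiver side: parse the body back into an object.
receiver.on("message", context => {
    const body = JSON.parse(context.message.body);
    console.log(body.orderId);
});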
If you would like, I could formulate something for the readme, but someone that actually implemented the encoding/formatting code would be better suited for this.
This ticket is solved for us now, thank you for your help and attention!
Thanks for the update. Serialising javascript objects should not cause this error, so potentially there is still a bug. I can't see anything obvious in going through the code though.
We're getting this issue as well.
I think the number is the link_credit on the frame object passed to the output method in session.js.
For example, we get "The value of "value" is out of range. It must be >= 0 and <= 255. Received -3" with a link_credit of -3. The frame:
delivery_count:951
drain:undefined
echo:undefined
handle:0
incoming_window:1866
link_credit:-3
next_incoming_id:951
next_outgoing_id:0
outgoing_window:4294967295
properties:undefined
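For what it's worth, that quoted error is the RangeError Node raises when asked to write a value outside 0–255 into a single unsigned byte, which fits the theory that the negative link_credit is being written into the frame (illustrative only, not rhea's frame-encoding code):

Buffer.alloc(1).writeUInt8(-3, 0);
// RangeError: The value of "value" is out of range.
// It must be >= 0 and <= 255. Received -3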
@MattOakley this is for a receiving link? (i.e. first thing created on the session was a receiver not a sender). It is certainly a bug for the link credit to go negative. Do you have a simple test or reproducer to isolate this? Is it possible you ever call receiver.add_credit()/flow() with a negative number?
@MattOakley I pushed a check in the one place I can see where the credit could in theory go negative (https://github.com/amqp/rhea/commit/230e71767e25991f02cb564bf656af1faa8a48d5); does that impact your use case?
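The kind of guard being described would look roughly like this (a hypothetical sketch, not the actual code from that commit):

// Only decrement the link credit for an incoming transfer if there is
// credit left; otherwise log that the peer sent an unsolicited transfer.
function onTransfer(link: { credit: number }) {
    if (link.credit > 0) {
        link.credit--;
    } else {
        console.warn("Received transfer when credit was 0");
    }
}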
That does impact our use case, I'll run a test with it tomorrow.
Although I can't see why that method would be hit when the credit is zero. Wouldn't this mess up the number of in-flight messages (or worse)?
In answer to your earlier question:
It is a receiving link and I don't have a simple script to reproduce it to hand; I could hopefully make one on Monday if it helps.
We have a relatively long-running on-message handler (approx. 1 min) and the issue happens after a minute or so of consistent message flow. The credit window is 0, and when the message handler is finished we call accept() and add_credit(1). This is the only place we call this.
I agree, I don't see why the credit is zero when a transfer is received unless the credit is reduced by the application (or there is some other aspect I have not yet understood).
If you are able to create a reproducer of some kind that would really help tracking it down.
We are sending to queues on sender links as well, not sure if that could be important?
Below is a reproducer script:
The change (https://github.com/amqp/rhea/commit/230e71767e25991f02cb564bf656af1faa8a48d5) fixes the connection error being thrown, but now writes a lot of "Received transfer when credit was 0" messages.
Also there's a typo which makes the logs print out NaN.
import * as container from "rhea";

const queueName = "rheaTest";

let senderConnection = container.connect({ port: 8091, host: "localhost" });
let receiverConnection = container.connect({ port: 8091, host: "localhost" });

senderConnection.on("error", async e => {
    console.log("Sender Error: " + e);
    debugger;
});
receiverConnection.on("error", async e => {
    console.log("Receiver Error: " + e);
    debugger;
});
receiverConnection.on("receiver_open", (context) => {
    context.receiver.flow(100);
});

let sender = senderConnection.open_sender({
    name: queueName,
    snd_settle_mode: 0,
    rcv_settle_mode: 0,
    target: {
        durable: 2,
        address: queueName
    }
});

container["once"]("sendable", (context: container.EventContext) => {
    let message = Array.from({ length: 250 }, (x, i) => i).join("");
    for (let i = 0; i < 1000; i++) {
        let x = sender.send({ body: message, application_properties: [], durable: true, correlation_id: i });
    }

    let receiver = receiverConnection.open_receiver({
        name: queueName,
        autoaccept: false,
        rcv_settle_mode: 0,
        snd_settle_mode: 0,
        // Issue flow manually for async message handler.
        credit_window: 0,
        source: {
            durable: 2,
            address: queueName
        }
    });

    receiver.on("message", async (context: container.EventContext) => {
        console.log("received: " + context.message.correlation_id);
        let seconds = Math.floor(Math.random() * 6 + 25);
        await waitSeconds(seconds);
        console.log("resolved: " + context.message.correlation_id);
        context.delivery.accept();
        receiver.add_credit(1);
    });

    async function waitSeconds(seconds: number): Promise<void> {
        return new Promise<void>(resolve => {
            setTimeout(() => {
                resolve();
            }, seconds * 1000);
        });
    }
});
Running Node 10 on Windows, using RabbitMQ with AMQP 1.0.
The message "Received transfer when credit was 0" was printed out 49 times for 1000 messages.
Thanks! I can reproduce with that and it looks like a bug in the rabbitmq 1.0 support (I'm using v3.6.16 of the 1.0 plugin in my test). It is not honouring credit at all. If a receiver link is attached and no credit is issued, the broker still sends messages. I've verified that with an entirely different client.
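For reference, that behaviour can be checked with something like the following (a minimal sketch using rhea for illustration; the verification above was done with a different client, and the port and queue name here are assumptions):

// Open a receiver with credit_window: 0 and never issue credit. A broker
// that honours flow control should not deliver anything on this link.
import * as rhea from "rhea";

const connection = rhea.connect({ port: 5672, host: "localhost" });
const receiver = connection.open_receiver({
    credit_window: 0, // disable automatic credit replenishment
    source: { address: "rheaTest" }
});
receiver.on("message", context => {
    // Any message arriving here was sent without credit being granted.
    console.log("unexpected transfer: " + context.message.correlation_id);
});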
Fyi I've raised the following issue: https://github.com/rabbitmq/rabbitmq-amqp1.0/issues/67
The above issue is apparently fixed from 3.7 onwards; I upgraded and indeed the simple case is gone. However I can still reproduce the problem, and it does still look to me like an error from the server where it sends transfers for which credit has not been issued, just more subtle.
I've confirmed with a wireshark trace that the rabbitmq server will occasionally send more transfers than it is entitled to by the granted credit. The issue is already reported: https://github.com/rabbitmq/rabbitmq-amqp1.0/issues/50
@MattOakley I have fixed the typo in the error message. The current code ensures that the credit counter doesn't go negative when the server violates the protocol by sending more transfers than credits granted.
We could close the receiver with an error since the server is violating the protocol. If that would be preferred, let's raise a separate issue for that.
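If we did go that route, it would presumably look something like this (a hypothetical sketch; "amqp:link:transfer-limit-exceeded" is the link error the AMQP 1.0 spec defines for this situation, but whether rhea's close() accepts an error in exactly this form is an assumption):

// Hypothetical: close an already-open rhea receiving link with an error
// condition when an unsolicited transfer arrives, instead of just logging it.
receiver.close({
    condition: "amqp:link:transfer-limit-exceeded",
    description: "received transfer when credit was 0"
});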
Ok, thanks for your help
Background
We implement a request/response communication pattern via AMQP and use Qpid (the C++ version) as the AMQP broker. We ported our applications from amqp10 to rhea because, when we moved from node 8 to node 10, we started getting buffer write errors (very similar to the one I'm about to show you).
We create our receiver queues via the same connection using the QMF2 framework (which is probably not relevant, because the error only occurs on sender connections, for which the queue is created by other services).
Description
The errors we have encountered with rhea only start showing up after the application has run for a while (in this case a day on a low-traffic test server). These errors start with the number 256 and count up from there (281 in the example). We have only observed this error when we are sending to an already existing queue.
Example
Further observations
Because the number always counts up, we think that the number being written might be a counter of some sort. We use a correlationId consisting of a string (a ULID identifier), so it can't be that. All the other information is non-counting; the only other number we send ourselves is the userId, which is always the same in this environment, and is always observed to be in the tens, not in the hundreds (and counting up).
If you need any further information, please tell us.