ArduPilot / pymavlink

python MAVLink interface and utilities
Other
493 stars 591 forks source link

message CRC checksum is wrong when a mavlink_message_t is converted to a sending buffer #237

Open WeifengY opened 5 years ago

WeifengY commented 5 years ago
issue: we may receive valid mavlink2 messages that contains "0" bytes
       in the message payload which are not generated by this library.
       for such a messages, we can succussfully convert it to a
       mavlink_message_t. we may decide to forward it to its next stop.
       when "mavlink_msg_to_send_buffer()" is called to convert the
       mavlink_message_t to another sending buffer, the "0"s in the
       playload are trimmed, but the message crc checksum is not updated
       accordingly. as a result, on the next receiving end, the buffer
       can no longer be assembled back to a mavlink_message_t.
cause: the message crc checksum doesn't match the payload content.
fix:   recaculate the crc checksum if message payload is changed.

patch is attached. recalculate_crc_if_payload_changes.zip

julianoes commented 5 years ago

This is the patch:

diff --git a/generator/C/include_v2.0/mavlink_helpers.h b/generator/C/include_v2.0/mavlink_helpers.h
index 19a76e0..cc9b385 100644
--- a/generator/C/include_v2.0/mavlink_helpers.h
+++ b/generator/C/include_v2.0/mavlink_helpers.h
@@ -15,6 +15,7 @@
 #ifdef MAVLINK_USE_CXX_NAMESPACE
 namespace mavlink {
 #endif
+MAVLINK_HELPER uint8_t mavlink_get_crc_extra(const mavlink_message_t *msg);

 /*
  * Internal function to give access to the channel status for each channel
@@ -447,8 +448,17 @@ MAVLINK_HELPER uint16_t mavlink_msg_to_send_buffer(uint8_t *buf, const mavlink_m
        ck = buf + header_len + 1 + (uint16_t)length;
        signature_len = (msg->incompat_flags & MAVLINK_IFLAG_SIGNED)?MAVLINK_SIGNATURE_BLOCK_LEN:0;
    }
-   ck[0] = (uint8_t)(msg->checksum & 0xFF);
-   ck[1] = (uint8_t)(msg->checksum >> 8);
+   if (length != msg->len) {
+       uint8_t crc_extra = mavlink_get_crc_extra(msg);
+       uint16_t checksum = crc_calculate((const uint8_t*)&buf[1], header_len);
+       crc_accumulate_buffer(&checksum, _MAV_PAYLOAD(msg), length);
+       crc_accumulate(crc_extra, &checksum);
+       ck[0] = (uint8_t)(checksum & 0xFF);
+       ck[1] = (uint8_t)(checksum >> 8);
+   } else {
+       ck[0] = (uint8_t)(msg->checksum & 0xFF);
+       ck[1] = (uint8_t)(msg->checksum >> 8);
+   }
    if (signature_len > 0) {
        memcpy(&ck[2], msg->signature, signature_len);
    }
julianoes commented 5 years ago

Something to check in the next mavlink call. @hamishwillee can you table this or add the tag please? Thx.

auturgy commented 5 years ago

Thanks Julian

Regards,

James

On 7 Nov 2018, at 7:31 pm, Julian Oes notifications@github.com wrote:

Something to check in the next mavlink call.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

hamishwillee commented 5 years ago

Perhaps better to regenerate this as a PR so that it can be merged if approved?

Possibly I'm missing something, but I don't understand why there is an expectation that the mavlink_msg_to_send_buffer() will recalculate the checksum. Also, why it would need to - just populate the mavlink_message_t with the checksum from the incoming message (it was calculated when originally sent for a message that has no zeros).

@julianoes What is it that you want to discuss about this? I have added to the 20181114 but is intention just getting review or determining whether this is even a good idea? Or to put it another way, why not follow normal PR process?

As an aside:

WeifengY commented 5 years ago
the use case is like this: A
V B -----> C

V D the connection diagram is like above. The key of the problem is that A is coded with some other Mavlink libraries, which don't trim the tailing zeros in the payload of its outcoming messages. But doubtlessly, the messages conform to Mavlink2 protocol. A sends a message to B, and B forwards the message to C or D based on the routing information (message id, target component id, etc) included in the message. On receiving the message, B converts the receiving buffer to a mavlink_message_t with function "mavlink_parse_char()". the crc checksum in this mavlink_message_t matches its payload (including the tailing zeros), so it's OK. now B decides to forward it to the next stop and calls "mavlink_msg_to_send_buffer()" to convert the mavlink_message_t to a sending buffer. As the mavlink_message_t contains tailing zeros, "mavlink_msg_to_send_buffer()" trims the zeros but doesn't change the crc checksum accordingly. B sends this sending buffer to C or D. As a result, the message received on C or D end fails to assemble the buffer back to a mavlink_message_t due to crc check failure.

hamishwillee commented 5 years ago

@WeifengY Thank you for the more detailed explanation. Some thoughts:

  1. I'm not sure a message can conform to MAVLink 2 if it doesn't strip the zero's (ie it is not "optional"). What systems are generating these messages?
  2. Messages are supposed to be forwarded unchanged. Unfortunately I don't see a mechanism for getting the original message as a buffer and pushing it out to another channel ....

Interested to hear what others think about this. It may be that we should make the code robust enough to handle this case even if it is "off spec" (and I'm not saying it definitely is!)

yaoyz-yuneec commented 5 years ago

@hamishwillee @julianoes Thanks very much for the help. It is meaningful for the transmission bandwidth to cut off 0 of the tail, but the corresponding CRC should also be synchronized. The case(buf->msg->buf) mentioned by @WeifengY will be often used by some complex systems and products.

hamishwillee commented 5 years ago

Hi @yaoyz-yuneec

I believe I do understand what you are saying. The incoming message comes in "untrimmed" and with a CRC calculated for the untrimmed string. You can't use the original CRC from the incoming message because mavlink_msg_to_send_buffer will trim the message before sending (so incoming CRC no longer matches).

My issue with this is not what is happening, but what the right response should be given that the incoming message is not compliant with the spec. After all, you're adding a CRC calculation to every single time this method is used - even though it should not be needed if the incoming message was correct.

@julianoes @auturgy Do you have an opinion?

WeifengY commented 5 years ago

@hamishwillee , We are not recalculating the CRC for every single call of mavlink_msg_to_send_buffer. recalculation only happens when there are some "0"s are trimmed. Please refer to @julianoes ' comment: https://github.com/ArduPilot/pymavlink/issues/237#issuecomment-436542656 or the patch file posted in this thread.

haiyangyuneec commented 5 years ago

There are two options:

  1. We define the spec to say with zero at end of message is INVALID In such case, a. The mavlink parse should NOT parse message or accept it, since message with zero at end of itself is INVALID. b. The trim in mavlink function should be removed, since we needn't such trim for INVALID message which should never be there. We are spending CPU for unnecessary cases.

  2. We define the spec to say with zero at end of message is VALID In such case, the trim logic makes sense to decrease the message payload, and the CRC recalculate trim is required.

In any case, trim a message without recalculating CRC is incorrect.

julianoes commented 5 years ago

@haiyangyuneec @WeifengY we discussed this in the mavlink coordination call trying to understand it and come to a solution.

We wondered again which implementation sends untrimmed mavlink 2? Is it one of the open-source ones?

We agreed that the spec is not clear on whether you have to trim the zeros or not, so it probably makes sense that messages with zeros get correctly parsed.

It was brought up in the call that you should be using the _finalize method to make sure the CRC of the message is set, however, this means that you need to calculate the CRC for all messages (trimmed or not) and you waste CPU.

I also found the function _mavlink_resend_uart (https://github.com/ArduPilot/pymavlink/blob/master/generator/C/include_v2.0/mavlink_helpers.h#L363-L411) which might work for re-sending it and does not trim the message.

My takeaway after reading all this and the notes at the meeting is that your change makes sense, or at least is a valid workaround for this edge case that exists because the spec is not clear.

I suggest that you make a pull request with the change, and hopefully we can get it reviewed.

WeifengY commented 5 years ago

Hi @julianoes , It's great to hear your update. I had a try to commit a PR to the pymavlink project but failed due to lack of write permission before I opening this issue. I'm glad to have another attempt if permission is granted.

julianoes commented 5 years ago

@WeifengY oh got it. You need to make a fork and make the pull request from there: https://help.github.com/articles/creating-a-pull-request-from-a-fork/

hamishwillee commented 5 years ago

@haiyangyuneec @WeifengY See ...

We wondered again which implementation sends untrimmed mavlink 2? Is it one of the open-source ones?

WeifengY commented 5 years ago

I'll find some people to get the answer of this question. It may take some time.

WeifengY commented 5 years ago

@julianoes . Well, your instruction helps. I made the pull request. https://github.com/ArduPilot/pymavlink/pull/241

WeifengY commented 5 years ago

Hi @julianoes, @hamishwillee , The implementation who sends untrimmed mavlink messages is not from an open source project.

hamishwillee commented 5 years ago

Hi @WeifengY

Thank you.

The implementation who sends untrimmed mavlink messages is not from an open source project.

Even though we are probably going to add support for handling this kind of message, it is non compliant. I have added a PR to make this clearer in the docs: https://github.com/mavlink/mavlink-devguide/pull/135

Can you forward this information to the implementation and let them know that they should change this?

haiyangyuneec commented 5 years ago

Again, if untrimmed message is invalid, please check these,

a. The mavlink parse should NOT parse message or accept it, since message with zero at end of itself is INVALID. b. The trim in mavlink function should be removed, since we needn't such trim for INVALID message which should never be there. We are spending CPU for unnecessary cases.

hamishwillee commented 5 years ago

@haiyangyuneec What https://github.com/mavlink/mavlink-devguide/pull/135 says is that it is invalid to send an untrimmed message, but not invalid to receive one.

The spec has always said that messages should be trimmed, but that was not forceful enough.

The spec mentioned did not mention the case of receiving untrimmed messages because we assumed there would be no one sending such messages. However now that we have a non compliant sender we have a choice to accept the messages or not. This update to the spec says that we will. If you disagree, then please add your reasons as a response to https://github.com/mavlink/mavlink-devguide/pull/135

haiyangyuneec commented 5 years ago

I am quite confused. Why accept invalid message and the parser can get invalid message out of the stream? Even accept the invalid message, the trim logic is definitely wrong without recalculating crc after trimming.

Everything is clear and it is up to you to fix such bug or not.

WeifengY commented 5 years ago

Hi @hamishwillee , Thanks a lot for accepting it even though the message may be not strictly comply with the spec. I can inform the message sender of the risk of the possible incompatibility with the spec. However, let's make one step backward. I assume there are lots of programmers around the world who are interested in playing with drones, some of them are professionals but many are merely amateurs. So you cannot force all of them not to send out "untrimmed" messages, if such messages can be well parsed and accepted. So the best thing the library can do is either to reject this kind of messages OR make them fully compatible. It's not a good idea to make it "partially" compatible, that is accepting the messages on the receiving end but failing to resending them on the forwarding port.

hamishwillee commented 5 years ago

Hi @WeifengY @haiyangyuneec

Note that neither the refinement to the spec or the PR have been accepted. We don't have a test case for this so even though allows this is easy for C we need to have a look at the other language bindings too. A test case is being written. If all the other language bindings fail, we may have to revisit this.

BUT, if it is possible then ...

Why accept invalid message and the parser can get invalid message out of the stream?

Because we wanted to make things easier for you, and because it is easy to do with our current implementation.

Even though we will effectively allow non-compliant messages to be sent by letting systems handle them, we can still go to vendors who are sending untrimmed zeros and say "you're not compliant". This means we can encourage the right behaviour without breaking you.

So the best thing the library can do is either to reject this kind of messages OR make them fully compatible. It's not a good idea to make it "partially" compatible, that is accepting the messages on the receiving end but failing to resending them on the forwarding port.

Just to be clear, the proposed refinement to the spec will require the non-compliant messages to be resent. I just changed https://github.com/mavlink/mavlink-devguide/pull/135 to make that clear.

But again, it might not happen if the effort to update the other implementations is too great. As you say, in some ways it is better to just reject the invalid messages and remove any confusion or doubt.

magicrub commented 4 years ago

any movement on this?

hamishwillee commented 4 years ago

@magicrub No. This is waiting on "someone" to write a test case for mavlink 2 untrimmed messages. FWIW IMO we'd have been better saying they are not allowed.

iottrends commented 1 year ago

Hi, I am also facing bad crc error, I am using a transparent datalink with 250 byte pkt limit... any fix that handles partial Mavlink messages... As stats show link quality as ~50% image