lcm-proj / lcm

Lightweight Communications and Marshalling
GNU Lesser General Public License v2.1

Limited ability to rename fields without hash change #341

Closed jwnimmer-tri closed 2 years ago

jwnimmer-tri commented 3 years ago

I'm curious to get community input on a proposed lcm-gen feature to allow for renaming fields in the generated code, while leaving the message hash unchanged.

Imagine that I have a widely-used message like this:

struct foo {
  int64_t timestamp;  // Given in microseconds.
  ...
}

Instead of explaining the units only in a comment, it would be nice if I could use a more meaningful field name:

struct foo {
  int64_t microseconds;
  ...
}

However, if I make this change to the field name, the message hash will change. That means that two processes compiled with different revisions of the message will be unable to communicate, and tools built against the new revision will be unable to easily load old messages from log files.
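To make the problem concrete, here is a minimal sketch of a fingerprint that mixes field names into the hash. It is a toy stand-in, not lcm-gen's actual hash algorithm; `toy_fingerprint` and its input format are hypothetical names chosen for illustration. It only shows the property described above: same type, different name, different hash.

```python
import hashlib

def toy_fingerprint(fields):
    """Toy stand-in for a message hash (NOT lcm-gen's real algorithm).

    `fields` is a list of (type_name, field_name) pairs. Both the type
    and the name of every field are mixed into the digest.
    """
    h = hashlib.sha256()
    for type_name, field_name in fields:
        h.update(type_name.encode() + b"\0" + field_name.encode() + b"\0")
    return h.hexdigest()[:16]

# Same type, different field name: the fingerprints differ, so a decoder
# built from one revision would reject messages from the other.
old_hash = toy_fingerprint([("int64_t", "timestamp")])
new_hash = toy_fingerprint([("int64_t", "microseconds")])
hashes_differ = old_hash != new_hash
print(hashes_differ)
```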

Similarly, if I had a typo in the original name (e.g., timesatmp), I would like to correct it, but the same problems arise.

In many cases a changed hash is important and desirable -- if I changed the meaning of a field, I definitely want a new hash even if it's still an int64_t.

However, for cases where the meaning is the same but we merely want to improve the presentation of the message when using the generated message parsers, what if we offered a new syntax?

struct foo {
  int64_t microseconds __attribute__((hash_as(timestamp)));
  ...
}

The effect here would be that in generated code the field would be named "microseconds" but the message hash would reflect the name "timestamp" instead. That would allow interop with messages encoded using the old name, but always present the new name in the generated code.
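As a rough sketch of the intended semantics (again using a toy hash rather than lcm-gen's real one), each field could carry an optional hashing alias that replaces its name only when the hash is computed. The `Field` class, its `hash_as` attribute, and `toy_fingerprint` are all hypothetical and exist only for this illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

@dataclass
class Field:
    type_name: str
    name: str                      # name that appears in generated code
    hash_as: Optional[str] = None  # hypothetical alias used only for hashing

    @property
    def hashed_name(self) -> str:
        # Fall back to the real name when no alias was given.
        return self.hash_as if self.hash_as is not None else self.name

def toy_fingerprint(fields):
    """Toy stand-in for a message hash (NOT lcm-gen's real algorithm)."""
    h = hashlib.sha256()
    for f in fields:
        h.update(f.type_name.encode() + b"\0" + f.hashed_name.encode() + b"\0")
    return h.hexdigest()[:16]

original = [Field("int64_t", "timestamp")]
aliased = [Field("int64_t", "microseconds", hash_as="timestamp")]
plain_rename = [Field("int64_t", "microseconds")]

# The aliased rename keeps the old fingerprint; a plain rename does not.
alias_matches = toy_fingerprint(aliased) == toy_fingerprint(original)
plain_matches = toy_fingerprint(plain_rename) == toy_fingerprint(original)
print(alias_matches, plain_matches)
```

Generated code would still expose `aliased[0].name`, i.e. "microseconds"; only the hash input sees "timestamp".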

Does this sound like a helpful feature? Would you consider accepting a PR along these lines?

For the syntax, I took inspiration from GCC's attribute spelling. I could also go with the more modern C++11-style attribute spelling:

struct foo {
  int64_t microseconds [[hash_as(timestamp)]];
}