open-telemetry / semantic-conventions

Defines standards for generating consistent, accessible telemetry across a variety of domains
Apache License 2.0
262 stars 167 forks source link

codegen question: how to handle enum values as constants #1064

Open dyladan opened 5 months ago

dyladan commented 5 months ago

Area(s)

No response

Is your change request related to a problem? Please describe.

In JS we generate our code as constants. In order to be friendly to bundlers and minifiers, we avoid using object or enums; everything is a plain constant with a primitive, non-object, type value. Currently, we are generating our enums like this:

export const ATTR_LOG_IOSTREAM = 'log.iostream';
export const LOGIOSTREAM_STDOUT = 'stdout';
export const LOGIOSTREAM_STDERR = 'stderr';

For enum values, the attribute name is squished together into a single all-caps word in order to avoid possible collisions with other attribute names. There is currently a PR open to deprecate that in favor of using normal all-caps snake case and infixing _VALUES_ before the value name like this:

export const LOG_IOSTREAM_VALUES_STDOUT = 'stdout';

During review of that change, the issue was raised that this might cause collisions if an attribute like log.iostream.values.stdout was ever registered (note: we know this specific case is not likely, but the general problem is possible).

Describe the solution you'd like

Please provide guidance as to how we should move forward with this. It seems there are a couple ways we could move forward:

  1. keep the squished names. They're not very ergonomic or readable, but at least some people think this is less likely to cause collisions.
  2. Move forward with _VALUES_ infix option above. Some people think it is sufficiently unlikely to cause collisions (I am in this group).
  3. Something else? Maybe prefix value names with ENUM_ or similar?

Describe alternatives you've considered

No response

Additional context

No response

dyladan commented 5 months ago

@MSNev CC

joaopgrassi commented 5 months ago

I think VALUES might be a bit dangerous, I think we don't have any today but I wouldn't be surprised if it comes up. ENUM sounds better I think and, if it is appended at the end, I can think of two nice things we can get from it:

dyladan commented 5 months ago

So you're suggesting LOG_IOSTREAM_ENUM_STDOUT?

joaopgrassi commented 5 months ago

Almost, I'm suggesting LOG_IOSTREAM_STDOUT_ENUM, because then we can easily add another CI check to disallow any attributes ending with enum, to reserve that word.

dyladan commented 5 months ago

STDOUT is the value though. I really want to have some separation between the enum name and value

lmolkova commented 5 months ago

We don't allow to register an attribute that uses another attribute name as a prefix. I.e. if there is an attribute called log.iostream, it's not allowed to create attribute log.iostream.anything_else.

So it should be safe to create a constant like LOG_IOSTREAM_STDOUT or LOG_IOSTREAM_VALUES_STDOUT or any other that starts with LOG_IOSTREAM_.

Am I missing something?

UPDATE: oops, we don't have a check for it - #1068, will fix

joaopgrassi commented 5 months ago

@lmolkova I also thought this, but the problem here is that in JS, for attributes that are enums they generate constants for both the attribute name + the enum values.

So the attribute log.iostream becomes a constant like

export const ATTR_LOG_IOSTREAM = 'log.iostream';

and its values have the format of attributename_enumvalue:

export const LOGIOSTREAM_STDOUT = 'stdout';
export const LOGIOSTREAM_STDERR = 'stderr';

And it's not great because it's all together and hard to read. So they thought about adding _VALUES_ in between the enum value const name to improve it. But that then can be itself a problem, if an attribute log.iostream.values.stdout is added in semconv.

Just thinking now, @dyladan what about generate the constant for the enum values as ATTR_LOG_IOSTREAM_STDOUT? 🤔

lmolkova commented 5 months ago

@joaopgrassi adding log.iostream.values.stdout is not allowed per current attribute naming conventions because there is already a log.iostream.

https://github.com/open-telemetry/semantic-conventions/blob/e29907969685fb784c653e3e083da21f3eba644b/docs/general/attribute-naming.md?plain=1#L52-L58

So there is no collision issue, at least I don't see it.

joaopgrassi commented 5 months ago

@lmolkova it's not about the attribute in our yaml models, but how they will be mapped to JS constants when passing them in codegen.

For example, if JS adopts the solution of adding _VALUES_ to enum attribute values constant names, they will be named like LOG_IOSTREAM_VALUES_STDOUT in code.

Then say, we decided to add a log.iostream.values.stdout in our model. (I know the example is not great but still it can happen). When generating the codegen again, they will run in conflicts, because there will the same JS constant name for both things - LOG_IOSTREAM_VALUES_STDOUT for the new attribute name and LOG_IOSTREAM_VALUES_STDOUT for the artificially-generated const name for the enum value of log.iostream.

dyladan commented 5 months ago

I think @lmolkova is saying log.iostream.values.stdout would not be allowed because the existence of log.iostream would break the rule "Names SHOULD NOT coincide with namespaces." In the example log.iostream is both a name and a namespace, which is not allowed.

dyladan commented 5 months ago

The only risk of collision here would be if an attribute was renamed to cause a collision similar to https://github.com/open-telemetry/semantic-conventions/issues/1031, but that can happen no matter what the rules are.

joaopgrassi commented 4 months ago

I think @lmolkova is saying log.iostream.values.stdout would not be allowed because the existence of log.iostream would break the rule "Names SHOULD NOT coincide with namespaces." In the example log.iostream is both a name and a namespace, which is not allowed.

Ah right, I got it now, I somehow thought it was just a namespace and not an actual attribute. I guess for this case then we should be fine with _VALUES_ then 👍 .

The only risk of collision here would be if an attribute was renamed to cause a collision similar to https://github.com/open-telemetry/semantic-conventions/issues/1031, but that can happen no matter what the rules are.

Yep :/

MSNev commented 4 months ago

@joaopgrassi

Just thinking now, @dyladan what about generate the constant for the enum values as ATTR_LOG_IOSTREAM_STDOUT? 🤔

Because of package size issues in JavaScript we explicitly moved away from any form of namespace, so that any code that only references a single value only needs to include that single value in any generated packages.