dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Proposal: Add a codegen to update public enum CborTag with IANA CBOR Tags Registry #92567

Open mofosyne opened 1 year ago

mofosyne commented 1 year ago

I created a CBOR semantic tag code generator in Python that produces C headers: https://github.com/mofosyne/iana-headers

This could be adapted to sync https://github.com/dotnet/runtime/blob/c8f97e8df6eaa4f21944ed80661c089403cf8f85/src/libraries/System.Formats.Cbor/src/System/Formats/Cbor/CborTag.cs#L4-L14 with the IANA registry for CBOR semantic tags.

ghost commented 1 year ago

Tagging subscribers to this area: @dotnet/area-system-formats-cbor, @bartonjs, @vcsjones See info in area-owners.md if you want to be subscribed.

Author: mofosyne
Assignees: -
Labels: `untriaged`, `area-System.Formats.Cbor`
Milestone: -

krwq commented 11 months ago

Would you be interested in doing this? It would be nice if the generated code stayed as close to the original as possible. Note that this would need to be generated manually anyway, because we have an API approval process (unless you are only thinking of internal code). Also note that the existing APIs cannot change, and this class seems fairly small, so I'm not sure it's worth it, especially since you'd need to add equivalent APIs anyway whenever new tags are added.

mofosyne commented 11 months ago

I'm mostly a C and C++ coder, so I'm not as familiar with C#.

I may help, but I have no immediate plans to do so. I also want to see whether the C headers are received well by other CBOR C implementations before working on a separate C# code generator (e.g. I found I should add deprecated enum support, so that both the new and old enums can be supported).

But at the very least, the idea of my repo is to help encourage a semi-standard for CBOR tag naming (especially since the IANA registry entries do not have a name field). My code generator has some simple heuristics that read the description field and clean it up so it can at least be used as a variable name.
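
As a rough illustration of the kind of heuristic described above, here is a minimal Python sketch. It is not the code from the iana-headers repo: the function name `description_to_identifier`, the `CBOR_TAG_` prefix, and the exact cleanup rules are assumptions made for this example, and the descriptions in the usage note are only loosely modelled on registry entries.

```python
import re


def description_to_identifier(description: str, prefix: str = "CBOR_TAG_") -> str:
    """Turn a free-text description into an upper-case C-style identifier.

    Hypothetical helper: the rules below are illustrative assumptions,
    not the heuristics actually used by any existing generator.
    """
    # Keep only the text before the first ';' or '(' (drops trailing notes).
    head = re.split(r"[;(]", description, maxsplit=1)[0]
    # Collapse every run of non-alphanumeric characters into one underscore.
    words = re.sub(r"[^0-9A-Za-z]+", "_", head).strip("_")
    return prefix + words.upper()
```

With these rules, `description_to_identifier("Standard date/time string")` yields `CBOR_TAG_STANDARD_DATE_TIME_STRING`. A real generator would also need to handle empty descriptions and name collisions, which this sketch ignores.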

So I would encourage you to tackle this instead. If there is a way to make it easier for you to use my generator, let me know. E.g. would you like me to make it output the list as CSV on the console, so you can process it separately later?
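
A CSV export along those lines could look roughly like the following sketch. The function name `tags_to_csv` and the column names `value`, `name`, and `description` are assumptions for illustration; the existing generator does not necessarily expose anything like this.

```python
import csv
import io


def tags_to_csv(entries):
    """Render (value, name, description) tuples as CSV text.

    Hypothetical helper; the column layout is an assumption.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["value", "name", "description"])
    # Sort by tag value so repeated runs produce identical output.
    for value, name, description in sorted(entries):
        writer.writerow([value, name, description])
    return buffer.getvalue()
```

Printing the returned string gives something that can be piped into another tool. Fields containing commas are quoted by the `csv` module.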

In terms of priority, I would also deem this low priority, but it is good to note, especially if you want to add new enums manually later.

krwq commented 11 months ago

I think for a simple thing like generating a single class, T4 templates might be much easier than a source generator. C# is much easier than C/C++ (I also originally come from the C/C++ world; things are easier now, and you can have ChatGPT write part of the code and just validate the logic, since the code should be readable enough). The major issue is that I anticipate this will generate public APIs, and we cannot generate public APIs during the build, as the build needs to be deterministic.

You can adapt an approach similar to the one I took with cipher suites in the SslStream class:

Generator code: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.Security/src/System/Net/Security/TlsCipherSuite.tt (since I needed some extra logic, there is also https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.Security/src/System/Net/Security/TlsCipherSuiteNameParser.ttinclude, but you likely won't need that)

Here is the code it generated: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.Security/src/System/Net/Security/TlsCipherSuite.cs

Note how it also uses the IANA registry to generate that code: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.Security/src/System/Net/Security/TlsCipherSuite.tt#L29C41-L29C41 (and track down the usage from there)

Note also how the project file was modified to take an extra flag, so that generation doesn't happen automatically on every build but only when we consciously want to update: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.Security/src/System.Net.Security.csproj#L140
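
The "generate once, review, commit the output" idea is not specific to T4. As a language-neutral illustration, here is a minimal Python sketch that renders enum source text deterministically from a list of entries. It is not the T4 template linked above: the function name `render_csharp_enum`, the header comment, and the `ulong` base type are assumptions for this example, and the member names in the usage note are illustrative.

```python
def render_csharp_enum(entries, enum_name="CborTag", namespace="System.Formats.Cbor"):
    """Render (value, name) pairs as C# enum source text.

    Hypothetical helper: the output layout and the ulong base type are
    assumptions, not the actual contents of CborTag.cs.
    """
    lines = [
        "// Generated file (hypothetical header): regenerate manually, then commit.",
        f"namespace {namespace}",
        "{",
        f"    public enum {enum_name} : ulong",
        "    {",
    ]
    # Sort by value so the output does not depend on input order.
    for value, name in sorted(entries):
        lines.append(f"        {name} = {value},")
    lines += ["    }", "}"]
    return "\n".join(lines) + "\n"
```

Calling `render_csharp_enum([(1, "UnixTimeSeconds"), (0, "DateTimeString")])` returns the same text regardless of the order of the input list, which is what makes the checked-in file stable across manual regenerations.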

mofosyne commented 11 months ago

I was aiming to use Python because I wanted the lowest-common-denominator, lowest-dependency approach, so that it can be integrated into various codebases.

T4 templating looks to be a native feature of .NET Framework, .NET Core and Mono.

Nevertheless, at the moment I don't personally have the mental bandwidth to figure out C#. But I do like how well integrated the code structure and the template are. I'm not sure how it would deal with backward compatibility with other existing headers, but I don't see many C# CBOR implementations, so I think we should be safe.

I do think we should only call the templating engine for runtime/src/libraries/System.Formats.Cbor/src/System/Formats/Cbor/CborTag.cs manually or it will throw off the build determinism.

I have the same consideration in mind (though I wonder whether it makes sense to print some form of warning if the IANA registry has been updated), hence I don't intend my generator to be triggered on every build, but rather run as needed.
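
One way such a warning could work, sketched in Python under stated assumptions: record a digest of the registry snapshot used for the last manual generation, and compare it against a fresh copy. The function names `registry_digest` and `check_registry` are hypothetical, and how the registry bytes are obtained is deliberately left out.

```python
import hashlib


def registry_digest(registry_bytes: bytes) -> str:
    """Return the SHA-256 hex digest of a registry snapshot."""
    return hashlib.sha256(registry_bytes).hexdigest()


def check_registry(registry_bytes: bytes, recorded_digest: str):
    """Return a warning string if the registry differs from the recorded
    snapshot, else None. Hypothetical helper; it never regenerates code.
    """
    current = registry_digest(registry_bytes)
    if current != recorded_digest:
        return (
            "warning: IANA registry changed "
            f"(recorded {recorded_digest[:12]}, current {current[:12]}); "
            "consider regenerating manually"
        )
    return None
```

Because the check only reports a difference and leaves the generated file untouched, the build output stays deterministic while maintainers still get a nudge that the committed enum may be out of date.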


P.S. Regarding how I should organise my IANA headers repo: I think my current approach of organizing the folders in my https://github.com/mofosyne/iana-headers project by protocol is incorrect. I should arrange it by language instead, which should make it easier to share code between our codebases. Instead of the current layout:

├── cbor
│   ├── c
│   ├── c_sharp
├── coap
│   ├── c
│   ├── c_sharp
├── http
│   ├── c
│   ├── c_sharp

I should arrange it as

├── c
│   ├── cbor
│   ├── coap
│   ├── http
├── c_sharp
│   ├── cbor
│   ├── coap
│   ├── http