encoding/*: allow exporting comments

romange commented 3 years ago

Is your feature request related to a problem? Please describe. Sometimes 3rd party systems require annotating configurations via comments. For example, cloud-init requires that its yaml files have a she-bang line like this:

#cloud-config
<some yaml stuff>
......

See here https://cloudinit.readthedocs.io/en/latest/topics/examples.html for more examples.

Describe the solution you'd like File test.cue

//cloud-config
foo: "bar"

hello: {
        // occupy
        world: "from mars?"
}

when exported via command cue export --out yaml --preserve-comments test.cue should produce something like this:

# cloud-config
foo: bar
hello:
  # occupy
  world: from mars?

Describe alternatives you've considered There is an easy workaround when wrapping cue-cmd via bash script. It's suboptimal but it's not a blocker.

Additional context Bonus point: Some comments may be relevant to cue language itself thus we do not really want to export them. It could be nice to have a convention to export whitelisted comments. For example, only comments starting from /// will be exported.

myitcv commented 3 years ago

This request seems reasonable to me for one main reason: we import comments, hence we should export them for round-trip-ness' sake if nothing else.

FWIW it's not possible to control the export comments at all (unless I am missing something): neither through cmd/cue, cue cmd, nor encoding/*.

verdverm commented 3 years ago

@myitcv I agree that this makes a lot of sense for the round-trip-ness. I also did not see any way to export comments. My example above was a modified cue ( add cue.Docs(true) to this line https://github.com/cue-lang/cue/blob/master/pkg/encoding/yaml/manual.go#L38).

Some thoughts after poking around

The example above adds a space within the comment string, I think we might consider removing the space here: https://github.com/cue-lang/cue/blob/master/internal/encoding/yaml/encode.go#L300 though i
Tt might be worth considering how we might preserve the spacing / padding in the original sources better. Does this need to happen for import as well?
My initial thought was to use the Options variadic func args, much like Syntax from the API since that is used where the change is needed. That being said, I'm not sure it makes sense to open up the Options that much for the encoders, as incorrect Options may cause errors in the encoder itself.
Does it make sense to allow different options depending on the output? I was mainly thinking that the CUE output could support the full set of Options, not sure if there is any overlap with CUE output and the feature request to support single file output for CUE we have seen. Protobuf looks to have some extra flags for exporting already.

mpvl commented 2 years ago

Makes a lot of sense. One alternative API is to export comments when used with def. I'm surprised that cue def --out yaml doesn't already export comments, actually.

felixge commented 2 years ago

I just ran into this problem and submitted a PR for it based on the approach suggested by @verdverm. I'm happy to adjust it if a different approach is preferred.

https://github.com/cue-lang/cue/pull/1549

rogpeppe commented 2 years ago

@felixge I was about to land your PR but first needed to fix a test case. In fixing that, I realised that perhaps this issue isn't as straightforward as it might seem.

Here's the test case in question: https://review.gerrithub.io/c/cue-lang/cue/+/536265/2/cmd/cue/cmd/testdata/script/cmd_github.txt

Note all the comments that have appeared in the YAML output for the test (a Github Actions file). Those comments aren't really relevant to the output file - they're really there to describe the schema of the actions file, not the final rendered result. In the final rendered result, I'd imagine that more repository-specific comments would be appropriate.

Choosing a heuristic that decides which comments to be rendered in the final concrete output is tricky and will need some design input. Perhaps we might need to use an annotation hint; it could even be that it's not, after all, appropriate to include comments in rendered output.

So I've left the NeedsDesign label on this issue and removed the GoodFirstIssue label because it's clearly not entirely trivial!

Thanks for your input. Any further thoughts on this would be appreciated.

felixge commented 2 years ago

@rogpeppe I agree that this feature would benefit from a sophisticated design. But in the short term, I don't see the harm of letting the user chose to either include comments or not when they invoke the cue command (via a flag). It seems like most use cases would be well served by this?

nyarly commented 1 year ago

So long as we're in a NeedsDesign stage - got here because I was looking to produce comments based whether CUE processing of input data had fallen back on default values.

I realize that's a whole extra can of worms, but perhaps something like this could be made to work:

struct: {
  value: 17 @comment("from default")
}

denniskern commented 10 months ago

Hi Guys - Are there any ideas of how to proceed with this issue? It would be so nice to be able to export comments from a cue file.

amackenzie commented 8 months ago

This might be slightly out of scope, but...

To add a small comment, the OP brought up shebangs, and there's at least a few places where shebangs are used regardless of their validity in the normal export format. The one that comes to mind is Saltstack -- if you're using anything other than the Jinja|YAML renderer, you need a shebang at the top to designate it, eg, #!json, even though that isn't valid JSON. (And hypothetically you'd put a #!cue line at the top if there's a working cue renderer for Saltstack.)

So maybe in addition to comments there might need to be some consideration or syntax for fixed annotations of whatever format in outputs? It may affect importing if you want to be able to round-trip data, too, though at least in the specific case of shebangs they're generally the first line so it's skippable...

DavidGamba commented 3 months ago

@jpluscplusm described my need for round-tripping YAML comments very concisely here:

please let me round-trip a given YAML file to CUE and back again, so that I can actually have 2 sources of truth, and give my 2 camps of users (team YAML vs team CUE) a way to work in their encoding of choice, without [bothering] the other set of users. /Whatever/ that implies, please just let my users get on with work, whilst I incrementally convince team YAML folks to defect to team CUE, and prevent pointless arguments about this stuff at the user-to-user level.

Source: https://cuelang.slack.com/archives/CLT4FD7KP/p1694007965956239

jpluscplusm commented 3 months ago

@jpluscplusm described my need for round-tripping YAML comments very concisely here:

please let me ...

For completeness, I will just mention that this paragraph was written and posted in the context of me role-playing as David, projecting my impression of their underlying, real-world requirements. I wasn't expressing what I needed, personally! :-)

mvdan commented 2 months ago

I began importing https://github.com/cue-lang/cue/pull/1549 into Gerrit in separate commits - first testdata files, then the Docs API fix that Felix spotted, and then after would come the enabling of comments when encoding YAML. See https://review.gerrithub.io/c/cue-lang/cue/+/1199690.

The Docs change uncovered a bit of a hidden problem: we never intended for cue export to output comments by default, as can be seen in filetypes/types.cue. It just does for cue export --out=cue now due to the bug that Felix is fixing in the PR.

It's also reasonable for a user to want to either export with or without comments, so we should support both modes - this would require a new flag, at least to enable or disable the inclusion of doc comments. This would also require threading this option through the internal/encoding Go API as well as each of the encoding Go APIs, such as encoding/yaml.

Moreover, @rogpeppe and @mpvl argue that a boolean flag and option might not be enough. If CUE values unify concrete data with schemas, for example, they might both have doc comments. In some cases the user might want to output the comments from just the data, or from just the schema, or both combined. So perhaps a flag where one can do --doc=all, --doc=schema, --doc=data, --doc=schema,data, and so on would be better.

I will continue with this work step by step, but because we want to add the flag before "fixing" cue export to not output comments as intended, and multiple encodings are involved, this may take me some time.

I will use this issue to track the ability to enable exporting comments via cmd/cue and each of the encoding Go APIs. Some encodings simply don't support encoding or decoding comments yet, so that should be tracked separately, e.g. https://github.com/cue-lang/cue/issues/3342. This issue originally used YAML examples, so I'll aim to support comments in YAML as the first end-to-end example of this working.

KeranYang commented 1 month ago

Hey @mvdan , my team can really use this feature in our project. Do we expect this to be available any time soon? Thanks!

mvdan commented 1 month ago

@KeranYang after getting TOML to a working state and other work, this is back near the top of my TODO list. I can't promise any particular timeline but I hope to make progress here in the coming weeks.

KeranYang commented 1 month ago

Thank you very much @mvdan . Looking forward to it.

cue-lang / cue

encoding/*: allow exporting comments #1180