actions / runner

The Runner for GitHub Actions :rocket:
https://github.com/features/actions
MIT License
4.76k stars 925 forks source link

Support YAML Anchors #1182

Open kyeotic opened 3 years ago

kyeotic commented 3 years ago

Note to incoming readers (2024)

This issue has been open for several years, and despite commitment from staff that it would be done there has been no progress. I will update this if anything changes, you don't need to read this entire thread.

UPDATE Aug 2024: The latest comment from staff says they will be looking at this next year

Thanks for engaging on this all, this is on our list to get to in the early part of next year. Right now we are re-working the services in Actions which do our YAML expansion (focusing on how we improve their availability). Once we are live with the new services we will start looking at adding features and this is on our list <3

If you want to help, upvote this comment by reacting to it with πŸ‘. This will help it sort higher in the issues list.

Adding "me too" or "+1" comments does not help. Issues are not sorted by the number of comments, they are sorted by the number of reactions to the top issue comment (this comment). The only thing you accomplish by adding "me too", or "+1" is to send an email to issue subscribers (the other people who have commented, not staff). This will not help the issue get done faster, it does not get seen by GitHub staff.

Original Issue


Describe the enhancement Support YAML anchors in in workflow files

Code Snippet

  - env: &docker_cred
      DOCKER_USER: ${{ secrets.DOCKER_USER }}
      DOCKER_PASS: ${{ secrets.DOCKER_PASS }}
# ...
  - env:
      <<: *docker_cred

OR

 anchors:
  docker_hub_credentials: &docker_hub_credentials
    credentials:
      username: my-dockerhub-username
      password: ${{ secrets.DOCKER_HUB_PASSWORD }}

  postgres_service: &postgres_service
    postgres:
      image: mdillon/postgis:9.6
      <<: *docker_hub_credentials

Additional information This has been asked for before, in other places

Note to implementors

This is generally handled by YAML parsers automatically. Perhaps it has been disabled, or a parser that does not support it is being used. This might be as simple as switching YAML parsers.

freitasmurillo commented 3 years ago

3+ years of people complaining and u don't take any action? Either, say u won't do or do it already. not a big deal to implement.

DanySK commented 3 years ago

This is truly absurd. I think that, given the existence of so many YAML parsers, it is harder to not support anchors and merge keys than doing it. Just let a normal YAML parser run.

kyeotic commented 3 years ago

@DanySK (and to a lesser degree @freitasmurillo), while I appreciate the support on issues, since upvotes help show priority, I don't think hurling abuse at the maintainers or anyone else is going to get this done any faster. If either of you are professional software developers I'm sure you can relate to working on project's whose feature requests exceed your team's bandwidth. There are 300 open issues on this repository and some of them describe actual bugs. Surely this is not the highest priority ticket.

Please keep in mind there is a human on the other side reading these messages. Consider how you might react to someone who said as much to you.

DanySK commented 3 years ago

Hi @kyeotic, I have no intention to be rude to people, much less to those whose work benefits the FOSS community so much. I expressed a technical concern that is puzzling me from a software engineering perspective, and that you also highlighted in the first post. I did not think that could hurt others -- usually, there is either some deep issue behind it, or it could just be a matter of applying small correctives.

That said, yes I know that it's impossible to keep up with the pace of feature requests in many cases, and especially so for FOSS. Given that, if someone can help me figure out where the YAML parsing happens, I'm willing to help.

midgleyc commented 2 years ago

The error message appears in YamlObjectReader.cs which references YamlDotNet.Core (and the csproj references version 5.3.0), so I expect that's the parser.

They're a few major versions behind, but it looks like there are bugs even on the latest version, e.g. this one from August. I'm not sure if that can be ignored, because they're not using the serialization from YamlDotNet, only the parser, and they've written their own custom code on top of the library.

So in short, I think it's a custom parser, written on top of another parser. The parser looks very local in how it parses things, so supporting anchors might require a bit of a rearchitect, which would explain why it's been open for so long.

freitasmurillo commented 2 years ago

@kyeotic sorry for being late to get your response. I definitely wasn't abusing anyone, sorry if that is what it sounded. Not having English as my primary language probably helped a lot. 😞

What I was trying to convey is that 3 years without replying seemed to me a lot, u know? This feature would be amazing and I know that they have a lot but I think once in a while they could provide updates and let people know what is on their plate or not for the moment.

If they don't have a minimal clue on if they will do or not post it on the thread. πŸ’₯

freitasmurillo commented 2 years ago

For the feature request POV I am more than happy to help anyhow.

aramfe commented 2 years ago

As much as we love GitHub Actions, the lack of this really makes us think about to switch back to GitLab. Re-using steps shouldn't be that complicated to work with (composite actions, etc. aren't really comparable to in-file templates).

Right now, GitHub doesn't seem to focus on their enterprise and power users, which need to work with complex workflows. It's understandable, as GitHub is very much more driven by the open-source community, but it nevertheless doesn't change the fact, that a lack of something like this, is very frustrating for enterprise users.

It simply makes it hideous to build complex workflows without a feature like that, even though GitHub has the best fundaments for building complex workflows in comparison to any other CI/CD platform.

When we evaluated both, GitHub and GitLab, we always expected in-file templating support to be released very soon for GitHub and that's why we never really considered the lack of this, as a decision defining point in the outcome of our evaluation. But looking back to it from today, it would have definitely changed the outcome of our evaluation, without any doubt.

GitLab CI / CD supports this since more than like three years already, and I haven't seen anyone, who didn't use it for their pipeline definition files. In recent time, they supported this without any anchors, which made it even easier to work with.

I would really love to see this being supported here as well.

umarcor commented 2 years ago

Right now, GitHub doesn't seem to focus on their enterprise and power users, which need to work with complex workflows. It's understandable, as GitHub is very much more driven by the open-source community, but it nevertheless doesn't change the fact, that a lack of something like this, is very frustrating for enterprise users.

@aramfe, your wording seems to imply that enterprise and power users have more complex requirements than the open-source community. I'd say that most enterprise and power users in GitHub are producing open-source software. Thus, that's a false dicotomy.

kyeotic commented 2 years ago

I'd say that most enterprise and power users in GitHub are producing open-source software

I don't know if you have any data to back this up, but I would be very, very surprised to learn that this was true. I expect most enterprises, even those using GitHub, to be developed closed-source, proprietary software.

umarcor commented 2 years ago

@kyeotic it's an opinion, based on the enterprises I know which use GitHub. Nevertheless, the point holds regardless of quantitative data: licensing of a software project is unrelated to the complexity. Moreover, the difference between enterprise users and others is that enterprise clients pay GitHub, not the licenses they use or the complexity of their projects.

mehran-sayyah commented 2 years ago

Ok

umarcor commented 2 years ago

FTR:

mithro/actions-includes allows to work around the lack of YAML anchor support, by preprocessing Workflow files.

Alternatively, in hdl/containers, a Reusable Workflow is combined with a Python script in order to dynamically generate the matrix by processing a YAML file with anchors.

A similar approach is used in the Params Reusable Workflow of pyTooling/Actions: Params.yml.

mehran-sayyah commented 2 years ago

1059

michael132 commented 2 years ago

would love to have anchor support in github workflow files as well :)

ethomson commented 2 years ago

I agree, we need to do this. In the meantime, we've made a number of improvements in simplifying complex workflows, like adding reusable workflows. There are some tricky bits about this implementation - as we have multiple components that deal with workflows (it's not just the runner in this repo) - but this is on the backlog. I don't have an ETA, but yes, we will do this.

Thanks everybody for the feedback.

caspiano commented 2 years ago

This is a desperately needed feature. Without it, basic things like abstracting out a single static value requires the use of the env key, which is not supported across all of the GitHub Actions format.

Baklap4 commented 2 years ago

This would be really nice to have...

cragster2 commented 2 years ago

Coming from other CI/CD platforms such as Bitbucket Cloud pipelines, this is a must-have feature. Especially, since it's been defined in the YAML specification for more than 10 years (anchors and aliases)

Hope this gets implemented soon!

Thanks

ssbarnea commented 2 years ago

To be honest, the only way to persuade the team to implement this is to tweet about it. That is one of the most basic parts of YAML specification which they decided not to bother to implement when they went DIY instead of using an existing open-source implementation for the loader.

In fact the claim of being YAML is really shaky. It should be named partial-yaml, yaml-like, subset-of-yaml or crippled-yaml but not yaml.

kyeotic commented 2 years ago

@ssbarnea Please do not do that, that's horrible. Everyone, please consider that the "me too" comments are not useful. Issues are not sorted by the number of comments, and they provide noise for those following this thread in hopes of a response from GitHub. Upvote the original comment and leave it at that.

Do not harass GitHub employees. Do not harass anyone. Be kind.

ssbarnea commented 2 years ago

Nobody was named here but the reality that this was part of the specification for more than 15 years, long before even the first version of GH actions was made available. From day one it was known that yaml supports anchors for reducing duplication, and that feature is not an unsafe one.

What it would really be nice is to have a sincere official reply that would state if there is planned for work to address this is like next year or not. So far I did not see anything in that area even it lots of people were quite surprised that they cannot use what is standard yaml syntax.

Again, I do understand that there are no guarantees but some transparency would likely be just what others are looking for.

toast-gear commented 2 years ago

shame on twitter or similar social medium

@ssbarnea dude, you've literally said people should go and hunt down GitHub staff and harass virtually. How you are not seeing what is wrong with that is beyond wild.

Nobody was named here but the reality that this was part of the specification for more than 15 years, long before even the first version of GH actions was made available. From day one it was known that yaml supports anchors for reducing duplication, and that feature is not an unsafe one.

What it would really be nice is to have a sincere official reply that would state if there is planned for work to address this is like next year or not. So far I did not see anything in that area even it lots of people were quite surprised that they cannot use what is standard yaml syntax.

Again, I do understand that there are no guarantees but some transparency would likely be just what others are looking for.

You're even trying to justifying it now. It's a CI/CD system, no one needs to be harassed because it doesn't have a feature you'd like.

ssbarnea commented 2 years ago

shame on twitter or similar social medium

@ssbarnea dude, you've literally said people should go and hunt down GitHub staff and harass virtually. How you are not seeing what is wrong with that is beyond wild.

Nobody was named here but the reality that this was part of the specification for more than 15 years, long before even the first version of GH actions was made available. From day one it was known that yaml supports anchors for reducing duplication, and that feature is not an unsafe one. What it would really be nice is to have a sincere official reply that would state if there is planned for work to address this is like next year or not. So far I did not see anything in that area even it lots of people were quite surprised that they cannot use what is standard yaml syntax. Again, I do understand that there are no guarantees but some transparency would likely be just what others are looking for.

You're even trying to justifying it now. It's a CI/CD system, no one needs to be harassed because it doesn't have a feature you'd like.

I redacted the initial comment. I should have read it twice before posting as I seen much later that it could be interpreted wrongly. If anyone was offended, please accept my apology. I never intended to suggest that tagging individuals on twitter might be a good idea to achieve something, most likely only to annoy them.

JoshuaKemmerer commented 2 years ago

It appears that this vital feature is not on the public roadmap yet: https://github.com/orgs/github/projects/4247/views/1?filterQuery=label%3Aactions+

aalbacetef commented 2 years ago

+1 for this

scolastico commented 2 years ago

heavy +1 for this. even gitlab supports that

LeeAlephium commented 2 years ago

Wow this issue is bonkers :zany_face:

sammcj commented 1 year ago

Any update on this? It's a pretty fundamental YAML concept that's available on other CI systems, for example this works great in GitLab: https://docs.gitlab.com/ee/ci/yaml/yaml_optimization.html

umarcor commented 1 year ago

@sammcj there was no significant update in almost three years. This feature was requested as soon as GitHub Actions were released: https://github.com/actions/starter-workflows/issues/245. Meanwhile, workarounds such as composite actions or reusable workflows have been implemented, which are double betting on doing their own stuff rather than reusing standard solutions.

flobernd commented 1 year ago

Well, composite actions and reusable workflows are nice features. But they solve a different kind of problem and are definitely no replacement for YAML anchors.

If I looked correctly, the YamlDotNet library is used internally. This one already comes with YAML anchors support out of the box. Probably some non-technical - or at least not runner-related - issue is blocking them πŸ˜•

umarcor commented 1 year ago

@flobernd see https://github.com/actions/runner/issues/1182#issuecomment-923115027:

I'm not sure if that can be ignored, because they're not using the serialization from YamlDotNet, only the parser, and they've written their own custom code on top of the library.

So in short, I think it's a custom parser, written on top of another parser. The parser looks very local in how it parses things, so supporting anchors might require a bit of a rearchitect, which would explain why it's been open for so long.

flobernd commented 1 year ago

@umarcor Meh, that explains it, I guess.

ssbarnea commented 1 year ago

That is because they decided against using a standard YAML loader and wen DIY, making a very-incomplete implementation of the YAML specification(s). One hilarious thing is that they went on using "on" as key, when almost anyone that ever used YAML knows that this loads as "true" and not a string, at least with specifications up to v1.1. v1.2 of YAML specification loads it as string but there few loaders for it.

If I remember correctly the first version implemented were using either XML or JSON, so going YAML-ish were likely under external pressure from early users and because every else found it... human-friendly.

It should not be very hard to address this and use a standard loader, but someone needs to do it and obviously it cannot be someone from outside, like us.

While outside the subject of this ticket, I would personally find very useful to replace the DIY templating with jinja, which is far well known and easier to extend, not to mention with a sintax very similar with current one. Again, I am not aware of a jinja templating library for dotnet that is really maintained. Doing a google search, brings few experiments but not something that one would find reusable, maybe become open-source authors usually avoid dotnet for some "inexplicable" reasons.

I would not be surprised to hear that the original author left the team long time ago or even the company, putting the library in some kind of dont-touch-it mode. It happens.

flobernd commented 1 year ago

It should not be very hard to address this and use a standard loader, but someone needs to do it and obviously it cannot be someone from outside, like us.

From what I read, the runner is not the only code that needs to be able to parse the workflows, so probably there are other components that require refactoring as well.

While outside the subject of this ticket, I would personally find very useful to replace the DIY templating with jinja, which is far well known and easier to extend, not to mention with a sintax very similar with current one.

While this looks very nice, I don't think it would rather be an addition instead of a replacement. Reusable workflows / composite actions are not just templates but require a lot of extra logic (like pulling the foreign repositories, etc). Other than that it seems like DevOps CI Templates supports a more advanced template syntax.

maybe become open-source authors usually avoid dotnet

YAML is not widely used in .NET languages at all. Sadly, XML/JSON is the de facto standard here for historical reasons. Even the YamlDotNET library - which seems to be the only one out there - has some serious issues and does not seem to be maintained very actively anymore.

umarcor commented 1 year ago

@flobernd you are correct with regard to other several components needing refactoring as well. There is a lot of stuff that is resolved "too early", as if the content of the workflows was static.

Find these and other limitations in the following list: https://github.com/hdl/containers/issues/48.

Overall, the logic would need to be rewritten from scratch in order to resolve stuff "just when needed, but not before". I seriously doubt it will be done any time soon, if ever.

Fitzsimmons commented 1 year ago

The sad part about this is that you can just pipe your file into yq 'explode(.)' as a preprocessing step. It's awkward to have to do this yourself, pre-commit, but it is a pretty straightforward hack to be able to access all of the features of yaml. GHA should just do this themselves if they're not going to support the full yaml spec in their custom parser.

DanySK commented 1 year ago

@Fitzsimmons this is actually a good idea...

CvH commented 1 year ago

@Fitzsimmons do you have some example at hand? looks like native anchors are not coming soon to GHA so this workaround should come handy πŸ‘

fabceolin commented 1 year ago

@CvH you need to create a pre-commit script to explode each yml file before commit.

cat example.yml.before_exploded | yq 'explode(.)' >example.yml
krancour commented 1 year ago

I want YAML anchors in my actions bad, but...

One hilarious thing is that they went on using "on" as key, when almost anyone that ever used YAML knows that this loads as "true" and not a string

@ssbarnea and that would be a barrier to getting a better parser... it's pretty hard to know how many project might exist that depend on that improper handling of "on."

This is why we can't have nice things.

DanySK commented 1 year ago

@krancour it should not be that big of an issue actually; it is sufficient to redirect on, true and any other representation of truth and 'on' on the same code path.

ssbarnea commented 1 year ago

TBH, on was more of a mistake in old YAML specification which was sorted by YAML 1.2 but the funny bit is that current implementation does not respect any of existing specifications.

mbmccoy commented 1 year ago

Can I request that this situation be documented (and, ideally, better error messages surfaced)? In short, all of the documents suggest that "yaml syntax is supported", but in fact, only a subset of valid yaml is supported.

I've run in to this a few times now, and while I keep getting faster at figuring out why my workflows aren't running, it would be helpful to at least have it documented.

bwoodsend commented 1 year ago

From day one it was known that yaml supports anchors for reducing duplication, and that feature is not an unsafe one.

Whilst it's not on the same league as the shell execution *ahem* "feature?", it's not harmless. You can get a pretty easy denial of service attack out of it similar to XML's billion laughs attack by nesting aliases and letting some GitHub server explode it. Something like a scaled up version of:


- &a0
    - &a1
        - &a2
            - &a3
                - hah hah hah
                - hah hah hah
                - hah hah hah
            - *a3
            - *a3
        - *a2
        - *a2
    - *a1
    - *a1
- *a0
- *a0
DanySK commented 1 year ago

Solvable by limiting the maximum number of items reasonably, as most YAML parsers already do.

tshakah commented 1 year ago

Composite actions and reusable workflows are just bad workarounds for the lack of this feature

merusso commented 1 year ago

Composite actions and reusable workflows are just bad workarounds for the lack of this feature

I disagree. Actions and workflows are separate files that can be referenced by multiple workflows in the same or a separate repository. YAML anchors are only useful within a single file. So actions and workflows are better designed for broader reuse.

Also, even if YAML anchor support is added, workflow syntax isn't designed to make great use of them. YAML anchors need to be defined above the YAML aliases that reference them. So ideally you want a "definition" section at the top of a file that defines the anchors, and you'd use aliases below to reference these anchors in the job definitions.

Bitbucket and GitLab pipelines are designed with anchors in mind and have definitions sections where anchors are encouraged. See:

DanySK commented 1 year ago

@merusso I agree that anchors won't make actions and reusable workflow useless, but they would still help apply the DRY principle within complex workflows. I recently had to modify a boolean condition that applied to several steps and jobs. I had to do 6/7 changes with cut/paste. Terrible and error-prone.

GitHub Actions feels in many ways to me like something built bottom-up without an appropriate analysis of the domain. Besides the lack of standard YAML features, it is also noticeable in the expression language.

merusso commented 1 year ago

To downvoters of my comment above, please share why you disagree with my comment. I believe these are facts:

  1. YAML anchors/aliases work within a single file. They don't enable reuse between files or repos.
  2. Actions are defined in separate files. They do enable reuse between files or repos.

Please note: I'm not saying YAML anchors are useless. I'm simply noting their limitations.