puppetlabs / pdk-templates

The main template repo for the Puppet Development Kit https://github.com/puppetlabs/pdk
Apache License 2.0
46 stars 143 forks source link

YAML anchor in .sync.yml results in bad result #459

Open cdenneen opened 2 years ago

cdenneen commented 2 years ago

pdk#main


Results in:

```yaml
---
stages:
  - syntax
  - unit
  - acceptance
variables:
  CI_REGISTRY: 'private.jfrog.io'
  PDK_FEATURE_FLAGS: 'controlrepo'

default:
  image: private.jfrog.io/docker/ap/pdk-docker:latest

  cache:
    paths:
      - /root/.pdk/cache

".validate_template":
  stage: syntax
  script:
  - pdk validate
  only: &1
    refs:
    - branches
  except: &2
    variables:
    - "$CI_COMMIT_MESSAGE =~ /\\[skip[ _-]tests?\\]/i"
".spec_template":
  stage: unit
  script:
  - pdk test unit --parallel
  only: &3
    refs:
    - branches
  except: &4
    variables:
    - "$CI_COMMIT_MESSAGE =~ /\\[skip[ _-]tests?\\]/i"
pdk validate -Puppet ~> 7:
  stage: syntax
  script:
  - pdk validate
  only: *1
  except: *2
  variables:
    PDK_PUPPET_VERSION: '7'
pdk test unit-Puppet ~> 6:
  stage: unit
  script:
  - pdk test unit --parallel
  only: *3
  except: *4
  variables:
    PDK_PUPPET_VERSION: '6'
pdk test unit-Puppet ~> 7:
  stage: unit
  script:
  - pdk test unit --parallel
  only: *3
  except: *4
  variables:
    PDK_PUPPET_VERSION: '7'

when it should result in:

---
stages:
  - syntax
  - unit
  - acceptance
variables:
  CI_REGISTRY: 'private.jfrog.io'
  PDK_FEATURE_FLAGS: 'controlrepo'

default:
  image: private.jfrog.io/docker/ap/pdk-docker:latest

  cache:
    paths:
      - /root/.pdk/cache

.validate_template: &validate_config
  stage: syntax
  script:
  - pdk validate
  only:
    refs:
    - branches
  except:
    variables:
    - "$CI_COMMIT_MESSAGE =~ /\\[skip[ _-]tests?\\]/i"
.spec_template: &spec_config
  stage: unit
  script:
  - pdk test unit --parallel
  only:
    refs:
    - branches
  except:
    variables:
    - "$CI_COMMIT_MESSAGE =~ /\\[skip[ _-]tests?\\]/i"
pdk validate -Puppet ~> 7:
  <<: *validate_config
  variables:
    PDK_PUPPET_VERSION: '7'
pdk test unit-Puppet ~> 6:
  <<: *spec_config
  variables:
    PDK_PUPPET_VERSION: '6'
pdk test unit-Puppet ~> 7:
  <<: *spec_config
  variables:
    PDK_PUPPET_VERSION: '7'

Here is the diff of the result vs the desired result:

-".validate_template":
+.validate_template: &validate_config
   stage: syntax
   script:
   - pdk validate
-  only: &1
+  only:
     refs:
     - branches
-  except: &2
+  except:
     variables:
     - "$CI_COMMIT_MESSAGE =~ /\\[skip[ _-]tests?\\]/i"
-".spec_template":
+.spec_template: &spec_config
   stage: unit
   script:
   - pdk test unit --parallel
-  only: &3
+  only:
     refs:
     - branches
-  except: &4
+  except:
     variables:
     - "$CI_COMMIT_MESSAGE =~ /\\[skip[ _-]tests?\\]/i"
 pdk validate -Puppet ~> 7:
-  stage: syntax
-  script:
-  - pdk validate
-  only: *1
-  except: *2
+  <<: *validate_config
   variables:
     PDK_PUPPET_VERSION: '7'
 pdk test unit-Puppet ~> 6:
-  stage: unit
-  script:
-  - pdk test unit --parallel
-  only: *3
-  except: *4
+  <<: *spec_config
   variables:
     PDK_PUPPET_VERSION: '6'
 pdk test unit-Puppet ~> 7:
-  stage: unit
-  script:
-  - pdk test unit --parallel
-  only: *3
-  except: *4
+  <<: *spec_config
   variables:
     PDK_PUPPET_VERSION: '7'

The .gitlab-ci.yml.erb uses YAML.dump for this section which could be causing the issue?

<% if configs['custom_jobs'] -%>
<%=  YAML.dump(configs['custom_jobs']).sub(%r{\A---\r?\n}m, '') %>
<% end -%>

I've found something similar online about ignoring aliases: https://github.com/yaml/pyyaml/issues/103 but I'm not sure if that's possible here in an ERB template?

cdenneen commented 2 years ago

I believe this has something to do with yaml Dumper but I'm not 100% sure how to address it. (https://stackoverflow.com/questions/14478861/ruby-yaml-dump-how-to-prevent-alias-dereferencing?noredirect=1&lq=1)

sanfrancrisko commented 2 years ago

I think the anchors and aliases could have been lost by the initial YAML.safe_load of the .sync.yml. We set the aliases: true flag here when we call YAML.safe_load, but I'm not seeing them preserved. Looking around, I'm not sure it's expected that they are to be preserved, and it'll be on us to provide a custom parser to handle that.

I'll need to sync with the team on this one.

sanfrancrisko commented 2 years ago

So, @cdenneen , I think the only way we can do this is by making a code change to quite a critical part of the product (i.e. the loading/parsing of the .sync.yml), which would be quite a risky change at the best of times. Given our current resourcing issues, we're not sure we want to take that on, when there is a solution (albeit, not ideal) in place in the form of more copying/pasting of code.

Unless you're aware of a simpler way to implement the preservation of aliases, it seems like we'd need to go down the road of writing our own handler that can deal with this? Bearing in mind, anything that changes the behaviour of .sync.yml loading/parsing would be a risky change and require significant testing.

cdenneen commented 2 years ago

Could it be an “opt-in” to use different parser to fix the YAML issues. Then with enough time replace the current? So it wouldn’t affect existing unless you “opt-in” and wouldn’t replace until after some level of extensive testing has been run (or never becomes replacement just an “opt-in” via some parameter in the sync config or pdk config?

Is there any suggestions for the actual erb template? Maybe modify the YAML.load?

sanfrancrisko commented 2 years ago

If we were going down this road, it would definitely need to be opt-in

The issue is that we've lost the aliases data as YAML.safe_load is loaded in to a Ruby Hash object. So when we access configs['custom_jobs'], it's not there:

{".gitlab-ci.yml"=>
  {"override"=>true,
   "custom"=>
    {"image"=>"private.jfrog.io/docker/ap/pdk-docker:latest",
     "default_before_script"=>false,
     "cache"=>{"paths"=>["/root/.pdk/cache"]},
     "global_variables"=>
      {"CI_REGISTRY"=>"private.jfrog.io", "PDK_FEATURE_FLAGS"=>"controlrepo"},
     "custom_stages"=>["syntax", "unit", "acceptance"],
     "custom_jobs"=>
      {".validate_template"=>
        {"stage"=>"syntax",
         "script"=>["pdk validate"],
         "only"=>{"refs"=>["branches"]},
         "except"=>
          {"variables"=>["$CI_COMMIT_MESSAGE =~ /\\[skip[ _-]tests?\\]/i"]}},
       ".spec_template"=>
        {"stage"=>"unit",
         "script"=>["pdk test unit --parallel"],
         "only"=>{"refs"=>["branches"]},
         "except"=>
          {"variables"=>["$CI_COMMIT_MESSAGE =~ /\\[skip[ _-]tests?\\]/i"]}},
       "pdk validate -Puppet ~> 7"=>
        {"stage"=>"syntax",
         "script"=>["pdk validate"],
         "only"=>{"refs"=>["branches"]},
         "except"=>
          {"variables"=>["$CI_COMMIT_MESSAGE =~ /\\[skip[ _-]tests?\\]/i"]},
         "variables"=>{"PDK_PUPPET_VERSION"=>"7"}},
       "pdk test unit-Puppet ~> 6"=>
        {"stage"=>"unit",
         "script"=>["pdk test unit --parallel"],
         "only"=>{"refs"=>["branches"]},
         "except"=>
          {"variables"=>["$CI_COMMIT_MESSAGE =~ /\\[skip[ _-]tests?\\]/i"]},
         "variables"=>{"PDK_PUPPET_VERSION"=>"6"}},
       "pdk test unit-Puppet ~> 7"=>
        {"stage"=>"unit",
         "script"=>["pdk test unit --parallel"],
         "only"=>{"refs"=>["branches"]},
         "except"=>
          {"variables"=>["$CI_COMMIT_MESSAGE =~ /\\[skip[ _-]tests?\\]/i"]},
         "variables"=>{"PDK_PUPPET_VERSION"=>"7"}}}}}}
ardrigh commented 1 year ago

.gitlab-ci.yml:
  override: true
  custom: |
    <rest of YAML block>

Apologies for dragging up an old ticket, I was curious if using a YAML literal block style for the custom config would work here?