galaxyproject / galaxy-helm

Minimal setup required to run Galaxy under Kubernetes
MIT License
38 stars 36 forks source link

fix non-yaml config template set via values #432

Closed yegortokmakov closed 1 year ago

yegortokmakov commented 1 year ago

Because toYaml function was applied to all of the configs set in values, non-yaml files passed as a string produced manifests with invalid yaml. In this patch I change the behavior to only convert to yaml data intended for files ending in ".yml".

Reproduce:

Add non-Yaml, e.g. XML, file to values.configs. For example:

diff --git a/galaxy/values.yaml b/galaxy/values.yaml
index 6b600b6..12d6875 100644
--- a/galaxy/values.yaml
+++ b/galaxy/values.yaml
@@ -492,6 +492,7 @@ configs:
     {{- (.Files.Get "files/configs/integrated_tool_panel.xml") }}
   tool_conf.xml: |
     {{- (.Files.Get "files/configs/tool_conf.xml") }}
+  "auth_conf.xml": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<auth>\n   <authenticator>\n      <type>localdb</type>\n      <options>\n         <allow-password-change>true</allow-password-change>\n      </options>\n   </authenticator>\n</auth>"

 # Additional dynamic rules to map into the container.
 jobs:

Without the patch the template will produce invalid YAML (note the extra "-"):

# Source: galaxy/templates/configs-galaxy.yaml
apiVersion: v1
metadata:
  name: release-name-galaxy-configs
  labels:
    helm.sh/chart: galaxy-5.7.1
    app.kubernetes.io/name: galaxy
    app.kubernetes.io/instance: release-name
    app.kubernetes.io/version: "23.0"
    app.kubernetes.io/managed-by: Helm
kind: ConfigMap
data:
  auth_conf.xml: |
    -
      <?xml version="1.0" encoding="UTF-8"?>
      <auth>
         <authenticator>
            <type>localdb</type>
            <options>
               <allow-password-change>true</allow-password-change>
            </options>
         </authenticator>
      </auth>

Testing:

To test the patch, I have generated all templates before and after the change:

  1. helm template --debug -s templates/configs-galaxy.yaml . > before_change.txt
  2. Apply the patch
  3. helm template --debug -s templates/configs-galaxy.yaml . > after_change.txt
  4. ~/projects/galaxy-helm/galaxy $ diff before_change.txt after_change.txt. Only unnecessary indentation is gone, otherwise configs have the same content and are valid:
37,40c37,41
<       <containers_resolvers>
<         <explicit />
<         <mulled />
<       </containers_resolvers>
---
>     <containers_resolvers>
>       <explicit />
>       <mulled />
>     </containers_resolvers>
>     
211a213
>     
253,271c255,274
<       toolshed.g2.bx.psu.edu/repos/bgruening/diff/diff/3.7+galaxy0
<       toolshed.g2.bx.psu.edu/repos/bgruening/pharmcat/pharmcat/1.3.1+galaxy0
<       toolshed.g2.bx.psu.edu/repos/crs4/taxonomy_krona_chart/taxonomy_krona_chart/2.7.1+galaxy0
<       toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.72+galaxy1
<       toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.73+galaxy0
<       toolshed.g2.bx.psu.edu/repos/iuc/dexseq/dexseq/1.28.1+galaxy2
<       toolshed.g2.bx.psu.edu/repos/iuc/fastp/fastp/0.20.1+galaxy0
<       toolshed.g2.bx.psu.edu/repos/iuc/jbrowse/jbrowse_to_standalone/1.16.11+galaxy0
<       toolshed.g2.bx.psu.edu/repos/iuc/jbrowse/jbrowse_to_standalone/1.16.11+galaxy1
<       toolshed.g2.bx.psu.edu/repos/iuc/jbrowse/jbrowse/1.16.11+galaxy0
<       toolshed.g2.bx.psu.edu/repos/iuc/jbrowse/jbrowse/1.16.11+galaxy1
<       toolshed.g2.bx.psu.edu/repos/iuc/macs2/macs2_callpeak/2.1.1.20160309.6
<       toolshed.g2.bx.psu.edu/repos/iuc/meme_meme/meme_meme/5.0.5.0
<       toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.11+galaxy0
<       toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.9+galaxy1
<       toolshed.g2.bx.psu.edu/repos/iuc/prestor_abseq3/prestor_abseq3/0.6.2+galaxy0
<       toolshed.g2.bx.psu.edu/repos/iuc/quast/quast/5.0.2+galaxy3
<       toolshed.g2.bx.psu.edu/repos/iuc/seurat/seurat/4.1.0+galaxy0
<       toolshed.g2.bx.psu.edu/repos/iuc/snpeff/snpEff/4.3+T.galaxy1
---
>     toolshed.g2.bx.psu.edu/repos/bgruening/diff/diff/3.7+galaxy0
>     toolshed.g2.bx.psu.edu/repos/bgruening/pharmcat/pharmcat/1.3.1+galaxy0
>     toolshed.g2.bx.psu.edu/repos/crs4/taxonomy_krona_chart/taxonomy_krona_chart/2.7.1+galaxy0
>     toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.72+galaxy1
>     toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.73+galaxy0
>     toolshed.g2.bx.psu.edu/repos/iuc/dexseq/dexseq/1.28.1+galaxy2
>     toolshed.g2.bx.psu.edu/repos/iuc/fastp/fastp/0.20.1+galaxy0
>     toolshed.g2.bx.psu.edu/repos/iuc/jbrowse/jbrowse_to_standalone/1.16.11+galaxy0
>     toolshed.g2.bx.psu.edu/repos/iuc/jbrowse/jbrowse_to_standalone/1.16.11+galaxy1
>     toolshed.g2.bx.psu.edu/repos/iuc/jbrowse/jbrowse/1.16.11+galaxy0
>     toolshed.g2.bx.psu.edu/repos/iuc/jbrowse/jbrowse/1.16.11+galaxy1
>     toolshed.g2.bx.psu.edu/repos/iuc/macs2/macs2_callpeak/2.1.1.20160309.6
>     toolshed.g2.bx.psu.edu/repos/iuc/meme_meme/meme_meme/5.0.5.0
>     toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.11+galaxy0
>     toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.9+galaxy1
>     toolshed.g2.bx.psu.edu/repos/iuc/prestor_abseq3/prestor_abseq3/0.6.2+galaxy0
>     toolshed.g2.bx.psu.edu/repos/iuc/quast/quast/5.0.2+galaxy3
>     toolshed.g2.bx.psu.edu/repos/iuc/seurat/seurat/4.1.0+galaxy0
>     toolshed.g2.bx.psu.edu/repos/iuc/snpeff/snpEff/4.3+T.galaxy1
>     
418a422
>     
420,426c424,430
<       <?xml version="1.0"?>
<       <workflow_schedulers default="core">
<         <core id="core" />
<         <handlers default="schedulers">
<             <handler id="workflow_scheduler0" tags="schedulers"/>
<         </handlers>
<       </workflow_schedulers>
---
>     <?xml version="1.0"?>
>     <workflow_schedulers default="core">
>       <core id="core" />
>       <handlers default="schedulers">
>           <handler id="workflow_scheduler0" tags="schedulers"/>
>       </handlers>
>     </workflow_schedulers>

With the XML as a string in values.configs:

helm template --debug -s templates/configs-galaxy.yaml . > xmlstring_before_change.txt
helm template --debug -s templates/configs-galaxy.yaml . > xmlstring_after_change.txt
diff xmlstring_before_change.txt xmlstring_after_change.txt 

15,24c15,23
<     -
<       <?xml version="1.0" encoding="UTF-8"?>
<       <auth>
<          <authenticator>
<             <type>localdb</type>
<             <options>
<                <allow-password-change>true</allow-password-change>
<             </options>
<          </authenticator>
<       </auth>
---
>     <?xml version="1.0" encoding="UTF-8"?>
>     <auth>
>        <authenticator>
>           <type>localdb</type>
>           <options>
>              <allow-password-change>true</allow-password-change>
>           </options>
>        </authenticator>
>     </auth>

Unnecessary indentation and "-" is gone.

yegortokmakov commented 1 year ago

right, there are already XML files, but those are set directly in values or loaded with Helm function. the problem comes when you try to set XML file contents and process it with toYaml - in this case Helm treats the string as an array element for some reason

ksuderman commented 1 year ago

I am a little unclear what you mean by,

the problem comes when you try to set XML file contents and process it with toYaml

Isn't this exactly what is happening on the line Nuwan linked to above? Those are XML blocks included in the values.yaml file and run through toYaml by the configs-galaxy.yaml template.

However, after playing around some, I wonder if there is a bug (or undocumented feature) in Helm's toYaml processor? I can get rid of the array (unwanted dash character) by using a block scalar for the XML content, which is what the other XML entries do:

  auth_conf.xml: |
    "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<auth>\n   <authenticator>\n      <type>localdb</type>\n      <options>\n         <allow-password-change>true</allow-password-change>\n      </options>\n   </authenticator>\n</auth>"

After a bit more Googling, I think what is happening is that in the original example, Yegor uses a mutli-line string (the string contains newline characters and is therefore multi-line) and this is tripping up toYaml and it figures the content must be an array. By specifically using a block scalar that problem is avoided.

yegortokmakov commented 1 year ago

After a bit more Googling, I think what is happening is that in the original example, Yegor uses a mutli-line string (the string contains newline characters and is therefore multi-line) and this is tripping up toYaml and it figures the content must be an array. By specifically using a block scalar that problem is avoided.

Yes, I think that is what happening. In our case we are setting chart values dynamically and rely on the templates/configs-galaxy.yaml and toYaml to convert the string correctly. For sure removing all newlines from the input will fix the issue, but Imho proposed solution is a bit cleaner (toYaml shouldn't be applied on not Yaml).

ksuderman commented 1 year ago

toYaml shouldn't be applied on not Yaml.

There is definitely something weird going on, but I don't think it is directly related to toYaml. My understanding is that since Helm uses Go templates, everything is a Go object by this point (after parsing), hence the need for toYaml to generate output. Also this fix doesn't take into account Yaml files with a .yaml suffix (easy fix) or Yaml files with no suffix (bad practice, but never underestimate what users may do).

When you say you set the value dynamically, how are you setting the value? Do you use --set, --set-file, modify the values.yaml file, use a second values file included with --values, or something else?

I am still puzzled, and trying to determine, why this bit of XML is handled differently than the other bits of XML already included. Also why I see different behaviour if the XML is included directly in the values.yaml file or included via a second values file.

yegortokmakov commented 1 year ago

toYaml shouldn't be applied on not Yaml.

There is definitely something weird going on, but I don't think it is directly related to toYaml. My understanding is that since Helm uses Go templates, everything is a Go object by this point (after parsing), hence the need for toYaml to generate output.

Absolutely, yes, if it's an object. If it's a multiline string toYaml will produce an array-like yaml. It might be a bug in toYaml, but still we don't need to apply to Yaml on non-yaml.

Also this fix doesn't take into account Yaml files with a .yaml suffix (easy fix) or Yaml files with no suffix (bad practice, but never underestimate what users may do).

Added check for both .yml and .yaml. Imho edge case with filenames without an extension is too uncommon to account for.

When you say you set the value dynamically, how are you setting the value? Do you use --set, --set-file, modify the values.yaml file, use a second values file included with --values, or something else?

We use AWS CDK to deploy the helm. Values are set from Typescript and some config files are read directly from files like this:

    const galaxyChart = new eks.HelmChart(this, 'galaxyChart', {
      chart: 'galaxy',
      release: 'galaxy',
      repository: 'https://raw.githubusercontent.com/CloudVE/helm-charts/master/',
      values: {
        configs: {
          "someconfig.yml": someconfigContent,
        }
      }
   }

someconfigContent in some cases will be a multiline string. During deployment, CDK generates a values.yaml file that will contain passed config for "someconfig.yaml":

"auth_conf.xml": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<auth>\n   <authenticator>\n      <type>localdb</type>\n      <options>\n         <allow-password-change>true</allow-password-change>\n      </options>\n   </authenticator>\n</auth>"

This is an absolutely valid YAML and is parsed correctly as a string.

I am still puzzled, and trying to determine, why this bit of XML is handled differently than the other bits of XML already included. Also why I see different behaviour if the XML is included directly in the values.yaml file or included via a second values file.

Currently, as you rightly mentioned earlier, you are using a block scalar:

  tool_conf.xml: |
    {{- (.Files.Get "files/configs/tool_conf.xml") }}

It forces yaml parser to ignore \n line breaks in the text and is not confusing toYaml to think it is an array. This is a perfect workaround, but only works for values you set in values.yaml directly or rendered with a template.

In our case, we can remove new lines in file content, but the proposed change seems like a much cleaner solution. In this case it seems like removing new lines in configs-galaxy.yaml is not needed as well.

yegortokmakov commented 1 year ago

hi @ksuderman , what do you think about my comment above? I also pushed some new changes after testing it again

ksuderman commented 1 year ago

Hi @yegortokmakov I was out last week so I am just seeing this now. Your changes look good, and I agree that Yaml without a proper suffix is not worth worring about. When I said I was wondering why the XML was handled differently I was referring to XML like this. If I modify the helm chart to include your XML directly it works as expected. If I put the XML in a separate values file and include it with:

helm install galaxy galaxy/galaxy --values auth_conf.yml

I see the incorrect output generated. I will discuss with the rest of the Kubernetes team and hopefully we can get this resolved ASAP for you.

yegortokmakov commented 1 year ago

thanks!

nuwang commented 1 year ago

We spent a bit of time discussing options and ultimately settled on this one. Let us know whether this works for you.

nuwang commented 1 year ago

Noting this for additional context if we ever need to dig up the underlying causes again: toYaml is used in our Helm templates because we need to convert the input object into a YAML-formatted string. Input objects can be complex yaml objects themselves (as in the case of galaxy.yml) or simple string objects. In either case, toYaml generally generates the correct serialized representation, until you run into the case above, which is a simple string object containing newline characters. In this case, toYaml Helm will treat it as a multi-line string and add a block chomping indicator, which we don't really want there.

We never ran into this problem because we either used a block scalar, or the simple string values we used did not contain new lines.

yegortokmakov commented 1 year ago

Sorry for the slow response. Perfect, even better solution! 👍

I've run the tests I had before and the patch works as expected:

 $ diff before_change.txt after_change.txt                                    
15,24c15,23
<     -
<       <?xml version="1.0" encoding="UTF-8"?>
<       <auth>
<          <authenticator>
<             <type>localdb</type>
<             <options>
<                <allow-password-change>true</allow-password-change>
<             </options>
<          </authenticator>
<       </auth>
---
>     <?xml version="1.0" encoding="UTF-8"?>
>     <auth>
>        <authenticator>
>           <type>localdb</type>
>           <options>
>              <allow-password-change>true</allow-password-change>
>           </options>
>        </authenticator>
>     </auth>
48,51c47,51
<       <containers_resolvers>
<         <explicit />
<         <mulled />
<       </containers_resolvers>
---
>     <containers_resolvers>
>       <explicit />
>       <mulled />
>     </containers_resolvers>
>     
222a223
>     
264,282c265,284
<       toolshed.g2.bx.psu.edu/repos/bgruening/diff/diff/3.7+galaxy0
<       toolshed.g2.bx.psu.edu/repos/bgruening/pharmcat/pharmcat/1.3.1+galaxy0
<       toolshed.g2.bx.psu.edu/repos/crs4/taxonomy_krona_chart/taxonomy_krona_chart/2.7.1+galaxy0
<       toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.72+galaxy1
<       toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.73+galaxy0
<       toolshed.g2.bx.psu.edu/repos/iuc/dexseq/dexseq/1.28.1+galaxy2
<       toolshed.g2.bx.psu.edu/repos/iuc/fastp/fastp/0.20.1+galaxy0
<       toolshed.g2.bx.psu.edu/repos/iuc/jbrowse/jbrowse_to_standalone/1.16.11+galaxy0
<       toolshed.g2.bx.psu.edu/repos/iuc/jbrowse/jbrowse_to_standalone/1.16.11+galaxy1
<       toolshed.g2.bx.psu.edu/repos/iuc/jbrowse/jbrowse/1.16.11+galaxy0
<       toolshed.g2.bx.psu.edu/repos/iuc/jbrowse/jbrowse/1.16.11+galaxy1
<       toolshed.g2.bx.psu.edu/repos/iuc/macs2/macs2_callpeak/2.1.1.20160309.6
<       toolshed.g2.bx.psu.edu/repos/iuc/meme_meme/meme_meme/5.0.5.0
<       toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.11+galaxy0
<       toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.9+galaxy1
<       toolshed.g2.bx.psu.edu/repos/iuc/prestor_abseq3/prestor_abseq3/0.6.2+galaxy0
<       toolshed.g2.bx.psu.edu/repos/iuc/quast/quast/5.0.2+galaxy3
<       toolshed.g2.bx.psu.edu/repos/iuc/seurat/seurat/4.1.0+galaxy0
<       toolshed.g2.bx.psu.edu/repos/iuc/snpeff/snpEff/4.3+T.galaxy1
---
>     toolshed.g2.bx.psu.edu/repos/bgruening/diff/diff/3.7+galaxy0
>     toolshed.g2.bx.psu.edu/repos/bgruening/pharmcat/pharmcat/1.3.1+galaxy0
>     toolshed.g2.bx.psu.edu/repos/crs4/taxonomy_krona_chart/taxonomy_krona_chart/2.7.1+galaxy0
>     toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.72+galaxy1
>     toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.73+galaxy0
>     toolshed.g2.bx.psu.edu/repos/iuc/dexseq/dexseq/1.28.1+galaxy2
>     toolshed.g2.bx.psu.edu/repos/iuc/fastp/fastp/0.20.1+galaxy0
>     toolshed.g2.bx.psu.edu/repos/iuc/jbrowse/jbrowse_to_standalone/1.16.11+galaxy0
>     toolshed.g2.bx.psu.edu/repos/iuc/jbrowse/jbrowse_to_standalone/1.16.11+galaxy1
>     toolshed.g2.bx.psu.edu/repos/iuc/jbrowse/jbrowse/1.16.11+galaxy0
>     toolshed.g2.bx.psu.edu/repos/iuc/jbrowse/jbrowse/1.16.11+galaxy1
>     toolshed.g2.bx.psu.edu/repos/iuc/macs2/macs2_callpeak/2.1.1.20160309.6
>     toolshed.g2.bx.psu.edu/repos/iuc/meme_meme/meme_meme/5.0.5.0
>     toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.11+galaxy0
>     toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.9+galaxy1
>     toolshed.g2.bx.psu.edu/repos/iuc/prestor_abseq3/prestor_abseq3/0.6.2+galaxy0
>     toolshed.g2.bx.psu.edu/repos/iuc/quast/quast/5.0.2+galaxy3
>     toolshed.g2.bx.psu.edu/repos/iuc/seurat/seurat/4.1.0+galaxy0
>     toolshed.g2.bx.psu.edu/repos/iuc/snpeff/snpEff/4.3+T.galaxy1
>     
429a432
>     
431,437c434,440
<       <?xml version="1.0"?>
<       <workflow_schedulers default="core">
<         <core id="core" />
<         <handlers default="schedulers">
<             <handler id="workflow_scheduler0" tags="schedulers"/>
<         </handlers>
<       </workflow_schedulers>
---
>     <?xml version="1.0"?>
>     <workflow_schedulers default="core">
>       <core id="core" />
>       <handlers default="schedulers">
>           <handler id="workflow_scheduler0" tags="schedulers"/>
>       </handlers>
>     </workflow_schedulers>
nuwang commented 1 year ago

Great! Thanks very much for initiating this fix!

yegortokmakov commented 1 year ago

thank you @nuwang and @ksuderman for deep diving on this!