mrbaseman / parse_yaml

a simple yaml parser implemented in bash
GNU General Public License v3.0

Convert parsed yaml file outputs to Azure pipeline variables #21

Closed by vivuu1989 7 months ago

vivuu1989 commented 8 months ago

We tried to use this script to parse our Azure policy yaml inputs, and we need to process the parsed output further in our Azure pipeline so that the creation of the various policy files depends on the generated variables. Further pipeline tasks need many conditions in order to create a custom policy file based on the inputs given (policy name, type, session etc.). But with the script alone we are not finding a way to apply conditions dynamically based on the generated variables.

So in brief, our issue is how we can apply conditions to these generated outputs (there might be more than one policy per app, and the conditions need to be applied dynamically), or how we can turn all of the generated outputs into pipeline variables so we can use them in later tasks.

################ Policy ################

- name: policyA
  session: 
     - inbound
     - backend
  scope: api
  apiname:
  customvalue1: xxxxx
  customvalue2: xxxxx

- name: policyB
  scope: operation
  operation:
      - operation1
      - operation2
  session: 
     - inbound
     - backend
  customvalue3: xxxxx
  customvalue4: xxxxx

etc.....................................

The pipeline file is:

stages:
- stage: check
  displayName: 'check'
  variables:
    subscription: 'xxxxxxxxxxxxxxx'
  pool:
      name: xxxxx
  jobs:
  - job: api
    displayName: 'api policy'
    variables:
    - group: check_api_policy
    workspace:
      clean: all
    pool:
      name: xxxxxxx
    steps:
    - bash: |      
         source $(System.DefaultWorkingDirectory)/scripts/parse_yaml.sh
         parse_yaml $(System.DefaultWorkingDirectory)/api-ops.yaml policy
         echo "*****************************************"
         for f in $policy_ ; do eval echo \$f \$${f} ; done
         echo "*****************************************"

         #######################################
          # convert all the above generated variables to azure pipeline variables
          # can we dynamically apply conditions based on the variables generated from above?
         #######################################
      name: Parse_yaml

Requirement: convert the generated variables to Azure pipeline variables and apply conditions on them dynamically.

mrbaseman commented 8 months ago

I'm not familiar with azure pipelines, so maybe I didn't understand your question, but I would try something like

stages:
- stage: check
  displayName: 'check'
  variables:
    subscription: 'xxxxxxxxxxxxxxx'
  pool:
      name: xxxxx
  jobs:
  - job: api
    displayName: 'api policy'
    variables:
    - group: check_api_policy
    workspace:
      clean: all
    pool:
      name: xxxxxxx
    steps:
    - bash: |      
         source $(System.DefaultWorkingDirectory)/scripts/parse_yaml.sh
         eval $(parse_yaml $(System.DefaultWorkingDirectory)/api-ops.yaml policy)
         echo "*****************************************"
         for f in $policy_ ; do eval echo \$f \$${f}_ ; done
         echo "*****************************************"

         for f in $policy_ ; do 
         if [ $(eval echo \$${f}_scope)  = "api" ] ; then 
         echo "processing api scope";
         # ...
         elif [ $(eval echo \$${f}_scope)  = "operation" ] ; then
         echo "processing operation scope";
         # ....
         else echo "unknown scope $(eval echo \$${f}_scope)"
         fi 
         done

I'm not sure if this helps you, but I think the point is to call eval $(parse_yaml ...) so that the variables are actually set in the shell, instead of only printed as output (maybe this is what you meant by "convert the variables to azure pipeline variables"?). With the variables set, you can "dynamically apply conditions based on the variables" using if or case constructs in bash and, for example, create different kinds of policy files.
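
If the goal is literally to expose the parsed values to later pipeline steps as Azure Pipelines variables, a minimal sketch might look like the following (this assumes the ##vso[task.setvariable] logging command and the variable names produced by parse_yaml above; treat it as an illustration, I haven't tested it against Azure):

for f in $policy_ ; do
    # read the parsed scope of this policy (e.g. the content of policy1_scope)
    scope=$(eval echo \$${f}_scope)
    # publish it under the same name as a pipeline variable for later steps
    echo "##vso[task.setvariable variable=${f}_scope]${scope}"
done

Later steps should then be able to reference it like any other pipeline variable, e.g. as $(policy1_scope).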

vivuu1989 commented 8 months ago

The above solution worked for me to get the variables from the key/value pairs in the yaml input, as in your example. But I am not finding a way to apply conditions to the list values inside each policy's session. For example, in policyA above I have two session values, "inbound" and "backend", and based on these two values I have to write conditions again for policyA, and similarly for each policy in the list, dynamically.

vivuu1989 commented 8 months ago

@mrbaseman Is there any way to access the variables inside each policy, for example to build a condition based on the session of each policy?

mrbaseman commented 8 months ago

On each level you have a list of keys which you can feed into a loop, checking the content of the variables. For the above example you get:

policy1="name: policyA"
policy1_session__1="inbound"
policy1_session__2="backend"
policy1_scope="api"
policy1_customvalue1="xxxxx"
policy1_customvalue2="xxxxx"
policy2="name: policyB"
policy2_scope="operation"
policy2_operation__1="operation1"
policy2_operation__2="operation2"
policy2_session__1="inbound"
policy2_session__2="backend"
policy2_customvalue3="xxxxx"
policy2_customvalue4="xxxxx"
policy2_session__=" policy2_session__1 policy2_session__2"
policy1_session__=" policy1_session__1 policy1_session__2"
policy2_operation__=" policy2_operation__1 policy2_operation__2"
policy1_=" policy1_session policy1_scope policy1_apiname policy1_customvalue1 policy1_customvalue2"
policy2_=" policy2_scope policy2_operation policy2_session policy2_customvalue3 policy2_customvalue4"
policy_=" policy1 policy2"

So, when you process policy1, you can look in $policy1_session__ to see which variables exist, and then check their values.
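
For example, something like this (a sketch that uses bash indirect expansion, ${!s}, instead of eval):

for s in $policy1_session__ ; do
    # each word in $policy1_session__ is itself a variable name
    # (policy1_session__1, policy1_session__2); ${!s} expands to its value
    echo "$s is set to ${!s}"
done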

You might want to try out dictionaries instead of lists, so that you know which key refers to which setting, i.e. something like session: { inbound: true, backend: true }. If you can change the structure of the yaml file, ask yourself how it should best be structured to make the processing easy. If the structure is predefined by other dependencies, ask which parts are always there, so that you can rely on them, which ones are variable, and which ones might be missing.
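
To illustrate (just a sketch; I assume here that nested mapping keys get joined with underscores in the variable names, in the same way as the list entries above): a policy written with a block-style dictionary such as

- name: policyA
  session:
    inbound: true
    backend: true

should end up as something like policy1_session_inbound="true", so inside the loop over $policy_ a check could look like

if [ "$(eval echo \$${f}_session_inbound)" = "true" ] ; then
    echo "$f handles inbound traffic"
fi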

Also, I notice that things get quite convoluted in your case. This project provides a simple yaml parser implemented as a shell function, which allows you to populate shell variables with the content of a yaml file. The subsequent processing of these variables is up to you. However, you call bash from within a yaml file and try to execute nontrivial steps during parsing, depending on the content of another yaml file. Here too I would recommend taking a step back and asking whether the processing of the yaml content happens in the right place, or whether you should only do the parsing here and postpone the processing to a later stage. And maybe calling eval on the parsed output is not the best choice in your use case; maybe it is better to pipe the output into another script that does the evaluation?

vivuu1989 commented 8 months ago

@mrbaseman, sorry if my question above was confusing; I am not that familiar with shell scripting. Your script is working for us to parse our input policy yaml file, but the challenge we face is that we are not able to dynamically loop over the policy properties, such as the session and operation values, to add conditions based on them. Is there any way to loop through each policy's properties and get their values so we can build conditions accordingly? In our scenario the policy list can have n entries, and in each policy the sessions and operations can also be lists. Our requirement is to create a custom policy file for each of these properties of each policy and apply it in its scope.

I tried the script below, but it failed:

for f in $policy_ ; do
    echo "value of f is $f"
    if [ $(eval echo \$${f}_name) = "ipfilter" ] ; then
        echo "the policy name is '$(eval echo \$${f}_name)'"
        for s in "$${f}_session" ; do
            echo "the policy sesion is '$(eval echo \$${s})'"
        done
    fi
done

mrbaseman commented 8 months ago

So if you have lists on several levels you need several nested loops over the list entries, something like:

for f in $policy_ ; do
    for g in $(eval echo \$${f}_session__) ; do
        if [ $(eval echo \$${g}) = "inbound" ] ; then
            echo "for $f session $g is inbound"
        elif [ $(eval echo \$${g}) = "backend" ] ; then
            echo "for $f session $g is backend"
        fi
    done
done

I guess this is still academic example code, but I hope we are approaching what you would like to achieve ;-)

vivuu1989 commented 8 months ago

@mrbaseman Thanks for your suggestion. Is it possible to escape special characters in a multi-line string used as the input value of a key? In my case the value of a key contains parentheses "()", and when I run the parsed output through "eval", it gives the error "-bash: syntax error near unexpected token '('".

mrbaseman commented 8 months ago

The keys of key/value pairs can't contain special characters. I think it wouldn't violate the yaml standard, but in this implementation I have assumed that the keys consist of alphanumeric characters. Since the keys become shell variables, this limitation makes sense in my opinion (I have added this limitation to the README.md file just now). I would recommend preprocessing yaml files that contain parentheses in the keys before parsing them with parse_yaml. I know this is not very elegant, but it's simply not possible to map all flavors of yaml to shell variables. The values can contain special characters, like parentheses for instance, and they should already be enclosed in quotes in the parsed output, so that no further escaping should be necessary. However, if you have a yaml file with parentheses in the values which causes a syntax error when the output of parse_yaml is treated with eval, then I would consider this a bug. If you can provide the yaml file, I will have a look and try to find a solution.

vivuu1989 commented 8 months ago

@mrbaseman Yes, I am getting the "eval" error when the input has values like the ones below:

- name: poicyA
  scope: api
  api1:
    session: 
     - inbound: <rate-limit-by-key calls="xx"     renewal-period="60"    counter-key="@(context.Request.Headers.GetValueOrDefault("Authorization","").AsJwt()?.Subject)" /> 
     - backend: <rate-limit-by-key calls="yyy"     renewal-period="60"    counter-key="@(context.Request.Headers.GetValueOrDefault("Authorization","").AsJwt()?.Subject)" />
     - outbound: <rate-limit-by-key calls="zz"     renewal-period="60"    counter-key="@(context.Request.Headers.GetValueOrDefault("Authorization","").AsJwt()?.Subject)" />

mrbaseman commented 8 months ago

Ok, so the parentheses are in the values. The keys are "inbound", "backend", and "outbound", and everything after the colon is the value. I don't get the syntax error, but I notice that the values are not parsed correctly. At least the ">" is stripped off when I parse the example file under Linux. I need a closer look at what is going wrong in that case.

mrbaseman commented 8 months ago

The original question was whether the double quotes (") can be escaped in the parsed output. That's a good point, but difficult to address in general. However, if we enclose the parsed values in single quotes, they become much easier to handle in the shell. This partly shifts the problem, because single quotes then have to be protected, but other special characters, $ for instance, can no longer have surprising effects. I have addressed this in commit eed90b239c370e6c910b123022925cc5f6cba0b9.

One problem with the above example is that > at the end of a line marks the beginning of block folding in yaml. So, the values should be enclosed in single quotes to avoid the special meaning of the > character.
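
For the inbound line from the example above, that could look something like this (just a sketch, with the counter-key shortened to "..."):

     - inbound: '<rate-limit-by-key calls="xx" renewal-period="60" counter-key="@(...)" />'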

I couldn't reproduce the syntax error "when doing the eval" - probably I'm just not doing exactly the same thing with eval as you do ;) But maybe it was a follow-on error that goes away with the latest commit, which improves the quoting.

vivuu1989 commented 8 months ago

@mrbaseman Great, that worked like a champ! And finally, is it possible to use an xml element structure (with multiple lines) as the value of a key, so that the parsed variable gets the value as the same xml structure without any additional characters? For example:

 name: poicyA
  scope: api
  api1:
    session: 
     - inbound: |
            <ip-filter action="allow | forbid">
                <address>address</address>
               <address-range from="address" to="address" />
            </ip-filter>
    - onerror: |
        <set-header name="ErrorSource" exists-action="override">
            <value>@(context.LastError.Source)</value>
        </set-header>
        <set-header name="ErrorReason" exists-action="override">
            <value>@(context.LastError.Reason)</value>
        </set-header>
        <set-header name="ErrorMessage" exists-action="override">
            <value>@(context.LastError.Message)</value>
        </set-header>
        <set-header name="ErrorScope" exists-action="override">
            <value>@(context.LastError.Scope)</value>
        </set-header>
        <set-header name="ErrorSection" exists-action="override">
            <value>@(context.LastError.Section)</value>
        </set-header>
        <set-header name="ErrorPath" exists-action="override">
            <value>@(context.LastError.Path)</value>
        </set-header>
        <set-header name="ErrorPolicyId" exists-action="override">
            <value>@(context.LastError.PolicyId)</value>
        </set-header>
        <set-header name="ErrorStatusCode" exists-action="override">
            <value>@(context.Response.StatusCode.ToString())</value>
        </set-header>

mrbaseman commented 8 months ago

I have copied and pasted this example into a file policyA.yml and experimented with different ways of handling the newlines. Unfortunately, even though the parsed output looked something like

name_name='poicyA'
name_scope='api'
name_api1_session_inbound='<ip-filter action="allow | forbid">
         <address>address</address>
        <address-range from="address" to="address" />
     </ip-filter>
'

when feeding this into the eval command, the newlines were gone. So, it's not exactly the literal input. However, if you use printf to process the content stored in the shell variable, I guess you get what you expect:

$ eval $(parse_yaml policyA.yml)
$ printf "$name_api1_session_inbound"
<ip-filter action="allow | forbid">
 <address>address</address>
 <address-range from="address" to="address" />
 </ip-filter>
$ printf "$name_api1_session_onerror"
<set-header name="ErrorSource" exists-action="override">
 <value>@(context.LastError.Source)</value>
 </set-header>
 <set-header name="ErrorReason" exists-action="override">
 <value>@(context.LastError.Reason)</value>
 </set-header>
 <set-header name="ErrorMessage" exists-action="override">
 <value>@(context.LastError.Message)</value>
 </set-header>
 <set-header name="ErrorScope" exists-action="override">
 <value>@(context.LastError.Scope)</value>
 </set-header>
 <set-header name="ErrorSection" exists-action="override">
 <value>@(context.LastError.Section)</value>
 </set-header>
 <set-header name="ErrorPath" exists-action="override">
 <value>@(context.LastError.Path)</value>
 </set-header>
 <set-header name="ErrorPolicyId" exists-action="override">
 <value>@(context.LastError.PolicyId)</value>
 </set-header>
 <set-header name="ErrorStatusCode" exists-action="override">
 <value>@(context.Response.StatusCode.ToString())</value>
 </set-header>
$ 

and if you assign this output to another variable as follows, you can use echo again:

$ A=$(printf "$name_api1_session_inbound")
$ echo "$A"
<ip-filter action="allow | forbid">
 <address>address</address>
 <address-range from="address" to="address" />
 </ip-filter>
$

Or you could postprocess the parsed output and replace all occurrences of \n with an actual newline character:

$ eval $(parse_yaml policyA.yml | sed 's#\\n#\n#g')

But, as I said, the newline characters get lost when this is handed over to eval. Then again, xml shouldn't care whether the whitespace between tags consists of newlines or spaces...

PS: I needed to append an empty line (or a line containing just a comment) to terminate the literal input (which starts with the | sign). I'm not sure if this is strictly required by the yaml standard, or if the end of file would be sufficient here.