puppetlabs / cat-github-actions

1 stars 5 forks source link

CAT-1918 : Adding check for version from metadata.json #98

Closed malikparvez closed 3 months ago

malikparvez commented 3 months ago

Summary

This PR introduces a new check to ensure that only versions containing numerical values are considered for inclusion in the release process. The implementation involves retrieving the version information from the metadata.json file and validating it to contain only numeric characters. This enhancement aims to improve the reliability and consistency of the release workflow by preventing the inclusion of invalid or malformed version identifiers. Also regex is considered from this reference metadata-json-lint

Additional Context

Add any additional context about the problem here.

Related Issues (if any)

Mention any related issues or pull requests.

Checklist

malikparvez commented 3 months ago

I believe module version regex is defined https://github.com/voxpupuli/metadata-json-lint/blob/b5e68049c6be58aa63263357bb0dcad8635a6829/lib/metadata-json-lint/schema.rb#L141-L150

About above link yes we have already define versions for module but never know what might a malicious user will put in code...

Couple notes:

  1. Does ubuntu-latest have a ruby or python interpreter in its default env? Computing the version with shell code is probably just going to be harder than it needs to be assuming it is easy to access a more advanced scripting language. I think you should consider using something like that if possible.

I'm curious about how using a high-level language would help in this case. Even if we use a high-level language like Ruby or Python, we would still end up executing it as a shell command (e.g., ruby ./<path to ruby/python script>). Or do you have a different approach in mind?

BTW the link which you shared above (https://github.com/voxpupuli/metadata-json-lint/blob/b5e68049c6be58aa63263357bb0dcad8635a6829/lib/metadata-json-lint/schema.rb#L141-L150) to fine the version which is considering the prerelease as well as build but we haven't come across such pattern for our modules as of now, so not sure if those expressions are the perfect for our use-case.

  1. I'm fine with this being a separate step still, but i think it was only split out before for code reuse.

Hi @donoghuc, Thanks for your review. Wanted to explore your point 1 in details so have put my reply inline. If you can review and help me on the same to take it forward?

malikparvez commented 3 months ago

Also @donoghuc lets not over complicate the simple action most of the github actions are already used in such a way only, same thing is being done in gem release also gem_release_prep

donoghuc commented 3 months ago

I believe module version regex is defined https://github.com/voxpupuli/metadata-json-lint/blob/b5e68049c6be58aa63263357bb0dcad8635a6829/lib/metadata-json-lint/schema.rb#L141-L150 About above link yes we have already define versions for module but never know what might a malicious user will put in code...

I pointed that out to say you should enforce that regex. The one you have here is insufficient for some of those valid formats. What do you mean by "never know what might a malicious user will put in code..."? This PR should explicitly validating that the data in version metadata matches the defined schema to protect against putting shell commands into that field which would be injected in the next step of this workflow.

I recommended using ruby or python for cleaning this input so you could just copy the logic from the schema instead of using using jq and bash as I find those to be easier to read and maintain. I dont feel strongly about what you use to do the validation, just that the validation actually enforces the schema we expect.

Also @donoghuc lets not over complicate the simple action most of the github actions are already used in such a way only, same thing is being done in gem release also gem_release_prep

We need to do this validation according to the module spec, the gem spec is different

malikparvez commented 3 months ago

@donoghuc As per the suggestion used ruby to resolve version tough I don't like such long like another way would be to add a script in repo and call it. But I believe this should suffice

malikparvez commented 3 months ago

@donoghuc accomodated your suggestions feel free to review and merge...

donoghuc commented 3 months ago

@malikparvez before this is merged the commits should be squashed into one with a detailed write up of what the change is and where the validation schema comes from in the commit message.

malikparvez commented 3 months ago

@malikparvez before this is merged the commits should be squashed into one with a detailed write up of what the change is and where the validation schema comes from in the commit message.

Already had updated the PR description, let me rebase it...

donoghuc commented 3 months ago

We need the information in the commit message so that it is part of the git history. That is how we can track changes to a project over time. The PR description is not in the git history, it is only on the github application.

malikparvez commented 3 months ago

We need the information in the commit message so that it is part of the git history. That is how we can track changes to a project over time. The PR description is not in the git history, it is only on the github application.

Well I usually like to keep the commit message short and crisp, further reference can always be traced wrt PR...

donoghuc commented 3 months ago

Vague commit messages are not acceptable. Looking at the history of this project it has been very difficult to understand the changes and how we got to this place. We need to be better at describing behavior changes and organizing those into logical commits so that when we audit issues like this we have justification and descriptions for critical changes.

malikparvez commented 3 months ago

@donoghuc also added the details in the commit message, lets merge this one and thanks for your guidance on this from the beginning...