nienbo / cache-buildkite-plugin

Tarball, Rsync & S3 Cache Kit for Buildkite. Supports Linux, macOS and Windows
https://buildkite.com/plugins
MIT License
67 stars 39 forks source link

Using `{{ git.branch }}` cache-key with a `/` in the branch name #26

Closed Ben435 closed 3 years ago

Ben435 commented 3 years ago

Hi!

First off, love this plugin, thanks for all the hard work on it!

We've run into an issue recently using the new {{ git.branch }} cache key template, as some member's of my team use /'s in there git branch names.

Eg: fix/some-bug-we-found, feat/something-cool.

As using {{ git.branch }} means BUILDKITE_BRANCH is added directly into the CACHE_KEY, then via the S3 backend the ${CACHE_KEY}.tar is used as a mv target, we're getting errors like this in our builds:

mv: cannot move ‘/tmp/tmp.7G7Oz0l31L’ to ‘v1-cache-e3205e875e8a4ae8ed103242460f2bc0f41be2d8-fix/some-bug-we-found.tar’: No such file or directory

I imagine since its trying to copy into a non-existent sub-directory?

Example pipeline.yml that will produce this issue:

- gencer/cache#v2.4.6:
    backend: s3
    key: "v1-node-cache-{{ checksum 'yarn.lock' }}-{{ git.branch }}"
    s3:
        bucket: 'some-bucket-name'
    paths:
        - 'node_modules'

Happy to raise a PR to fix given some guidance on how we wanna fix this.

Thoughts?

Thanks!

gencer commented 3 years ago

Hi @Ben435!

Thanks for the kind words and happy to see this plugin works for you (except this issue :))

Two things came to my mind for this case:

  1. Drop anything except [a-zA-Z0-9] (or replace them with _)
  2. Replace only slashes /\ with _.

But I believe first one will be much better. In case any other special characters may be used in branch name. I'm not sure actually but I think I've also seen * and ' in branch names on some repositories (I might be wrong). That would also make cache fail.

Git imposes the following rules on how references are named:

They can include slash / for hierarchical (directory) grouping, but no slash-separated component can begin with a dot . or end with the sequence .lock.

They must contain at least one /. This enforces the presence of a category like heads/, tags/ etc. but the actual names are not restricted. If the --allow-onelevel option is used, this rule is waived.

They cannot have two consecutive dots .. anywhere.

They cannot have ASCII control characters (i.e. bytes whose values are lower than \040, or \177 DEL), space, tilde ~, caret ^, or colon : anywhere.

They cannot have question-mark ?, asterisk *, or open bracket [ anywhere. See the --refspec-pattern option below for an exception to this rule.

They cannot begin or end with a slash / or contain multiple consecutive slashes (see the --normalize option below for an exception to this rule)

They cannot end with a dot .

They cannot contain a sequence @{.

They cannot be the single character @.

They cannot contain a .

Basically, just replacing slashes with underscore will be more than enough, I believe.

PR welcomed for this.

gencer commented 3 years ago

Could you please test against new branch?:

- gencer/cache#fix-branch-name:
    backend: s3
    key: "v1-node-cache-{{ checksum 'yarn.lock' }}-{{ git.branch }}"
    s3:
        bucket: 'some-bucket-name'
    paths:
        - 'node_modules'

P.S.: Consider this PR as a concept. Please feel free to submit your PR, If you think there is a better idea. Then, I'll close my PR.

Ben435 commented 3 years ago

Based on this SO answer, which leads to this man page, looks like * isn't possible, but seems if you escape ' you could make a branch name with that.

In terms of file path interaction, I think / is the main one we could hit, as \ is disallowed, as is .. and a few other potentially problematic ones.

tldr: your PR looks like it'll fix it, I don't think the more explicit [A-z0-9] fix will be required.

gencer commented 3 years ago

Thanks for confirming!

v2.4.8 Released.