j3t / jenkins-pipeline-cache-plugin

A cloud native file cache for Jenkins build pipelines which uses an S3-Bucket as storage provider.
MIT License
15 stars 2 forks source link

hashFiles silently generating empty hash #36

Open rbtcollins opened 1 year ago

rbtcollins commented 1 year ago

I'm not sure whats going on yet, but I had inappropriate cache hits, and tracked it down to hashFiles giving the same signature when file content had changed.

Concretely, I write a number of strings to a file:

String hashStrings(List strings) {
    // need a java extension perhaps - the sandbox is blocking the
    // obvious .digest('SHA-1') method.
    // Manual salt to allow rekeying if needed
    input = ["serial-1"] + strings
    temp_path = pwd(tmp: true)
    hash_path = "${temp_path}/cargo-cache-key-vars"
    writeFile(file: hash_path, text: input.join('\n'))
    echo "hashing ${input.join('\n')}"
    hash = sh(script: "md5sum < ${hash_path}", returnStdout: true).split(' ')[0]
    hash_files_hash = hashFiles(hash_path)
    echo "${hash} ${hash_files_hash}"
    return hash
}

Which echos e973e42cfa03aa8c5a826ecffb41431a d41d8cd98f00b204e9800998ecf8427e

Now that hashFiles hash is a well known hash - a completely empty hash:

echo -n | md5sum
d41d8cd98f00b204e9800998ecf8427e

This code Seems fine IFF it is choosing the Path override.

So I'm not sure. Perhaps something to do with the k8s plugin :/.

j3t commented 1 year ago

What is the specific problem? Is there something I can do? Otherwise I would like close this ticket.

rbtcollins commented 1 year ago

hashFiles(path_to_existing_file) should either:

I'm confused as to why its not returning the has of that one file, given I'm using hashFiles happily elsewhere. Possibly something to do with the pwd(tmp:true) use?

If you want to close it - sure. My workaround is to not use hashFiles :- but others in future may be equally confused.

j3t commented 1 year ago

Keeping this in mind I would argue it is fine like it is, but I agree that this could be confusing.

Proposal:

rbtcollins commented 1 year ago

I agree that it should be possible to hash an empty set (e.g. we use that feature to hash both rust-toolchain and rust-toolchain.toml as two separate patterns. I think a warning would be great, since it would make it visible.

j3t commented 1 year ago

I thought about it and I guess it is more clear when hashFiles returns an empty (just an empty string, no md5 hash) in case no files were found or the content is empty.