pbiecek / archivist

A set of tools for datasets and plots archiving
http://pbiecek.github.io/archivist/
74 stars 9 forks source link

Use archivist.hash.function to select desired hash algorithm #321

Closed zzawadz closed 6 years ago

zzawadz commented 6 years ago

I was wondering how to add the possibility to select other hashing algorithm without affecting the package structure.

In this pull request there's my proposition - just create simple wrapper on the digest, and use the environmental variable archivist.hash.function to specify desired algorithm.

adigest <- function(x) digest(x, algo = getOption("archivist.hash.function", default = "md5"))

What do you think about this solution?

pbiecek commented 6 years ago

Thanks, very interesting and elegant solution.

Is the archivist.hash.function part of the .ArchivistEnv environment? This is the place where we keep archivist options/defaults. (see https://github.com/pbiecek/archivist/blob/master/R/zzz.R)

Also it would be nice to have somewhere an example of how to set other hashing function. Maybe as an example to the archivist::aoptions()?

zzawadz commented 6 years ago

Is the archivist.hash.function part of the .ArchivistEnv environment?

Not now, but I'll move it there.

Also it would be nice to have somewhere an example of how to set other hashing function. Maybe as an example to the archivist::aoptions()?

Definitely:)

pbiecek commented 6 years ago

Great, I've heard that some people are complaining about md5 so this will help a lot. I just wonder if the hash function shall be stored in the metadata for an artifact.

That being said, right now the hash is used as a primary key in the archivist meta-data database. The length of hash for "sha512" is 4x longer than for "md5" (128 chars vs 32 chars) I need to check if this will work smoothly with SQLite and PostgreSQL backends

zzawadz commented 6 years ago

I added one simple test and moved the option to archivist aoptions. I'll try to find some time to explore the package and write more tests for different hasing functions.

pbiecek commented 6 years ago

Thanks, very valuable commit. Fixes #323