eddelbuettel / digest

R package to create compact hash digests of R objects
https://eddelbuettel.github.io/digest

[Question] What methods are cross-platform #176

Closed dipterix closed 1 year ago

dipterix commented 2 years ago

Sorry if this is a dumb question. Of all the supported methods, is there a list of those that are cross-platform reproducible?

The conditions include:

For example, can I expect digest("aaa", serialize=FALSE, algo="xxhash64") to produce the EXACT same result on all OSes, endianness, and CPUs?

eddelbuettel commented 2 years ago

In general, there are two aspects here:

Look for example at the package unit tests which check against invariant test output. Here is one for md5:

## Standard RFC 1321 test vectors
md5Input <-
    c("",
      "a",
      "abc",
      "message digest",
      "abcdefghijklmnopqrstuvwxyz",
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789",
      paste("12345678901234567890123456789012345678901234567890123456789012",
            "345678901234567890", sep=""))
md5Output <-
    c("d41d8cd98f00b204e9800998ecf8427e",
      "0cc175b9c0f1b6a831c399e269772661",
      "900150983cd24fb0d6963f7d28e17f72",
      "f96b697d7cb7938d525a2f31aaf161d0",
      "c3fcd3d76192e4007dfb496cca67e13b",
      "d174ab98d277d9f5a5611c2c9f419d9f",
      "57edf4a22be3c955ac49da2e2107b67a")

for (i in seq(along.with=md5Input)) {
    md5 <- digest(md5Input[i], serialize=FALSE)
    expect_true(identical(md5, md5Output[i]))
    #cat(md5, "\n")
}

md5 <- getVDigest()
expect_identical(md5(md5Input, serialize = FALSE), md5Output)

We run this test on every platform on which the package is checked, and we have similar checks for the other methods.

As your question was very specifically about xxhash64 I would encourage you to see what its upstream repo has to say about the matter. We just use it here as one among a number of hashing functions.
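For anyone who wants to compare results across machines themselves, here is a minimal sketch, not something from the package. It records the xxhash64 digest of the string from the question together with the platform and byte order reported by R, so the output of different machines can be placed side by side. The helper name platform_digest is hypothetical, and no expected digest value is asserted here.

```r
library(digest)

# Hypothetical helper (not part of digest): compute one digest and attach
# the platform details needed to compare results across machines.
platform_digest <- function(x, algo = "xxhash64") {
    data.frame(
        input    = x,
        algo     = algo,
        digest   = digest(x, algo = algo, serialize = FALSE),
        platform = R.version$platform,   # e.g. the build triplet of this R
        endian   = .Platform$endian,     # "little" or "big"
        stringsAsFactors = FALSE
    )
}

res <- platform_digest("aaa")
print(res)
```

Running this on each machine of interest and comparing the digest column is a simple empirical check. It does not replace what the upstream xxhash documentation says about portability.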

dipterix commented 2 years ago

Thanks for answering my question.

Is the hash digest the same? In general it should be.

Is this the case for sha-256?

Thanks

eddelbuettel commented 2 years ago

Well, everything I said in my previous comment applies here too, so I do not understand what you are asking now.

Also, as you know, the package is open source and there are sha256 unit tests.
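As an illustration of what such a check can look like, here is a sketch that follows the pattern of the md5 test above. It is not a copy of the package's own sha256 test file. The two expected values are the well-known SHA-256 digests of the empty string and of "abc", and the variable names are chosen here only to mirror the md5 example.

```r
library(digest)

## Known SHA-256 answers for the empty string and "abc"
sha256Input <- c("", "abc")
sha256Output <-
    c("e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
      "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad")

## Hash the raw strings (serialize=FALSE), one input at a time
sha256 <- vapply(sha256Input, digest, character(1),
                 algo = "sha256", serialize = FALSE, USE.NAMES = FALSE)

## Stops with an error if any digest differs from the known answer
stopifnot(identical(sha256, sha256Output))
```

If this runs without error on a given platform, the sha256 results there match the published values for these two inputs.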