Closed eddelbuettel closed 1 year ago
Hi @eddelbuettel, this looks all good to me.
Here are some extra checks with the python package awscrt
.
checksum <- reticulate::import("awscrt.checksums", convert = F)
builtins <- reticulate::import("builtins")
crc32c <- function(string) {
bs <- checksum$crc32c(string)
# convert hex is R string
hex_bs <- builtins$hex(bs) |> as.character()
out <- substr(hex_bs, 3, nchar(hex_bs))
# ensure 8 character long
return(paste0(paste(rep("0", 8-nchar(out)), collapse = ""), out, collapse = ""))
}
strings <- lapply(1:1000, function(x) {
paste(sample(c(letters,LETTERS, "//", "[", "(", "{", "?", "^", "$", " ", "="), size = 20, replace = T), collapse="")
})
result <- vapply(strings, function(string){
identical(crc32c(string), digest::digest(string, "crc32c", serialize = F))
}, FUN.VALUE = logical(1))
all(result)
#> [1] TRUE
Created on 2023-04-30 with reprex v2.0.2
I just added support for vectorized access to crc32c (and blake3) too. This should be ready.
And I think I will keep a mental note of maybe packaging all of the crc32c package for CRAN and then call that from the digest
package (as a Suggests: as I would want to keep the 'no dependencies' rule). That way we would get the already-established platform tests and switches to hardware acceleration where available.
This PR adds support for crc32c by bringing in a cpu-only, non-hardware accelerated version from Google's repo.
@Dyfanjones: I cannot formally add you for a code review but maybe you can "informally" bless this give that you already ran the branch (from one commit ago)....
(The one 'fail' flagged under CI came from codecov where we have defaults 'for acceptable change' that are too stringent. Codecov should really be quiet on PRs so that is on me for keeping the repo setup in 'sane' order.)