r-lib / sodium

R bindings to libsodium
https://docs.ropensci.org/sodium

Method to encrypt a single field #5

Open aetiologicCanada opened 7 years ago

aetiologicCanada commented 7 years ago

Often one needs to encrypt one or more fields of a tibble/data frame/data table with a function that returns the ciphertext to the appropriate row of the table, and then drop the plaintext, e.g.:

iris_confidential <- iris %>%
  mutate(encrypted_species = magic_function(Species, ...)) %>%
  select(-Species)

I have been unable to coax sodium into doing a single-column encryption; or more precisely, when I do, I get an object back that has more elements than the data frame has rows. One can't then push the resulting vector back into the data frame. Is there a way to coax magic_function() out of sodium?

See:

library(sodium)
key <- hash(charToRaw("This is a secret passphrase"))
msg <- serialize(iris$Species, NULL)
str(msg)
#>  raw [1:748] 58 0a 00 00 ...
nonce <- random(24)
cipher <- data_encrypt(msg, key, nonce)
str(cipher)
#>  atomic [1:764] 55 31 fa 90 ...
#>  - attr(*, "nonce")= raw [1:24] 03 8e 9b 37 ...

You can't stuff either cipher or msg back into the data frame because the row-numbers don't align.
jeroen commented 7 years ago

You could store them in a list column like this:

# Example data
key <- hash(charToRaw("This is a secret passphrase"))
mydata <- iris
# Encrypt each value separately so there is one ciphertext per row
mydata$cipher <- lapply(mydata$Species, function(x) {
  sodium::data_encrypt(serialize(as.character(x), NULL), key = key)
})

# Decrypt each list element back to a single string
vapply(mydata$cipher, function(cipher){
  unserialize(sodium::data_decrypt(cipher, key = key))
}, character(1))
jeroen commented 7 years ago

Alternatively you could convert each raw value to a string using e.g. sodium::bin2hex or jsonlite::base64_enc.
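A minimal sketch of the string approach, assuming each value is encrypted on its own and the 24-byte nonce is prepended to the ciphertext before hex encoding (otherwise the "nonce" attribute is lost in the conversion). The helper names encrypt_field() and decrypt_field() are hypothetical and not part of sodium:

```r
library(sodium)

key <- hash(charToRaw("This is a secret passphrase"))

# Hypothetical helper: encrypt one string and return hex of nonce + ciphertext
encrypt_field <- function(x, key) {
  nonce <- random(24)
  cipher <- data_encrypt(charToRaw(x), key, nonce)
  # c() drops the "nonce" attribute, so the nonce is stored as the first 24 bytes
  bin2hex(c(nonce, as.vector(cipher)))
}

# Hypothetical helper: split the nonce off again and decrypt
decrypt_field <- function(hex, key) {
  bin <- hex2bin(hex)
  nonce <- bin[1:24]
  cipher <- bin[-(1:24)]
  rawToChar(data_decrypt(cipher, key, nonce))
}

mydata <- iris
mydata$encrypted_species <- vapply(
  as.character(mydata$Species), encrypt_field, character(1),
  key = key, USE.NAMES = FALSE
)
mydata$Species <- NULL  # drop the plaintext

# Round trip: recover the original values from the character column
decrypted <- vapply(
  mydata$encrypted_species, decrypt_field, character(1),
  key = key, USE.NAMES = FALSE
)
```

The result is an ordinary character column with one element per row, so it fits in a data frame or a database table. Because every value gets a fresh random nonce, identical plaintexts produce different strings.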

aetiologicCanada commented 7 years ago

Thanks Jeroen,

I will have to ponder further, as the resulting field needs to be sortable for dplyr::group_by() and dplyr::arrange(), and usable as an SQL merge key. Thanks for the ideas.

On Tue, 6 Jun 2017 at 13:08 Jeroen Ooms notifications@github.com wrote:

Alternatively you could convert each raw value to a string using e.g. sodium::bin2hex or jsonlite::base64_enc.


jeroen commented 7 years ago

In that case you probably want to map the raw vectors to strings with sodium::bin2hex.
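One caveat for grouping and merging: data_encrypt() with a random nonce gives a different ciphertext every time, so equal values will not share a string. A possible workaround, which is an assumption here and not something proposed in this thread, is to keep a separate deterministic token built from a keyed hash (sodium::hash() with its key argument) mapped through bin2hex. The helper name token_field() is hypothetical. The token is one-way and reveals which rows hold equal values, so whether that is acceptable depends on the data:

```r
library(sodium)

key <- hash(charToRaw("This is a secret passphrase"))

# Hypothetical helper: deterministic keyed-hash token for one string.
# The same input and key always give the same 64-character hex string.
token_field <- function(x, key) {
  bin2hex(hash(charToRaw(x), key = key))
}

mydata <- iris
mydata$species_token <- vapply(
  as.character(mydata$Species), token_field, character(1),
  key = key, USE.NAMES = FALSE
)
mydata$Species <- NULL  # drop the plaintext

# Equal plaintexts share a token, so the column can be grouped, sorted, or joined on
counts <- table(mydata$species_token)
```

If the original values must be recoverable, the token column can sit next to an encrypted column rather than replace it.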