nelsonjchen / gargantuan-takeout-rocket

🚀 Backup Google Takeout archives (YouTube channel and Google Photos) at 1GB/s+ to Azure Storage periodically with minimal human toil and financial cost
140 stars 5 forks

Encryption #3

Open gunar opened 2 years ago

gunar commented 2 years ago

My manual process used to include piping curl into gpg --encrypt before piping it to aws s3 cp -. Do you think it would be interesting and feasible to implement something similar for GTR? Given that the data flows through a Cloudflare Worker, perhaps we could perform the encryption inside the worker.

What do you think?
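For reference, the curl-to-gpg-to-upload pipeline described above can be sketched in Python as a streaming filter. This is only a minimal sketch: `stream_through` and `CHUNK_SIZE` are hypothetical names, the recipient address is a placeholder, and it assumes gpg is installed with the recipient's public key already imported. Nothing here is part of GTR.

```python
import shutil
import subprocess
import threading

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary choice


def stream_through(cmd, source, sink, chunk_size=CHUNK_SIZE):
    """Feed `source` to `cmd` on stdin and copy its stdout to `sink`, chunk by chunk."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def feed():
        # Runs in a thread so a full stdout pipe cannot deadlock the writer.
        try:
            shutil.copyfileobj(source, proc.stdin, chunk_size)
        finally:
            proc.stdin.close()

    feeder = threading.Thread(target=feed)
    feeder.start()
    with proc.stdout:
        shutil.copyfileobj(proc.stdout, sink, chunk_size)
    feeder.join()
    returncode = proc.wait()
    if returncode != 0:
        raise RuntimeError(f"{cmd[0]} exited with status {returncode}")


# Hypothetical recipient; gpg reads plaintext on stdin and writes
# ciphertext to stdout when no input file is given.
GPG_CMD = ["gpg", "--encrypt", "--recipient", "you@example.com"]
```

Calling `stream_through(GPG_CMD, download_stream, upload_stream)` would encrypt with the public key only, so the storage provider never sees a decryption key. The whole stream never sits in memory, but every byte does pass through the CPU of whatever machine runs it.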

nelsonjchen commented 2 years ago

Too much CPU. The Cloudflare Worker is just shuffling bytes between sockets, which is almost CPU-less. Bandwidth is cheap for Cloudflare, but bulk encryption/decryption beyond the transparent TLS/SSL stuff is out of the question.

In Azure, I think the best that can be done is using customer-provided keys or the target storage's own encryption. Unfortunately, you'll need to give the cloud provider a key, which they'll hold for the duration of the encryption/decryption process, and the cloud provider will presumably destroy their copy of it when the operation is done.

Also, I've never tried this with multi-part blobs or whatever, but I would assume they work too.

nelsonjchen commented 2 years ago

This is a better link:

https://docs.microsoft.com/en-us/azure/storage/blobs/encryption-customer-provided-keys

It's symmetric encryption though.

"The key is securely discarded as soon as the encryption or decryption process is complete."
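As a rough illustration of what "giving the provider a key" looks like, here is a minimal sketch of the request headers that the linked customer-provided-keys documentation describes, as far as I understand it. The function name `customer_provided_key_headers` is hypothetical, and the sketch only builds the headers; it does not sign or send a request.

```python
import base64
import hashlib
import os


def customer_provided_key_headers(key: bytes) -> dict:
    """Build the per-request headers for an AES-256 customer-provided key."""
    if len(key) != 32:
        raise ValueError("AES-256 needs a 32-byte key")
    return {
        # The raw key itself travels with every request, Base64-encoded.
        "x-ms-encryption-key": base64.b64encode(key).decode("ascii"),
        # Base64-encoded SHA-256 digest of the raw key bytes.
        "x-ms-encryption-key-sha256": base64.b64encode(
            hashlib.sha256(key).digest()
        ).decode("ascii"),
        "x-ms-encryption-algorithm": "AES256",
    }


key = os.urandom(32)  # a throwaway key for illustration; keep a real one safe
headers = customer_provided_key_headers(key)
```

The key is sent in full on each request, which is why this scheme still rests on the provider discarding it afterwards.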

gunar commented 2 years ago

Yeah, thanks, but symmetric is not worth the effort. They'd have the key anyway.

nelsonjchen commented 2 years ago

They pinky promise to not store the key for longer than necessary. Once the post completes, my guess is they wipe the key from all memory. I would imagine if they kept the key for any longer than necessary, that might break some certifications or validations of some sort.

It's still a "promise" though.

nelsonjchen commented 2 years ago

I think I'll try to look into this and see what can be done. The implementation should still deter most adversaries.

m1ndy commented 1 year ago

+1. I would be a user who trusts their pinky promise enough to use this symmetric encryption feature.

What I would really like, though, is to use rclone's crypt functionality to encrypt the files. Here's a discussion of that being done on a cheap GCP VM at over 6 GiB/s (48 Gbps): https://forum.rclone.org/t/best-way-to-maximize-100gbps-gcp-gvnic/32832/12

nelsonjchen commented 1 year ago

Couldn't look at it this cycle. Fighting fire in #8. Hopefully next cycle 😄

nelsonjchen commented 11 months ago

I haven't had time to pay attention to this. I'll take a PR if anyone wants to write up post-transfer instructions for encrypting the archives inside Azure's garden after they've landed as "plain text" and then deleting the plain text afterwards, minimizing the window. Theoretically, it could be done with rclone on an Azure box, doing roughly the same thing as on GCP. I've heard Azure's networking is not as legendary as GCP's fabric, but I'm sure you could get comparable speeds.
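The ordering that matters for such instructions can be sketched as follows: write the encrypted copy first, confirm it exists, and only then delete the plain text. This is a toy sketch under loud assumptions: `encrypt_then_delete` is a hypothetical helper, a plain dict stands in for the blob container, and `encrypt` is whatever function the caller supplies. Real archives are far too large to hold in memory, so a real run would stream them (for example with rclone) instead.

```python
from typing import Callable, List, MutableMapping


def encrypt_then_delete(
    store: MutableMapping[str, bytes],
    encrypt: Callable[[bytes], bytes],
    suffix: str = ".enc",
) -> List[str]:
    """Replace each plain-text object with an encrypted copy, one at a time."""
    done = []
    # Snapshot the names first, because the store is modified while looping.
    plain_names = sorted(name for name in store if not name.endswith(suffix))
    for name in plain_names:
        target = name + suffix
        store[target] = encrypt(store[name])
        # Confirm the encrypted copy landed before removing the original.
        if target not in store:
            raise RuntimeError(f"encrypted copy of {name} is missing")
        del store[name]  # plain text is gone as soon as its copy is confirmed
        done.append(target)
    return done
```

Processing one object at a time keeps the plain-text window per archive short, rather than waiting for the whole batch to finish before deleting anything.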