starlinglab / integrity-v2

Monorepo for the next iteration of Starling Lab's integrity pipeline.
MIT License

Encrypted file workflow #22

Closed by makew0rld 3 months ago

makew0rld commented 3 months ago

@benhylau

  1. A file is uploaded to the webhook, and stored unencrypted on the disk. AA is populated with its metadata.
  2. (Optional) The user runs genkey file and an encryption key is stored in the key folder with the name <cid>.key
  3. The user runs encrypt bafy1..., encrypting the file with a new random key. The encrypted file is stored under its CID (bafy2...) in the encrypted files folder.
  4. The key is stored under bafy2....key.
  5. AA now has a relationship stored between the unencrypted and encrypted assets. The encrypted asset has additional metadata like encryption_type: secretstream.
  6. The user runs upload drive:/foo bafy2.... The encrypted file is uploaded to Google Drive. That upload is logged to AA under the encrypted file's CID, under the attribute uploads.
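
The bookkeeping in the steps above can be sketched roughly as follows. This is only an illustration using a plain in-memory dict, not the real AA API: the helper names (`record_encryption`, `record_upload`, `key_filename`) and the `children` attribute are hypothetical, while `encryption_type`, `uploads`, and the `<cid>.key` naming come from the steps themselves.

```python
def key_filename(cid):
    """Name of the key file for a CID, following the <cid>.key convention."""
    return f"{cid}.key"


def record_encryption(aa, plain_cid, encrypted_cid):
    """Link the plaintext and encrypted assets and tag the encrypted one."""
    # Hypothetical "children" attribute on the plaintext asset.
    aa.setdefault(plain_cid, {}).setdefault("children", []).append(encrypted_cid)
    encrypted = aa.setdefault(encrypted_cid, {})
    encrypted.setdefault("parents", []).append(plain_cid)
    encrypted["encryption_type"] = "secretstream"


def record_upload(aa, cid, destination):
    """Log an upload under the given CID's "uploads" attribute."""
    aa.setdefault(cid, {}).setdefault("uploads", []).append(destination)


# Stand-in for AA, keyed by CID. The CIDs are placeholders, as in the steps.
aa = {}
record_encryption(aa, "bafy1...", "bafy2...")
record_upload(aa, "bafy2...", "drive:/foo")
```

The upload is logged under the encrypted file's CID only, so nothing about the plaintext asset's storage location needs to be recorded.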

makew0rld commented 3 months ago

Decryption

  1. The verifier downloads the encrypted file from Google Drive.
  2. They can assume the file is encrypted with secretstream, or use the attr CLI tool against our AA instance to verify the encryption type by looking at the encryption_type attribute.
  3. They must contact Starling Lab out of band to receive the key. The key may already have been sent to them, for example by including it in the private Google Drive folder.
  4. They run decrypt -i bafy2... -k bafy2....key -o my_clear_file and the decrypted file is now available.
  5. Due to secretstream, they can be sure the encrypted file bytes were not modified or truncated, and the decrypted output is the whole original file unmodified.
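
To illustrate why step 5 holds, here is a simplified sketch of the idea behind secretstream's integrity guarantee. It is not secretstream itself and does no encryption: it swaps in HMAC-SHA256 from the standard library to authenticate each chunk together with its position and a final-chunk tag. The function names and tag values are made up for the example.

```python
import hashlib
import hmac

TAG_MESSAGE = b"\x00"  # more chunks follow
TAG_FINAL = b"\x01"    # last chunk of the stream


def _mac(key, index, tag, chunk):
    """MAC over the chunk, its position in the stream, and its tag."""
    return hmac.new(key, index.to_bytes(8, "big") + tag + chunk, hashlib.sha256).digest()


def seal_chunks(key, chunks):
    """Attach a MAC to each chunk; only the last chunk gets TAG_FINAL."""
    sealed = []
    for i, chunk in enumerate(chunks):
        tag = TAG_FINAL if i == len(chunks) - 1 else TAG_MESSAGE
        sealed.append((tag, chunk, _mac(key, i, tag, chunk)))
    return sealed


def open_chunks(key, sealed):
    """Verify every chunk in order and require the stream to end with TAG_FINAL."""
    out = []
    saw_final = False
    for i, (tag, chunk, mac) in enumerate(sealed):
        if saw_final:
            raise ValueError("data after final chunk")
        if not hmac.compare_digest(mac, _mac(key, i, tag, chunk)):
            raise ValueError(f"chunk {i} failed authentication")
        out.append(chunk)
        saw_final = tag == TAG_FINAL
    if not saw_final:
        raise ValueError("stream truncated: no final chunk")
    return b"".join(out)
```

Dropping the last chunk leaves the stream without a final tag, and changing or reordering chunks breaks a MAC, so in each case `open_chunks` raises instead of returning partial or altered data.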

benhylau commented 3 months ago

Do they get a final step to verify against bafy1...?

makew0rld commented 3 months ago

Good point. I was assuming they wouldn't have access to bafy1... as it is secret. But even without access, they can perform a final verification step by creating a CID for their decrypted file, and comparing that against the CID of the original plaintext file, bafy1.... This CID will be available to them in AA, for example listed as a parent relationship of bafy2....

Due to the secretstream encryption algorithm, there should be no scenario where the decrypted file does not match the original. However, checking the hash is still important, in case Starling Lab accidentally sent the wrong encrypted file, or the verifier accidentally decrypted the wrong one.
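
As a rough sketch of that final check, assuming the CID is a CIDv1 with the raw codec and a sha2-256 hash over the whole file: the pipeline's actual CIDs may be computed differently (for example with a different codec or chunking), in which case the verifier would need to use the same tooling that produced bafy1.... The function names here are hypothetical.

```python
import base64
import hashlib


def raw_cid_v1(data: bytes) -> str:
    """CIDv1 string for raw bytes hashed with sha2-256, in base32 multibase."""
    digest = hashlib.sha256(data).digest()
    # 0x01 = CID version 1, 0x55 = raw codec, 0x12 = sha2-256, 0x20 = 32-byte digest
    cid_bytes = bytes([0x01, 0x55, 0x12, 0x20]) + digest
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded  # "b" is the multibase prefix for lowercase base32


def matches_original(decrypted: bytes, expected_cid: str) -> bool:
    """Compare the CID of the decrypted bytes with the plaintext CID from AA."""
    return raw_cid_v1(decrypted) == expected_cid
```

A mismatch here would not indicate a failure of the encryption, only that the wrong file or key was involved somewhere along the way.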