Idea here - could there be a way to split this into several modules, so we can have:
GCP -> GCS
GCP -> S3
GCP -> Azure Block Storage (ABS)
AWS -> GCS
AWS -> S3
AWS -> ABS
Azure -> ...
Maybe the second module would be just a different "startup script loader", or something along those lines.
Ok, I get the main idea. I am not yet sure what is needed to build the second and third modules, i.e. to stream data from a GCE instance to S3 using a service account. What I will do is this: I will write the first module and see what can be generalized for the second module, then make those changes while working on the second. Further changes in LCF may also make it possible to have abstract methods that can be used by each module; I will give that some thought.
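To make the abstract-methods idea concrete, here is a minimal sketch of what the split could look like, assuming hypothetical class names (DiskStreamSource, GCSDestination, etc.) that do not exist in dftimewolf or LCF today:

```python
from abc import ABC, abstractmethod
from typing import Iterator


class DiskStreamSource(ABC):
  """Reads raw disk bytes from a source cloud (GCE, EC2, ...)."""

  @abstractmethod
  def stream_disk(self, disk_name: str) -> Iterator[bytes]:
    """Yields chunks of the disk identified by disk_name."""


class DiskStreamDestination(ABC):
  """Writes a disk stream to a destination store (GCS, S3, ABS, ...)."""

  @abstractmethod
  def write_stream(self, chunks: Iterator[bytes], object_name: str) -> None:
    """Uploads the chunks as a single object."""


class GCEDiskSource(DiskStreamSource):
  def stream_disk(self, disk_name: str) -> Iterator[bytes]:
    raise NotImplementedError('Would call into libcloudforensics (LCF).')


class GCSDestination(DiskStreamDestination):
  def write_stream(self, chunks: Iterator[bytes], object_name: str) -> None:
    raise NotImplementedError('Would call into libcloudforensics (LCF).')


def export_disk(source: DiskStreamSource, dest: DiskStreamDestination,
                disk_name: str, object_name: str) -> None:
  """Any source/destination pair (GCE->GCS, GCE->S3, ...) composes here."""
  dest.write_stream(source.stream_disk(disk_name), object_name)
```

With something like this, each combination above (GCP -> GCS, GCP -> S3, AWS -> GCS, ...) becomes a pairing of one source with one destination rather than a separate module.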
We can keep this issue open for all these combinations, but I will scope my PR to GCE -> GCS.
Yes, that sounds perfect. We can also create a new feature branch here, and do the smaller, iterative PRs on that one as we need to. When everything is ready, we do a big merge into main.
The GCP->GCS feature depends on https://github.com/google/cloud-forensics-utils/pull/412. The changes in dftimewolf will not be big and will fit in one PR, since most of the logic will be in LCF.
Update: the changes in libcloudforensics are done. I will add the missing feature to stream disks out of GCP in Q1/2022; this is WIP.
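As a rough illustration of what streaming a disk out of GCP could look like from inside the target instance (e.g. launched via a startup script), here is a sketch using the google-cloud-storage client; the device path and bucket name are placeholders, and the actual LCF implementation may differ:

```python
from google.cloud import storage


def stream_device_to_gcs(device_path: str, bucket_name: str,
                         object_name: str) -> None:
  """Uploads the raw contents of a block device to a GCS object.

  Assumes the instance's service account has write access to the bucket.
  """
  client = storage.Client()
  blob = client.bucket(bucket_name).blob(object_name)
  # google-cloud-storage uploads from a file object in chunks, so the whole
  # disk never has to fit in memory. chunk_size must be a multiple of 256 KiB.
  blob.chunk_size = 256 * 1024 * 1024
  with open(device_path, 'rb') as device:
    blob.upload_from_file(device)


if __name__ == '__main__':
  # Placeholder values for illustration only.
  stream_device_to_gcs('/dev/sdb', 'my-evidence-bucket', 'evidence/disk1.img')
```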
This will be needed when constraints/compute.trustedImageProjects is enforced. It will allow getting a disk image out of the project in case both organization policies constraints/compute.storageResourceUseRestrictions and constraints/compute.trustedImageProjects are enforced, and in case OsLogin is allowed only for organization users while the analyst is an external user without the roles/compute.osLoginExternalUser role.
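For completeness, a small sketch of how one could check whether those constraints are actually set on a project, using the Cloud Resource Manager v1 API (the project ID is a placeholder):

```python
from googleapiclient import discovery


def effective_org_policy(project_id: str, constraint: str) -> dict:
  """Fetches the effective organization policy for a constraint on a project."""
  crm = discovery.build('cloudresourcemanager', 'v1')
  return crm.projects().getEffectiveOrgPolicy(
      resource=f'projects/{project_id}',
      body={'constraint': constraint}).execute()


if __name__ == '__main__':
  for constraint in ('constraints/compute.storageResourceUseRestrictions',
                     'constraints/compute.trustedImageProjects'):
    # A response containing a listPolicy/booleanPolicy entry indicates the
    # constraint is set rather than at its default.
    print(effective_org_policy('my-project-id', constraint))
```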