Closed UserNotFound closed 2 years ago
I left a small assortment of review comments, but the thing that jumped out at me the most is that it SEEMS like the only dates ever used are in UTC (or GMT/UTC+0), and they are always in the same format and/or the same length.
I echoed the below in this comment as well:
If that's the case, we don't need to employ any time comparisons; we can just use string sorting/string comparisons (I think the proper term is lexicographic ordering of dates in a fixed format) without any date conversions. Date conversions not only add more places things can break (parsing/comparisons/lots of logic) but also compute time (though it's fast). Skipping them also makes tests simpler and easier to read. Glancing through the unit tests for the S3 files and various data, I only saw one type/zone/length of date (perhaps I was mistaken). This only applies IF we use the same, consistent date format (which I think we do?).
If I'm wrong on the above guess about how the dates are used, then this comment is inaccurate and can be disregarded.
Edit: just left a comment clarifying that we could use string comparisons (I don't think we really use sorts here).
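For illustration, here is a minimal Ruby sketch (the timestamps are made up) of why identically formatted UTC timestamps can be compared as plain strings:

```ruby
# ISO 8601 UTC timestamps in a single fixed-width format sort the same
# lexicographically as they do chronologically, so no Time parsing is needed.
earlier = "2023-01-05T09:30:00Z"
later   = "2023-01-05T10:00:00Z"

puts earlier < later                          # plain string comparison
puts [later, earlier].sort == [earlier, later]
```

This holds only while every timestamp shares the same zone and format, which is exactly the condition discussed above.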
Addressing all feedback, plus adding validation that you provide a full 64 char container ID if you're using --container-id.
@UserNotFound : what's the plan for when the API does return encryption keys? At that point, we should update the CLI to support finding resources (a) by handle, not ID, and (b) without passing an encryption key. It is more ergonomic/easier and more consistent with the rest of the CLI.
I see why we also need the current semantics, but I do not think this should or will be the default use case.
Do you think we'd have a separate command, or just more argument options for this command?
@UserNotFound : another minor point: I was mildly surprised by the behavior of downloading nothing if --download-location is unspecified. (I was imagining it would download to a folder in the current directory OR /tmp.) This is very minor, and I think it's fine as is, especially given the clear output you provide:
No files were downloaded. Please provide a location with --download-location to download the files.
Another thing:
[--container-id=CONTAINER_ID] # The full (64 char) container ID to download logs for
How will customers have this? Don't we only send them the 12-character prefix?
Apart from these comments, this looks awesome overall! Very excited to have this. Great work on the implementation.
Thanks for the feedback @fancyremarker. Overall, this was written in a vacuum of hard requirements, with an eye towards demonstrating basic functionality and showing what is possible.
For example, there's no specific command to download the logs of an ephemeral session. When I started, there was no way for a customer to know the Operation ID of the session, which is needed to identify the files. Now that Operation IDs are shown in the UI, it would be possible to do so. Should that be included now, or in a future CLI update?
[--container-id=CONTAINER_ID] # The full (64 char) container ID to download logs for
How will customers have this? Don't we only send them the 12-character prefix?
Good question - I suppose it would make sense to match the user-provided value against the beginning of the container ID. The first 12 characters would be "close enough" to uniquely identify a container.
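A hypothetical sketch of that prefix matching (the method name and IDs are illustrative, not the actual CLI code):

```ruby
# Match a user-supplied prefix (e.g. the 12-char short ID) against full
# 64-char container IDs, refusing unknown or ambiguous prefixes.
def find_container_id(prefix, full_ids)
  matches = full_ids.select { |id| id.start_with?(prefix) }
  raise "No container matches #{prefix}" if matches.empty?
  raise "Ambiguous container ID prefix #{prefix}" if matches.size > 1
  matches.first
end

ids = ["a" * 64, "b" * 64]
puts find_container_id("a" * 12, ids)
```

Raising on ambiguity matters here: a short prefix could in principle match multiple containers, and silently picking one would be the wrong behavior for a compliance tool.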
@UserNotFound : another minor point: I was mildly surprised by the behavior of downloading nothing if --download-location is unspecified. (I was imagining it would download to a folder in the current directory OR /tmp.) This is very minor, and I think it's fine as is, especially given the clear output you provide:
No files were downloaded. Please provide a location with --download-location to download the files.
This was simply an expression of my opinion that we can't assume any specific location is safe for PHI or other secrets, and it prevents a footgun where the user is not aware of where they are downloaded to. I could go either way on it.
@UserNotFound : what's the plan for when the API does return encryption keys? At that point, we should update the CLI to support finding resources (a) by handle, not ID, and (b) without passing an encryption key. It is more ergonomic/easier and more consistent with the rest of the CLI.
I see why we also need the current semantics, but I do not think this should or will be the default use case.
Do you think we'd have a separate command, or just more argument options for this command?
My thoughts were to branch the logic if you have a valid token:
- --stack remains required
- --encryption-keys, --region, and --bucket options become optional, defaulting to the values retrieved from our API
- --app-handle and the like are un-gated, since they can do the API lookup to get the ID needed to retrieve from S3

Ultimately, though, this brings up an interesting concern around access. While searching by handle is "easier" and more in line with other CLI commands, "authorization" is more complex:
- read permission/role for the current user

"Enclave Owner" might be a good role, so the TL;DR of what an owner can access is "everything on aptible" and "everything your aws credentials have access to".
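A rough Ruby sketch of that token branching (option handling and names are hypothetical, not the real CLI implementation):

```ruby
# With a valid token, optional values fall back to API-provided defaults;
# without one, the user must supply everything explicitly.
OPTIONAL_WITH_TOKEN = %i[encryption_keys region bucket]

def resolve_options(opts, api_defaults: nil)
  raise ArgumentError, "--stack is required" unless opts[:stack]

  if api_defaults
    # Valid token: default missing options to the values from our API.
    OPTIONAL_WITH_TOKEN.each { |key| opts[key] ||= api_defaults[key] }
  else
    # No token: everything must be provided on the command line.
    missing = OPTIONAL_WITH_TOKEN.reject { |key| opts[key] }
    raise ArgumentError, "Missing options: #{missing.join(', ')}" if missing.any?
  end
  opts
end

resolved = resolve_options({ stack: "my-stack" },
                           api_defaults: { encryption_keys: ["key1"],
                                           region: "us-east-1",
                                           bucket: "logs" })
puts resolved[:region]
```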
This was simply an expression of my opinion that we can't assume any specific location is safe for PHI or other secrets, and it prevents a footgun where the user is not aware of where they are downloaded to. I could go either way on it.
I think your rationale (forcing opt-in to downloading possible PHI locally) makes sense, and I'm happy to leave as is. The warning in this case is very clear.
"Enclave Owner" might be a good role, so the TL;DR of what an owner can access is "everything on aptible" and "everything your aws credentials have access to".
Yes, this is what I had in mind as well. As discussed, that does not need to happen in the scope of this PR. This appears complete for the current stage of release.
The README needs to be regenerated.
Download logs from S3 by providing an app, database, proxy, or container ID. Since S3 Archive is meant as a compliance feature, this tool for downloading logs differs from the UX of other commands:
Here's a function we can use for pulling logs ourselves, and it also facilitates easier testing.
Now, use the string match option to list all files matching the v3 log file name specification:
stack-logs --string-match v3
From there, you should have an idea of how to use the other options to find exactly what you're looking for.
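For intuition, the kind of substring filter a string-match option implies could be sketched like this (the file keys below are invented for the example):

```ruby
# Keep only file keys containing the given substring, e.g. "v3" to select
# files following the v3 log file name specification.
files = [
  "v2/apps-0000/2023-01-05.log.bck",
  "v3/apps-0000/2023-01-05.log.bck",
  "v3/apps-0000/2023-01-06.log.bck",
]
puts files.select { |name| name.include?("v3") }
```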
My thoughts:
- pancake stack:logs, so that we dogfood it for customers