Next steps discussion - GitHub issues

eXtremeX commented 4 years ago

Hi everyone, we'd like to start a discussion regarding the next steps:

1. Handling JFrog's Permissions

Our guess is that we don't want these permissions as entities but more to just use them to build the relationships between other entities (e.g. users/groups and repositories).

One idea is that we do the following (permissions can be applied to groups as well):

User -> HAS -> AccessPolicy -> MANAGES -> Repository
Group -> HAS -> AccessPolicy -> MANAGES -> Repository

Repository seems not to have an owner (outside the fact that the main Account is owning everything), so these permissions can be used as the glue between these entities.

Also, JFrog's permissions have 5 fields (settings like read, annotate, deploy, delete, manage Xray). We've seen that AccessPolicy has a rules array and are wondering if we should use it for this?

It's also possible that JupiterOne's AccessPolicy is the wrong choice altogether, so that's a question as well: should we even use that class, or something else?

2. JupiterOne's AccessKey Class

Should we perhaps be using this to manage access tokens? We haven't used this class in the past, but we've noticed that it exists.

3. Handling JFrog's Artifacts

One idea is the following: Repository -> HAS -> Artifacts.

For artifact class, the closest one we've noticed seems to be the DataObject. Should we use that or is there a better choice?

4. Handling JFrog's Builds

One idea is the following: Build -> PRODUCES -> Artifacts.

For build class, maybe Task?

5. Any other high-value entities/relationships

And of course, if you can suggest some other high-value entity/relationships, please do!

erichs commented 4 years ago

Hi! Re: # 3, handling artifacts, DataObject might make sense as a generic class, IFF the artifact is not obviously an Image or a CodeModule.

erichs commented 4 years ago

Re: # 4, builds, a Process is probably the high-level class to associate with JFrog Builds.

erichs commented 4 years ago

I think Repositories, Builds and Artifacts would be a great start, and are likely to cover most of the initial security analyst use cases with this graph data.

aiwilliams commented 4 years ago

1. Handling JFrog's Permissions

> Our guess is that we don't want these permissions as entities but more to just use them to build the relationships between other entities (e.g. users/groups and repositories).

I've attempted to provide some guidance on IAM ingestion as a result of your question, which is one we just talked about extensively at the office in the context of Azure's permissions model. However, I think it needs improvements as I study the JFrog permissions model.

JFrog Identity and Access docs indicate that permissions are named (Permission Target Name) groupings of users, groups, resources, and permission statements (Read, Annotate, etc.). I think the Permission Target should generate two entities of class AccessPolicy, one with _key: "groups:<permission target id>", principalType: "group", the other with _key: "users:<permission target id>", principalType: "users". This is necessary because the permissions granted to the users are maintained apart from the permissions granted to the groups.

[Image: JFrog Permissions Graph]

The raw data of the AccessPolicy should include the complete policy data related to the principalType. This raw data would then be used in a later step, after resources are ingested, to build relationships between the policy and resources it grants access to.
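A minimal TypeScript sketch of the two-entity idea described above. The `PermissionTarget` shape, the `rawData` property name, the `_type` value, and the use of the target name as its id are all assumptions made for illustration; the real JFrog payload and the integration SDK types may differ. The `principalType` values are copied from the comment as written.

```typescript
// Hypothetical, simplified shape of a JFrog permission target.
interface PermissionTarget {
  name: string;
  // principal name -> granted permissions, e.g. { alice: ["read", "deploy"] }
  users: Record<string, string[]>;
  groups: Record<string, string[]>;
}

interface AccessPolicyEntity {
  _key: string;
  _type: string;
  _class: string[];
  name: string;
  principalType: string;
  // Policy data for this principal type, kept for the later
  // relationship-building step (property name is an assumption).
  rawData: Record<string, string[]>;
}

function createAccessPolicyEntities(
  target: PermissionTarget
): AccessPolicyEntity[] {
  const base = {
    _type: "artifactory_access_policy", // assumed; not given in the thread
    _class: ["AccessPolicy"],
    name: target.name,
  };
  return [
    // One entity for the group grants...
    {
      ...base,
      _key: `groups:${target.name}`,
      principalType: "group",
      rawData: target.groups,
    },
    // ...and one for the user grants, since they are maintained separately.
    {
      ...base,
      _key: `users:${target.name}`,
      principalType: "users",
      rawData: target.users,
    },
  ];
}
```

Keeping the per-principal data on each entity means the later step can build relationships without fetching the permission target again.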

> Also, JFrog's permissions have 5 fields (settings like read, annotate, deploy, delete, manage Xray). We've seen that AccessPolicy has a rules array and are wondering if we should use it for this?

The permissions make up the relationships between the AccessPolicy and the resources referenced in the permission. The settings fields should be transferred to the relationship as properties such as read: boolean, annotate: boolean, .... Additionally, we have found it beneficial to provide a summary of the permissions as a single property; the rules array works well for this purpose (i.e. rules: ["read", "annotate", "deploy", ...]). We may also discover that we need to consider these properties when generating a _key for the relationship, to ensure the relationships are unique, but we'll wait to see if this is necessary as we learn more.
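As a rough sketch of the property layout described above, the following TypeScript turns a list of granted permissions into boolean relationship properties plus a `rules` summary. The function name and the `manageXray` property spelling are assumptions, not something the thread specifies.

```typescript
// The five settings mentioned in the thread; "manageXray" is an assumed
// property name for the "manage Xray" setting.
const PERMISSIONS = ["read", "annotate", "deploy", "delete", "manageXray"] as const;
type Permission = (typeof PERMISSIONS)[number];

type PermissionProperties = Record<Permission, boolean> & {
  rules: Permission[];
};

function toPermissionProperties(
  granted: readonly string[]
): PermissionProperties {
  const isGranted = (p: Permission): boolean => granted.indexOf(p) !== -1;
  return {
    read: isGranted("read"),
    annotate: isGranted("annotate"),
    deploy: isGranted("deploy"),
    delete: isGranted("delete"),
    manageXray: isGranted("manageXray"),
    // Summary of granted permissions, in the fixed order of PERMISSIONS.
    rules: PERMISSIONS.filter((p) => isGranted(p)),
  };
}
```

Emitting `rules` in a fixed order keeps the value stable between runs, which would matter if these properties ever end up contributing to the relationship `_key`.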

The simplest form of these relationships can be _type: "artifactory_iam_policy_permission", _class: "ALLOWS", _key: `${policyEntity._key}|allows|${resourceEntity._key}`. However, the nature of the grant appears to allow for matching:

ANY resource of specified type
Include pattern
Exclude pattern
A list of specific resources

Therefore, a much more complex form of the relationships will eventually need to be supported. In the case of AWS, a grant (policy statement) can refer to a resource directly or through similar pattern matching. Since we do not ingest S3 objects, for example, a grant to a specific object becomes a relationship to the bucket. I suspect that in the Artifactory integration, we may need to consider similar concerns.
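To make the matching modes concrete, here is a hedged TypeScript sketch that resolves a grant against already-ingested resources and emits the simple ALLOWS relationship with the `_key` format given above. The `Grant` shape, the way the four modes combine (exclusions win), matching patterns against resource names, and the `*`-only wildcard syntax are all simplifying assumptions; JFrog's actual pattern semantics are richer.

```typescript
interface ResourceEntity {
  _key: string;
  name: string;
}

// Hypothetical grant shape covering the four matching modes listed above.
interface Grant {
  any?: boolean; // ANY resource of the specified type
  includePatterns?: string[];
  excludePatterns?: string[];
  resourceNames?: string[]; // explicit list of resources
}

// Simplified wildcard match: "*" matches any run of characters.
function matchesPattern(pattern: string, value: string): boolean {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  const source = "^" + escaped.split("*").join(".*") + "$";
  return new RegExp(source).test(value);
}

function grantApplies(grant: Grant, resource: ResourceEntity): boolean {
  const matchesAny = (patterns: string[] | undefined): boolean =>
    (patterns ?? []).some((p) => matchesPattern(p, resource.name));

  if (matchesAny(grant.excludePatterns)) return false; // assumed precedence
  if (grant.any) return true;
  if ((grant.resourceNames ?? []).indexOf(resource.name) !== -1) return true;
  return matchesAny(grant.includePatterns);
}

function createAllowsRelationships(
  policyKey: string,
  grant: Grant,
  resources: ResourceEntity[]
) {
  return resources
    .filter((r) => grantApplies(grant, r))
    .map((r) => ({
      _type: "artifactory_iam_policy_permission",
      _class: "ALLOWS",
      _key: `${policyKey}|allows|${r._key}`,
      // Endpoint property names are assumed for this sketch.
      _fromEntityKey: policyKey,
      _toEntityKey: r._key,
    }));
}
```

For example, a grant with `includePatterns: ["libs-*"]` and `excludePatterns: ["*-snapshot"]` would relate the policy to a repository named `libs-release` but not to `libs-snapshot`.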

2. JupiterOne's AccessKey Class

The API tokens should be _type: "artifactory_api_token", _class: ["Key", "AccessKey"].

3. Handling JFrog's Artifacts

I don't yet quite understand what exactly an Artifact is, it being such an abstract word 😆 Is every version of a binary an Artifact? Is the package sans version an Artifact? The precise use of "Artifact" will have an impact on the decision of what class to use.

I see that it is possible to find Repositories by pkg:<package type>, such as finding a Docker repository with pkg:docker. This suggests that all Artifacts in a Docker Repository will be _type: "artifactory_docker_artifact", _class: ["Image"] (as @erichs points out). For an NPM Repository, the Artifacts would be _type: "artifactory_npm_artifact", _class: ["CodeModule"]. There would be an entity for lodash related to the Repository entity, not an entity for every version of lodash. This would support AccessPolicy - ALLOWS -> CodeModule.
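The package-type mapping could look roughly like the TypeScript sketch below. Only the docker and npm entries come from this thread; the function name and the `DataObject` fallback for other package types are assumptions (the fallback follows @erichs's earlier suggestion and is still an open question).

```typescript
// Classes for the package types discussed in this thread.
const CLASS_BY_PACKAGE_TYPE: Record<string, string[]> = {
  docker: ["Image"],
  npm: ["CodeModule"],
};

function artifactTypeAndClass(packageType: string): {
  _type: string;
  _class: string[];
} {
  const pkg = packageType.toLowerCase();
  const known = Object.prototype.hasOwnProperty.call(CLASS_BY_PACKAGE_TYPE, pkg);
  return {
    _type: `artifactory_${pkg}_artifact`,
    // Fallback class is an assumption, not a decision made in the thread.
    _class: known ? CLASS_BY_PACKAGE_TYPE[pkg] : ["DataObject"],
  };
}
```

Under this scheme there would be one entity per package (for example lodash) per repository, rather than one per version.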

I don't think we've ever used DataObject because we don't ingest many (any?) data objects, such as files or S3 objects. I think we should skip versioned binaries, until we see how having all 700 versions of a package reflected in the graph will help in a meaningful way. When we get around to JFrog XRay, so that we have Findings of vulnerabilities on specific versions, I think we'd still consider pointing at the Artifacts (packages?), Builds, and Release Bundles with version and other details on the Finding.

4. Handling JFrog's Builds

If I understand correctly, a Build would be _type: "artifactory_build", _class: ["Configuration"], since it is a configuration of a build process. Were we to ingest every execution of a Build - which I don't think we should do at this point - then perhaps those would be Process.

5. Any other high-value entities/relationships

As @erichs points out, we're interested in data that will serve a security analyst. We typically start with identity and access review workflows in mind. This means starting with ingestion of users, groups, permissions, and protected resources.

Working from the JFrog Identity and Access docs, it looks like we need to have these resources ingested so an analyst can review access:

Repositories
Builds
Release Bundles
Destinations
Pipeline Sources

cc @erkangz @austinkelleher @ndowmon

eXtremeX commented 4 years ago

Thanks a lot for the amazing replies @erichs @aiwilliams !

We've gone through them all, multiple times :smiley:, and will begin implementing all of what was said. I will keep this reply short as I'm sure we'll have more to write once we make more progress on these 1-5 points. We can also leave this issue open throughout this integration development as a whole.

> I don't yet quite understand what exactly an Artifact is, it being such an abstract word :laughing:. Is every version of a binary an Artifact? Is the package sans version an Artifact?

Haha! Well, here's what we know so far:

1) You can upload anything as an artifact (or so it seems).
2) Generally, that means the file you upload is either a binary blob (e.g. PNG) or text-based (e.g. JSON). If you upload a text-based file, Artifactory will still treat it as a binary: no content diffs, no content preview, just size in bytes, checksum, etc.
3) We tried uploading two different images with the same name and extension to the same repo, and the second upload simply overwrote the first. So it seems that there isn't any versioning underneath the surface; Artifactory effectively versions artifacts by their names. It is the end user's responsibility to include the version in the file (artifact) name for that to work properly.

Note: artifacts do have metadata associated with them, so it might be possible to introduce a custom version field there and use it, but that doesn't seem to happen by default.

So the answers to your questions (with our current knowledge) would be:

> Is every version of a binary an Artifact?

Yes, as long as one actually uploads differently named binary files (artifact-v1.png, artifact-v2.png).

> Is the package sans version an Artifact?

As soon as you upload something as an artifact, it "just works". Looking at the dashboard, there doesn't seem to be any implied version, but because the package you uploaded probably had a name, that seems to be enough, so the answer would be yes here too.

P.S. We'll continue looking into Artifacts and will update this thread with new findings (if any); we just wanted to answer your questions as soon as possible!

JupiterOne-Archives / graph-artifactory

Next steps discussion #3
