rustfoundation / painter

Library and tools for managing ecosystem-wide call graphs and LLVM IR analysis

Painter

Painter is an implementation of methodologies to build a graph database of dependencies and invocations between all crates within the crates.io ecosystem.


Index Importing

The index is imported using the crates-index crate. We iterate over every version of every crate in the index, along with its dependencies, and build the crate and version nodes and their relationships in the database. Alternative methods are available in the POC directory, where imports were done directly from the CSV dump of the crates.io database. Either way, the goal is an initial graph representation of the entire crates.io ecosystem that maps the dependencies of every version ever published.
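The node and relationship building described above can be sketched with the standard library alone. This is a minimal illustration, not Painter's implementation: `IndexVersion`, `Graph`, `build_graph`, and the idea of a dependency edge pointing from a version to a crate name are all assumptions made for the example, and the real import reads its data through the crates-index crate rather than from an in-memory slice.

```rust
use std::collections::BTreeSet;

/// One published version of a crate, as read from the index.
/// Hypothetical shape for illustration only.
pub struct IndexVersion {
    pub crate_name: String,
    pub version: String,
    /// Names of the crates this version depends on.
    pub dependencies: Vec<String>,
}

/// In-memory stand-in for the graph database (hypothetical).
#[derive(Debug, Default)]
pub struct Graph {
    /// Crate nodes, keyed by crate name.
    pub crates: BTreeSet<String>,
    /// Version nodes, keyed by (crate name, version string).
    pub versions: BTreeSet<(String, String)>,
    /// Dependency edges: (crate name, version) -> depended-on crate name.
    pub depends_on: BTreeSet<((String, String), String)>,
}

/// Walk every version in the index and record nodes and dependency edges.
pub fn build_graph(index: &[IndexVersion]) -> Graph {
    let mut graph = Graph::default();
    for entry in index {
        graph.crates.insert(entry.crate_name.clone());
        let key = (entry.crate_name.clone(), entry.version.clone());
        graph.versions.insert(key.clone());
        for dep in &entry.dependencies {
            // A dependency may name a crate we have not visited yet.
            graph.crates.insert(dep.clone());
            graph.depends_on.insert((key.clone(), dep.clone()));
        }
    }
    graph
}
```

Using sets keeps the import idempotent in this sketch: visiting the same version twice does not duplicate nodes or edges, which mirrors the effect of merge-style writes in a graph database.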

Alternatives

The index can also be imported from the crates.io SQL database dump with a set of scripts in the graph-imports directory.

Bytecode Analysis

We then determine the call graph of every crate by these steps:

  1. Build the crate with default features and flags with --emit=llvm-bc
  2. Analyze the bitcode of every successfully compiled crate, extracting all invoke instruction variants in the LLVM bitcode. This yields a complete list of all (function)->(function) invocations within a given crate. We demangle these names and then filter out cases such as runtimes and the std and core libraries.
  3. Import this list of (function)->(function) relationships into the database, where it is represented as (Version)-[:INVOKES]->(Crate): a given version of a crate invokes a specific crate. In the future, we hope to group or narrow the versions of the crates being invoked, but this has not been implemented.
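The filtering and collapsing in steps 2 and 3 can be sketched as follows. This is an illustration under stated assumptions, not Painter's code: the input is assumed to be already-demangled (caller, callee) path pairs, the filter list `FILTERED_CRATES` is a guess at what "std and core libraries" covers, and `crate_of` and `invoked_crates` are hypothetical helpers. Trait-qualified paths such as `<T as Trait>::method` are simply skipped here, which a real implementation would need to handle.

```rust
use std::collections::BTreeSet;

/// Crates treated as standard-library noise (assumed list, not from Painter).
const FILTERED_CRATES: &[&str] = &["std", "core", "alloc"];

/// Return the leading crate segment of a demangled path such as
/// `serde::ser::Serializer::serialize_str`. Paths without `::` and
/// trait-qualified paths starting with `<` yield `None` in this sketch.
pub fn crate_of(demangled: &str) -> Option<&str> {
    let (first, _rest) = demangled.split_once("::")?;
    let is_ident = !first.is_empty()
        && first.chars().all(|c| c.is_ascii_alphanumeric() || c == '_');
    if is_ident { Some(first) } else { None }
}

/// Collapse (function)->(function) pairs found in one crate version into
/// the set of external crates that version invokes.
pub fn invoked_crates(caller_crate: &str, calls: &[(&str, &str)]) -> BTreeSet<String> {
    calls
        .iter()
        .filter_map(|&(_caller, callee)| crate_of(callee))
        // Drop calls within the crate itself and calls into filtered crates.
        .filter(|krate| *krate != caller_crate && !FILTERED_CRATES.contains(krate))
        .map(str::to_owned)
        .collect()
}
```

Each name in the returned set corresponds to one (Version)-[:INVOKES]->(Crate) relationship for the version being analyzed. The callee's version is not recoverable from the path alone, which is why the edge ends at a crate rather than a version.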

Database Representation

Nodes:

Relationships:

Current Limitations

Future Work

Quick Start

See BUILDING.md for build specific instructions.

NOTE: Building the crate ecosystem requires rustc 1.60, and painter itself requires nightly. The pinned compiler is needed so that the emitted LLVM IR version matches the LLVM version currently supported by llvm-sys and llvm-ir. This will be updated as work to integrate newer LLVM versions is done.

Start the docker neo4j instance

A docker-compose file is available for a default Neo4j instance that can be used for testing and local use. Take note of the default testing username and password, or change them.

Populating the crate index database

The first step is populating your Neo4j database with the up-to-date crate index. This is pulled from the live crate index and populates the appropriate nodes and relationships. Crates, versions, and dependency relationships are populated at this step.

Run the analysis

This will populate the database with all invocation relationships. Each one represents a given version calling a given crate. We cannot definitively say which version of which crate is invoked, so the relationship exists as (Version)-[:INVOKES]->(Crate).
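To make the shape of that relationship concrete, here is a small sketch that renders one such edge as a Cypher statement. The node labels and the INVOKES relationship type come from this README; the property names `name` and `version`, and the helpers `cypher_quote` and `invokes_statement`, are assumptions made for the example and may not match the schema Painter actually writes.

```rust
/// Quote a string as a single-quoted Cypher literal, escaping backslashes
/// and single quotes. Hypothetical helper for illustration.
fn cypher_quote(s: &str) -> String {
    format!("'{}'", s.replace('\\', "\\\\").replace('\'', "\\'"))
}

/// Render a statement that links an existing Version node to an existing
/// Crate node. Property names `name` and `version` are assumed.
pub fn invokes_statement(caller: &str, version: &str, callee: &str) -> String {
    format!(
        "MATCH (v:Version {{name: {}, version: {}}}), (c:Crate {{name: {}}}) \
         MERGE (v)-[:INVOKES]->(c)",
        cypher_quote(caller),
        cypher_quote(version),
        cypher_quote(callee),
    )
}
```

MERGE is used so that repeating the analysis for the same version does not create duplicate relationships. Against a real database, passing the values as query parameters through the driver is generally preferable to building literals into the statement text.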

Current crates.io graph snapshot: Coming Soon

Nodes:

Relationships:

Code of Conduct

The Rust Foundation has adopted a Code of Conduct that we expect project participants to adhere to. Please read the full text so that you can understand what actions will and will not be tolerated.

Contributing

See CONTRIBUTING.md.

Licenses

Painter is primarily distributed under the terms of both the MIT license and the Apache License (Version 2.0), with documentation portions covered by the Creative Commons Attribution 4.0 International license.

See LICENSE-APACHE, LICENSE-MIT, LICENSE-documentation, and COPYRIGHT for details.

You can also read more under the Foundation's intellectual property policy.

Other Policies

You can read about other Rust Foundation policies in the footer of the Foundation website.