sourcegraph / srclib

srclib is a polyglot code analysis library, built for hackability. It consists of language analysis toolchains (currently for Go and Java, with Python, JavaScript, and Ruby in beta) with a common output format, and a CLI tool for running the analysis.
https://srclib.org

Graphstore #100

Closed samertm closed 9 years ago

samertm commented 9 years ago

This PR adds the package "graphstore", which stores graph data separately from the build cache, in formats that facilitate retrieval. Once the graph store has reached feature parity with the build cache, we can merge the two packages.

Graph store directory layout:

                 <defs> := SRCLIBPATH/defs/<def-path>
             <def-path> := <repo>/<unit-type>/<unit>/<path>/<commit-id>
<def-path-no-commit-id> := <repo>/<unit-type>/<unit>/<path>

                 <refs> := SRCLIBPATH/refs/<ref-path>
             <ref-path> := <def-path-no-commit-id>/.refs/<ref-repo>

Refs are currently stored as "all.refs" files, and are indexed by ref-path. We don't save a ref's history or differentiate between commits: when you import refs, old refs for that repository are overwritten.

TODO:

TODO for sourcegraph/sourcegraph:

Unknowns: