open-sauced / pizza

This is an engine that sources git commits and turns them into insights
Apache License 2.0
31 stars 13 forks

Implements LRU cache for git repos on disk #17

jpmcb closed 1 year ago

jpmcb commented 1 year ago

LRU Cache

The GitRepoLRUCache struct and its associated methods implement a "Least Recently Used" cache whose elements are GitRepoFilePath structs (each of which represents an on-disk git repository).
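For readers unfamiliar with the pattern, here is a minimal sketch of a generic LRU cache over on-disk repo paths. It is not the code in this PR: RepoPath, RepoLRU, NewRepoLRU, and the onEvict callback are hypothetical names chosen for illustration, and the sketch assumes a fixed capacity and a single mutex.

```go
package main

import (
	"container/list"
	"sync"
)

// RepoPath is a hypothetical stand-in for the PR's GitRepoFilePath:
// it records where a cloned repository lives on disk.
type RepoPath struct {
	Key  string // cache key, for example the repository's clone URL
	Path string // directory that holds the clone
}

// RepoLRU is a simplified, hypothetical LRU cache of on-disk repos.
type RepoLRU struct {
	mu       sync.Mutex
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key -> element in order
	onEvict  func(RepoPath)           // for example, delete the directory
}

// NewRepoLRU builds an empty cache that holds at most capacity repos.
func NewRepoLRU(capacity int, onEvict func(RepoPath)) *RepoLRU {
	return &RepoLRU{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
		onEvict:  onEvict,
	}
}

// Get returns the cached repo for key and marks it most recently used.
func (c *RepoLRU) Get(key string) (RepoPath, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	el, ok := c.items[key]
	if !ok {
		return RepoPath{}, false
	}
	c.order.MoveToFront(el)
	return el.Value.(RepoPath), true
}

// Put inserts or refreshes a repo, evicting the least recently used
// entry when the cache grows past its capacity.
func (c *RepoLRU) Put(repo RepoPath) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[repo.Key]; ok {
		el.Value = repo
		c.order.MoveToFront(el)
		return
	}
	c.items[repo.Key] = c.order.PushFront(repo)
	if c.order.Len() > c.capacity {
		evicted := c.order.Remove(c.order.Back()).(RepoPath)
		delete(c.items, evicted.Key)
		if c.onEvict != nil {
			c.onEvict(evicted) // called while the lock is held
		}
	}
}
```

In this sketch the eviction callback is where an on-disk cache would remove the repository directory. Running it under the lock keeps the example short, though a production cache would likely want to avoid slow disk work inside the critical section.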

This LRU cache differs from a typical LRU cache in a few ways:

Git provider wrappers

The GitRepoProvider is an interface that provides a flat API surface for working with different git repo providers (for example, the LRUCacheGitRepoProvider or the InMemoryGitRepoProvider). The FetchRepo method on this interface returns a GitRepo, another interface that wraps a go-git repository and also gives each provider the API surface it needs to do its own cleanup and prep for those repos.

Additional changes

What type of PR is this? (check all applicable)

Related Tickets & Documents

Closes #14

Mobile & Desktop Screenshots/Recordings

N/a

Added tests?

Added to documentation?

[optional] Are there any post-deployment tasks we need to perform?

N/a

[optional] What gif best describes this PR or how it makes you feel?

^ When you evict elements from the cache

brandonroberts commented 1 year ago

Easy to read and well documented and tested. I only had minor documentation nits but not worth holding it up.

Just for reference, are there any limitations on the number of repos that can be indexed on disk concurrently?

jpmcb commented 1 year ago

Just for reference, are there any limitations on the number of repos that can be indexed on disk concurrently?

Good question. Short answer: no, with the caching approach there's no limit to the number of repos that can be indexed concurrently.

The only constraints are:

Otherwise, the server's route handling will spin up a new thread for any request that comes in.

jpmcb commented 1 year ago

I only have questions about whether we should create issues and link them next to the TODOS.

My plan was to merge this and then create issues around those TODOs: I didn't want to create issues for code that hadn't merged yet 😅

jpmcb commented 1 year ago

Merged. Planning to back-fill issues for hanging todos so we can keep tracking them.