brainstorm: package refactoring

I find the current package names and layout confusing, even after working with them for a while. I'm also not always sure where to put stuff. Here is a draft proposal for how to re-organize things, particularly coming out of the recent lex refactor.

Would be good to push these through before too many folks start building on this repo.

Could do this in stages, or rip the bandaid off all at once. Timing-wise, would like to get labelmaker landed before doing any of the more disruptive refactors.

ATP Services

bgs/: "big graph server"; command name is bigsky
pds/: "personal data server"; command name is laputa
labeler/: "labeling service"; command name is labelmaker

Service Components

These generally manage state, either on-disk or in a SQL database. Sometimes these align with "ecosystem roles" (a "service" might fulfill multiple "roles").

carstore/: on-disk repo store (in CAR files), plus SQL database indexing
blobstore/: on-disk file storage, plus SQL database indexing
eventmgr/: handles event subscription from producer side, including persisting the stream (SQL database or other backends) and re-play buffer
repomgr/: integrates a storage engine, key management, and event generation
crawlmgr/: rename of indexer/, though we might be able to split that functionality into eventmgr/ (for single consumption, like gosky, labelmaker, and search service) and bgs/ (for crawling multiple endpoints). might be simplest to keep a single implementation which works with multiple upstream endpoints
aggrmgr/: for parts of indexer related to persisting notifications, backlinks, etc
didmgr/ (NEW): persisted cache of DID identities and keys, including both local accounts and remote identities. supersedes previous keymgr code
labelmgr/ (NEW): not needed to start, but might end up existing if PDS or other services need to persist and access labels from a SQL database

atproto libraries

These are specific to atproto and might be reused by third parties.

atproto/identifiers/ (NEW): string wrapper types for DID, NSID, at-uri, handles, cid-str (as a string, not parsed) and other Lexicon-defined types. only string validation, not external code/helpers (eg, no DID resolution stuff). many other packages would import this
atproto/did/ (NEW): replaces whyrusleeping/go-did. parse and represent only the DIDs supported by atproto (did:plc and did:web). no crypto! parse DID docs. possibly clients for doing did:web and DNS lookups. not a general-purpose DID library.
atproto/crypto/ (NEW): replaces whyrusleeping/go-did. parses/generates all the supported atproto key types in all the various formats (did:key, multibase-in-did-doc, hex for signing keys, etc). only works with the curves and formats used in atproto. clear naming for, eg, "HashAndSign" (which does SHA-256 then signs; instead of just "Sign" or "Verify" names)
atproto/xrpc/: generic XRPC client, and possibly some server-side helpers. may also include some subscription (websocket) client and helper code
atproto/repo/ and atproto/repo/mst/: low-level types and algorithms for working with repo DAG structure. agnostic to storage details.

lexicon packages

This is kind of bike-sheddy, and I'm not really certain this is the best way to go. But the current api/atproto/ setup, resulting in package name atproto by default, feels pretty confusing to me.

It really feels like the lexutil stuff (LexBlob etc) should live closer to the actual generated lexicons.

lexicon/comatproto/: the current api/atproto/. types for com.atproto.* lexicons
lexicon/appbsky/: the current api/bsky/. types for app.bsky.* lexicons
lexicon/lexutil/: the current lex/util.
lexicon/gen: implementation code for lexgen

Other Packages

fakedata/: new package with most of fakermaker code. should be in a non-main package for easier re-use in integration tests
plc/: did:plc API client; types for encoding and verifying PLC operations; fake/mock/testing server implementation. could go under atproto/did/plc/?
dbmodels/: only for database models that are actually shared across services (or service components)
cmd/: actual binaries, CLI handling, etc
testing/: any inter-package or high-level tests. open to renaming this (tests/?)
util/: grab bag stuff that still doesn't fit elsewhere
cborgen/ or /gen/cbor/: clearer name for current gen/. alternatively, use the go:generate functionality in golang to do this per-package instead of top-level

Other Changes

move cmd/gosky/util/ to util/cmdutil/
move version/ to util/version/
move util/dbcid.go and util/uid.go to dbmodels/
move util/time.go to lexutil. or maybe copy? and possibly rename to, eg, LexDatetime
move util/fakekey.go to didmgr (or wherever keymgr ends up)
delete testscripts/, or at least move under testing/
