I find the current package names and layout confusing, even after working with them for a while. I'm also not always sure where to put stuff. Here is a draft proposal for how to re-organize things, particularly coming out of the recent lex refactor.
Would be good to push these through before too many folks start building on this repo.
Could do this in stages, or rip the bandaid all at once. Timing-wise, would like to get labelmaker landed before doing any of the more disruptive refactors.
ATP Services
bgs/: "big graph server"; command name is bigsky
pds/: "personal data server"; command name is laputa
labeler/: "labeling service"; command name is labelmaker
Service Components
These generally manage state, either on-disk or in a SQL database. Sometimes these align with "ecosystem roles" (a "service" might fulfill multiple "roles").
carstore/: on-disk repo store (in CAR files), plus SQL database indexing
blobstore/: on-disk file storage, plus SQL database indexing
eventmgr/: handles event subscription from producer side, including persisting the stream (SQL database or other backends) and re-play buffer
repomgr/: integrates a storage engine, key management, and event generation
crawlmgr/: rename of indexer/, though we might be able to split that functionality into eventmgr/ (for single consumption, like gosky, labelmaker, and search service) and bgs/ (for crawling multiple endpoints). might be simplest to keep a single implementation which works with multiple upstream endpoints
aggrmgr/: for parts of indexer related to persisting notifications, backlinks, etc
didmgr/ (NEW): persisted cache of DID identities and keys, including both local accounts and remote identities. supersedes previous keymgr code
labelmgr/ (NEW): not needed to start, but might end up existing if PDS or other services need to persist and access labels from a SQL database
atproto libraries
These are specific to atproto and might be reused by third parties.
atproto/identifiers/ (NEW): string wrapper types for DID, NSID, at-uri, handles, cid-str (as a string, not parsed) and other Lexicon-defined types. only string validation, not external code/helpers (eg, no DID resolution stuff). many other packages would import this
atproto/did/ (NEW): replaces whyrusleeping/go-did. parse and represent only the DIDs supported by atproto (did:plc and did:web). no crypto! parse DID docs. possibly clients for doing did:web and DNS lookups. not a general-purpose DID library.
atproto/crypto/ (NEW): replaces whyrusleeping/go-did. parses/generates all the supported atproto key types in all the various formats (did:key, multibase-in-did-doc, hex for signing keys, etc). only works with the curves and formats used in atproto. clear naming for, eg, "HashAndSign" (which does SHA-256 then signs; instead of just "Sign" or "Verify" names)
atproto/xrpc/: generic XRPC client, and possibly some server-side helpers. may also include some subscription (websocket) client and helper code
atproto/repo/ and atproto/repo/mst/: low-level types and algorithms for working with repo DAG structure. agnostic to storage details.
lexicon packages
This is kind of bike-sheddy, and i'm not really certain this is the best way to go. But the current api/atproto/ setup, resulting in package name atproto by default, feels pretty confusing to me.
It really feels like the lexutil stuff (LexBlob etc) should live closer to the actual generated lexicons.
lexicon/comatproto/: the current api/atproto/. types for com.atproto.* lexicons
lexicon/appbsky/: the current api/bsky/. types for app.bsky.* lexicons
lexicon/lexutil/: the current lex/util.
lexicon/gen: implementation code for lexgen
Other Packages
fakedata/: new package with most of fakermaker code. should be in a non-main package for easier re-use in integration tests
plc/: did:plc API client; types for encoding and verifying PLC operations; fake/mock/testing server implementation. could go under atproto/did/plc/?
dbmodels/: only for database models that are actually shared across services (or service components)
cmd/: actual binaries, CLI handling, etc
testing/: any inter-package or high-level tests. open to renaming this (tests/?)
util/: grab bag stuff that still doesn't fit elsewhere
cborgen/ or /gen/cbor/: clearer name for current gen/. alternatively, use the go:generate functionality in golang to do this per-package instead of top-level
Other Changes
move cmd/gosky/util/ to util/cmdutil/
move version/ to util/version/
move util/dbcid.go and util/uid.go to dbmodels/
move util/time.go to lexutil. or maybe copy? and possibly rename to, eg, LexDatetime
move util/fakekey.go to didmgr (or wherever keymgr ends up)
delete testscripts/, or at least move under testing/
@bradfitz some earlier thinking on moving packages around when we get a chance, curious if you have any thoughts/feedback on whether these are idiomatic