bluesky-social / atproto

Social networking technology created by Bluesky

brainstorm ideas for Cool Developer Tools #562

Closed bnewbold closed 1 year ago

bnewbold commented 1 year ago

I love Cool Tools! There are some nifty existing tools for wrangling IPLD objects, CIDs, etc. It would be nice if we eventually had similar things for atproto repos, XRPC APIs, Lexicons, etc.

This issue is a wiki-like brainstorm of ideas for nifty developer tooling.

CLI/UNIX Tools

MST tree visualizer: prints out a graph of nodes, showing depth, number of entries, leading-zeros, CIDs, etc. Maybe a "compact" form and a "verbose" form? Could be horizontal (matches "classic" way of thinking about trees) or vertical with indentation (easier to dump large trees). Could also export graphviz or something. Bonus points for doing "diffs" between two trees, with some kind of color/visualization. Would be helpful for docs, interop tests, etc.
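
A rough sketch of the vertical, indented form, with a "compact" and a "verbose" mode. The `Node` and `Entry` shapes here are hypothetical in-memory stand-ins, not the real repo data structures, and `bits_per_layer=2` is an assumption about how leading zeros of the SHA-256 key hash map to tree depth; check it against the current repo spec before relying on it.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Entry:
    """A leaf: an MST key pointing at a record CID (hypothetical shape)."""
    key: str
    cid: str


@dataclass
class Node:
    """An MST node holding entries and subtrees in key order (hypothetical shape)."""
    cid: str
    layer: int
    items: List[Union["Node", Entry]] = field(default_factory=list)


def count_leading_zero_bits(digest: bytes) -> int:
    """Count zero bits at the start of a hash digest."""
    count = 0
    for byte in digest:
        if byte == 0:
            count += 8
            continue
        count += 8 - byte.bit_length()
        break
    return count


def key_layer(key: str, bits_per_layer: int = 2) -> int:
    """Layer of a key from its SHA-256 leading zeros (bits_per_layer is assumed)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return count_leading_zero_bits(digest) // bits_per_layer


def render(node: Node, indent: int = 0, verbose: bool = False) -> List[str]:
    """Return one line per node and entry, indented by depth in the tree."""
    pad = "  " * indent
    n_entries = sum(isinstance(item, Entry) for item in node.items)
    header = f"{pad}node layer={node.layer} entries={n_entries}"
    if verbose:
        header += f" cid={node.cid}"
    lines = [header]
    for item in node.items:
        if isinstance(item, Node):
            lines.extend(render(item, indent + 1, verbose))
        else:
            line = f"{pad}  - {item.key}"
            if verbose:
                line += f" -> {item.cid}"
            lines.append(line)
    return lines
```

Calling `print("\n".join(render(tree, verbose=True)))` would give the verbose dump; a diff mode could compare the two line lists with `difflib`.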

MST and/or repo to JSON object(s): I think the IPFS/IPLD tools (kubo?) can already convert a CAR or arbitrary blockstore to JSON files on disk. A possible improvement, for debugging and testing, would be to dump a single giant JSON object, with CID "links" being nested sub-objects. There might already be a known format or best practice for this. The output could then be pretty-printed with jq, would map to things like Python dicts for integration testing, and would be a human-inspectable format for test vectors. Could include actual CID values. Another mode would allow editing the JSON (invalidating CID links), and then recomputing the CID links. Maybe a "JSON patch" variant to show diffs between trees.
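
One way the "single giant JSON object" idea could look, as a sketch only. It assumes the blocks have already been decoded into plain Python values keyed by CID string, and that links use the DAG-JSON form `{"/": "<cid>"}`. The `cid`, `value`, and `missing` wrapper keys are made up for illustration, not an existing format.

```python
import json


def inline_links(blocks, cid):
    """Expand the block at `cid`, nesting every linked block in place.

    `blocks` maps CID strings to already-decoded values (an assumption;
    decoding DAG-CBOR out of a CAR file is out of scope here).
    """
    def expand(value):
        if isinstance(value, dict):
            # A DAG-JSON link is a map with the single key "/" and a string value.
            if set(value) == {"/"} and isinstance(value["/"], str):
                return inline_links(blocks, value["/"])
            return {key: expand(child) for key, child in value.items()}
        if isinstance(value, list):
            return [expand(child) for child in value]
        return value

    if cid not in blocks:
        # Keep the CID visible even when the block is not in the blockstore.
        return {"cid": cid, "missing": True}
    return {"cid": cid, "value": expand(blocks[cid])}


def dump_tree(blocks, root_cid):
    """Serialize the expanded tree as pretty-printed JSON text."""
    return json.dumps(inline_links(blocks, root_cid), indent=2, sort_keys=True)
```

Because blocks are content-addressed, links cannot form cycles, so the recursion terminates; a shared sub-block is simply repeated wherever it is linked.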

repo dump to/from folder: for well-behaved repos, a way to take a CAR file and unpack it into DAG-JSON files in filenames/folders which match the MST key names. And a way to parse such a folder, along with a DID and signing key, back into a repo CAR file. The round trip should be reproducible. Useful for inspecting and sharing test vectors, also for folks to work with their exported CAR files. Maybe a variant would create a .tar.gz in memory from the CAR, as a parallel export format. Optionally a "standard" way to store any extra metadata and keys in the base directory. Optionally a "standard" way to store blobs (images, etc) in the same directory and .tar.gz file.
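
A minimal sketch of the folder layout half of this, assuming the records have already been extracted from the CAR file into a dict keyed by MST key (`collection/rkey`). Plain JSON stands in for DAG-JSON, and CAR parsing and signing are left out entirely. All function names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def dump_records(records, out_dir):
    """Write each record to <out_dir>/<collection>/<rkey>.json."""
    out_dir = Path(out_dir)
    for key in sorted(records):
        collection, rkey = key.split("/", 1)
        # Refuse keys that would escape the output folder.
        if collection in ("", ".", "..") or rkey in ("", ".", "..") or "/" in rkey:
            raise ValueError(f"unsupported MST key: {key!r}")
        path = out_dir / collection / f"{rkey}.json"
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(records[key], indent=2, sort_keys=True), encoding="utf-8")


def load_records(in_dir):
    """Read the folder layout back into a dict keyed by MST key."""
    records = {}
    for path in sorted(Path(in_dir).glob("*/*.json")):
        key = f"{path.parent.name}/{path.stem}"
        records[key] = json.loads(path.read_text(encoding="utf-8"))
    return records


def roundtrip(records):
    """Dump to a temporary folder and load back, for reproducibility checks."""
    with tempfile.TemporaryDirectory() as tmp:
        dump_records(records, tmp)
        return load_records(tmp)
```

Sorting keys on both sides keeps the on-disk output deterministic, which is the property a test-vector format would need.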

mount repo as FUSE filesystem: DAG-JSON objects at MST key paths. Could be local (working on a blockstore or CAR file) or remote (PDS via XRPC). Listing, reading, and writing should work. Validation errors would surface as some IO error, I guess. Could even mount the full atproto world with a DID prefix directory (!). Blob support?

lexicon converters: like pandoc, but for schemas? lexicon-to-... openapi-v3; JSON schema; protobuf interface. Opinionated/controversial?
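
To make the lexicon-to-JSON-Schema direction concrete, here is a sketch covering only a small subset of types (record, object, array, string, integer, boolean). The mapping choices are my own guesses, not a settled convention. For example, it drops string `maxLength` on the assumption that Lexicon counts UTF-8 bytes while JSON Schema counts characters, which is exactly the kind of opinionated call a converter has to make.

```python
def lexicon_to_json_schema(defn):
    """Convert one Lexicon definition (a small subset) to a JSON Schema dict."""
    kind = defn["type"]
    if kind == "record":
        # A record definition wraps the object schema under "record".
        return lexicon_to_json_schema(defn["record"])
    if kind == "object":
        schema = {
            "type": "object",
            "properties": {
                name: lexicon_to_json_schema(prop)
                for name, prop in defn.get("properties", {}).items()
            },
        }
        if defn.get("required"):
            schema["required"] = list(defn["required"])
        return schema
    if kind == "array":
        schema = {"type": "array", "items": lexicon_to_json_schema(defn["items"])}
        if "maxLength" in defn:
            schema["maxItems"] = defn["maxLength"]
        return schema
    if kind == "string":
        schema = {"type": "string"}
        if defn.get("format") == "datetime":
            schema["format"] = "date-time"
        return schema
    if kind in ("integer", "boolean"):
        schema = {"type": kind}
        for bound in ("minimum", "maximum"):
            if bound in defn:
                schema[bound] = defn[bound]
        return schema
    raise ValueError(f"unsupported Lexicon type in this sketch: {kind!r}")
```

Refs, unions, blobs, and CID links are deliberately not handled; they are where most of the "controversial" decisions would live.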

Daemons / Proxies / Services

Some of these could be simple features in PDS implementations.

GraphQL proxy to XRPC: could be codegen or not

RSS feeds to/from bsky: the trivial thing is making author feeds available as RSS (and/or Atom, JSON Feed). A general-purpose bot would consume an RSS feed and push to bsky (why's hnbot may already do this?).
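
The "trivial" direction might look roughly like this. The input is a hypothetical list of post dicts with `uri`, `text`, and `createdAt` keys, already fetched from wherever; fetching the author feed over XRPC is not shown. The at:// URI is used as a non-permalink guid because the sketch does not assume any particular web front end.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from email.utils import format_datetime


def posts_to_rss(title, link, posts):
    """Render posts as an RSS 2.0 document and return it as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = f"Posts by {title}"
    for post in posts:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "description").text = post["text"]
        guid = ET.SubElement(item, "guid", isPermaLink="false")
        guid.text = post["uri"]
        # Accept a trailing "Z" on older Pythons whose fromisoformat rejects it.
        created = datetime.fromisoformat(post["createdAt"].replace("Z", "+00:00"))
        ET.SubElement(item, "pubDate").text = format_datetime(created)
    return ET.tostring(rss, encoding="unicode")
```

ElementTree handles XML escaping of the post text, so characters like `<` and `&` in posts are safe.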

Micropub interface to PDS: Micropub is a relatively simple IndieWeb HTTP protocol for posting microblog-like content. There are several mobile apps. Much simpler than ActivityPub, IIRC. Might unlock a small ecosystem of bots, apps, integrations, etc? But might also be extremely niche.

Web Tools

Simple web interfaces to all the CLI things above, which work by passing in a DID or ATURI to inspect.

Lexicon verifier: paste in a lexicon, get told if it is valid and, if not, why. Then, in a second box, paste in arbitrary JSON and get feedback on whether it matches the lexicon or not, and if not why, in a human-readable form. Should be able to do this in-browser with the TypeScript implementation, or compile whatever else to WASM. The Rust ecosystem has some nice validators that generate human-meaningful error messages; I'm sure similar things exist in other ecosystems.
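
For a feel of what "human-readable" feedback from the second box could be, here is a sketch of a validator over a small subset of Lexicon types. It is not the reference validation logic. It collects every problem with a path instead of stopping at the first one, and it assumes string `maxLength` is measured in UTF-8 bytes.

```python
def validate(value, schema, path="$"):
    """Return a list of human-readable problems; an empty list means valid."""
    kind = schema.get("type")
    errors = []
    if kind == "object":
        if not isinstance(value, dict):
            return [f"{path}: expected object, got {type(value).__name__}"]
        for name in schema.get("required", []):
            if name not in value:
                errors.append(f"{path}.{name}: required field is missing")
        for name, sub in schema.get("properties", {}).items():
            if name in value:
                errors.extend(validate(value[name], sub, f"{path}.{name}"))
    elif kind == "array":
        if not isinstance(value, list):
            return [f"{path}: expected array, got {type(value).__name__}"]
        for index, child in enumerate(value):
            errors.extend(validate(child, schema["items"], f"{path}[{index}]"))
    elif kind == "string":
        if not isinstance(value, str):
            return [f"{path}: expected string, got {type(value).__name__}"]
        size = len(value.encode("utf-8"))
        if "maxLength" in schema and size > schema["maxLength"]:
            errors.append(f"{path}: string is {size} bytes, longer than maxLength {schema['maxLength']}")
    elif kind == "integer":
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            return [f"{path}: expected integer, got {type(value).__name__}"]
        if "minimum" in schema and value < schema["minimum"]:
            errors.append(f"{path}: {value} is below minimum {schema['minimum']}")
    elif kind == "boolean":
        if not isinstance(value, bool):
            errors.append(f"{path}: expected boolean, got {type(value).__name__}")
    else:
        errors.append(f"{path}: schema type {kind!r} is not handled by this sketch")
    return errors
```

Returning all errors at once, each with a `$.field` style path, is the design choice that makes the output usable in a paste-and-check web form.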

DID+ATURI debugger: plug in a DID or ATURI, the web service live attempts to resolve the DID and connect to PDS; for ATURIs also tries to fetch and verify content. Similar to web tools like BGP looking glass; whois lookup; SSL score; "is my email setup (MX/SPF/DKIM/etc) working right". Might need some rate limits?
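
The first step of such a debugger, before any network calls, could be sketched as below: split the AT URI into parts and work out where the DID document should live. The regex is a loose approximation rather than the full AT URI grammar. Using `https://plc.directory/<did>` for did:plc assumes the public PLC directory, and only hostname-level did:web is handled.

```python
import re
from typing import NamedTuple, Optional
from urllib.parse import unquote

# Loose pattern: authority, then optional collection, then optional record key.
AT_URI_RE = re.compile(r"^at://([^/]+)(?:/([^/]+)(?:/([^/]+))?)?$")


class AtUri(NamedTuple):
    authority: str             # a DID or a handle
    collection: Optional[str]  # e.g. app.bsky.feed.post
    rkey: Optional[str]        # record key within the collection


def parse_at_uri(uri: str) -> AtUri:
    """Split an AT URI into authority, collection, and record key."""
    match = AT_URI_RE.match(uri)
    if match is None:
        raise ValueError(f"not a recognizable AT URI: {uri!r}")
    return AtUri(*match.groups())


def did_document_url(did: str) -> str:
    """Return the HTTPS URL where the DID document is expected to be served."""
    if did.startswith("did:plc:"):
        # Assumption: the public PLC directory instance.
        return f"https://plc.directory/{did}"
    if did.startswith("did:web:"):
        host = did[len("did:web:"):]
        if ":" in host:
            raise ValueError("path-based did:web is not handled in this sketch")
        # A port is percent-encoded in did:web (":" becomes "%3A").
        return f"https://{unquote(host)}/.well-known/did.json"
    raise ValueError(f"unsupported DID method: {did!r}")
```

A live tool would then fetch that URL, read the PDS endpoint out of the document, and try the record fetch, with rate limits around each step.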

Existing Stuff

I'm pretty happy with the CLI "API" (argument structure) for adenosine-cli, which is basically general purpose com.atproto + app.bsky CLI: https://gitlab.com/bnewbold/adenosine/-/blob/main/extra/adenosine.1.md

Some notes on using kubo (IPFS CLI tool) to inspect MST tree: https://gitlab.com/bnewbold/adenosine/-/blob/main/notes/ipld_car_explore.md

dholms commented 1 year ago

Really like all of these ideas!

For things that folks could jump into now: I'd mainly focus on CLI/UNIX tools. The repo is fairly set at this point, but we'll see some churn in Lex/XRPC in the coming weeks. Other stuff should be ready to build on top of in the next month or two.

snarfed commented 1 year ago

https://granary.io/ currently supports converting app.bsky.* objects to/from a wide variety of formats, e.g. AS1, AS2, Atom, RSS, HTML, and more. It's both a pip-installable Python library and a REST API. You can play with it interactively on the web here: https://granary.io/?input=bluesky&output=as2#url-form

Here are a couple examples:

snarfed commented 1 year ago

Also https://github.com/snarfed/lexrpc is a Python implementation of XRPC + Lexicon. It fully supports new-gen Lexicon, but doesn't yet do full schema validation. Soon!

dcsan commented 1 year ago

There's some brainstorming in the "steal my idea" section here: https://discord.com/channels/1097580399187738645/1101027724392411156

dcsan commented 1 year ago

firehose to JSON

Many of the firehose tools, e.g. https://github.com/CharlesDardaman/blueskyfirehose/blob/main/firehose/hose.go, seem to dump binary blobs in CAR format. It would be nice to have a more readable stream output.
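
As a sketch of what "more readable" could mean: one JSON object per line, with raw bytes wrapped as `{"$bytes": "<base64>"}`. This assumes each event has already been decoded from DAG-CBOR into Python values by some other library, and the field names in the usage note are illustrative only. Unpacking the embedded CAR blocks into records would be a further step that is not shown.

```python
import base64
import json


def to_jsonable(value):
    """Recursively replace bytes with a base64 wrapper so json.dumps accepts it."""
    if isinstance(value, (bytes, bytearray)):
        return {"$bytes": base64.b64encode(bytes(value)).decode("ascii")}
    if isinstance(value, dict):
        return {str(key): to_jsonable(child) for key, child in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(child) for child in value]
    return value


def event_to_json_line(event):
    """Serialize one decoded event as a single compact JSON line."""
    return json.dumps(to_jsonable(event), sort_keys=True, separators=(",", ":"))
```

Printing `event_to_json_line({"seq": 1, "repo": "did:plc:abc", "blocks": b"\x01\x02"})` for each event would give a stream that can be piped straight into jq.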