Reduce cognitive overload

justindotpub commented 1 year ago

This issue is admittedly subjective, but it is an attempt to reduce my own cognitive overload and make it easier for me to wrap my mind around the code. Nothing major or even "wrong" with the code... just a list of things to possibly change or, where that isn't possible, to alleviate through more docs.

What makes the code hard for me to follow?

Short variable names that aren't sufficiently descriptive, like block for blockSender, ...
Inconsistent naming between source and sink events, etc... is this solved now?
Having to keep in my head which states exist, understand which events cause which state transitions
Having to understand what happens in each state, what states can't be transitioned to until some other transition takes place... i.e. dependent transitions that might even block
Lack of visualizations for what is happening
Some lack of "intuitiveness" around things like blocks already being in the blockstore before they are added to pending. This is because raw blocks need to be blocks in order to have children, and some computations can only run once we have access to those children. Perhaps this is just a desire for more descriptive names, or I need to create a writeup of the overall algorithm.
Generics add some additional mental overhead, can't entirely be avoided
Interfaces add some additional mental overhead that can't entirely be avoided, like keeping track of the interfaces vs the structs that implement one or more of them. Single entities map better onto human storytelling.
Having to look at the core algorithm implemented in core, plus the specifics in batch, which makes it hard to look in a single place and see what is happening.
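
One way to address the points above about states, transitions, and visualizations is to keep every legal transition in a single table and generate a diagram from it. The following is only a minimal sketch: the `State` and `Event` names, the specific states, and the `Next` and `Dot` helpers are all hypothetical and are not taken from go-car-mirror.

```go
package main

import (
	"fmt"
	"strings"
)

// State and Event are hypothetical names used for illustration only.
type State int

const (
	Idle State = iota
	Sending
	Waiting
	Closed
)

type Event int

const (
	BeginSend Event = iota
	EndSend
	Receive
	Close
)

func (s State) String() string {
	return [...]string{"Idle", "Sending", "Waiting", "Closed"}[s]
}

func (e Event) String() string {
	return [...]string{"BeginSend", "EndSend", "Receive", "Close"}[e]
}

// transitions lists every legal (state, event) pair in one place,
// so nobody has to reconstruct the state machine from scattered code.
var transitions = map[State]map[Event]State{
	Idle:    {BeginSend: Sending, Close: Closed},
	Sending: {EndSend: Waiting, Close: Closed},
	Waiting: {Receive: Idle, Close: Closed},
}

// Next returns the state reached from s on event e, or an error if
// the table does not allow that transition.
func Next(s State, e Event) (State, error) {
	next, ok := transitions[s][e]
	if !ok {
		return s, fmt.Errorf("illegal transition: %v on %v", s, e)
	}
	return next, nil
}

// Dot renders the table as a Graphviz digraph. Edge order varies
// between runs because Go map iteration order is not fixed.
func Dot() string {
	var b strings.Builder
	b.WriteString("digraph states {\n")
	for from, byEvent := range transitions {
		for ev, to := range byEvent {
			fmt.Fprintf(&b, "  %v -> %v [label=%q];\n", from, to, ev)
		}
	}
	b.WriteString("}\n")
	return b.String()
}

func main() {
	s := Idle
	for _, e := range []Event{BeginSend, EndSend, Receive, Receive} {
		next, err := Next(s, e)
		if err != nil {
			fmt.Println(err)
			break
		}
		fmt.Printf("%v --%v--> %v\n", s, e, next)
		s = next
	}
	fmt.Print(Dot())
}
```

The output of `Dot` can be piped into Graphviz to get a picture of the state machine, and the same table doubles as documentation of which transitions are allowed.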

ETA: 2023-12-31

justindotpub commented 1 year ago

Dumping messy notes here for later cleanup.

Better docs and diagrams
In general, making it easy to implement is a goal, per spec. Anything in this category.
General task of making current code better organized, easier to reason about.
Clarify precisely what we aren't implementing today, at the very least for purposes of knowing what we can benchmark. e.g. we don't have any session cache on the provider / responder currently, or on the requester after a session is complete or for any concurrent but overlapping requests. And no cold call bloom currently.
- This info should probably be added to our milestones as well. Checklists. Easy place to see exactly what is done and what remains.
Code refactoring, getting abstractions right
- Requester vs responder
- Orchestrator / state and where that is passed around
- Events and state, what goes in core vs batch, think through if it makes more sense to just keep separate loops and state etc vs trying to have a single unified approach
Terminology, naming (e.g. sessions, or events, or Batch*... see notes)
- Requester / provider vs requester / responder?
- Terminology - are have filter and want list bad? Does want list imply not just roots? i.e. have is all CIDs, want is just roots.
- Rename diff back to ptr?
Instrumentation code refactoring. I'm thinking the wrapped / duped structs should just be removed, with the instrumentation code put directly where we want it, like other projects I've seen. Less maintenance as well.
BlockChannel and StatusChannel refactoring in test... ugly stuff I rushed in related to getting access to orchestrator
batch and http separation
Make sure batch abstractions are clear, with distinct rounds (request / response), not merging that logic with streaming in such a way that things are muddled and hard to decipher, like I've been running into.
Reorg abstractions so that channel / streaming assumptions about the world don't mess up the code, like they did with channels and the ability to send data at any time, even though in http you need a request in order to respond
Moving util stuff into dedicated packages
Remove unused code
Go patterns we aren't following
- Contexts
Remove unnecessary error handling, like where things just can't fail
For error handling, always handle the error case first, so the easy to see happy path stays further to the left
Think about what client side usage of go-car-mirror would look like outside of kubo. e.g. if say Bluesky wanted to use it.

justindotpub commented 1 year ago

Inconsistent placement of sync/atomic logic... makes it hard to know where to wrap and where not
- NewSourceSession takes a filter but doesn't wrap anything in mutex locks.
- NewSimpleStatusAccumulator takes a filter but wraps it in mutex locks.
- Those were the examples I wrote down, but in general, make approach to sync vs non-sync code consistent so it is easy to know what to do.

justindotpub commented 1 year ago

Cleanup config
- Merging of config for different parts of the system, like http vs responder config
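
As one possible shape for this, the shared settings could live in a single struct that the http-specific config embeds, so there is one source of truth and one defaults function. This is only a sketch under that assumption; the type names, field names, and default values below are invented for illustration and do not reflect the current go-car-mirror config.

```go
package main

import "fmt"

// ResponderConfig holds settings shared by every transport.
// Field names and defaults are hypothetical.
type ResponderConfig struct {
	MaxBlocksPerRound uint32
	BloomCapacity     uint32
}

// HTTPServerConfig embeds the shared config and adds only what is
// specific to http, so shared fields are declared exactly once.
type HTTPServerConfig struct {
	ResponderConfig
	Address string
}

// DefaultHTTPServerConfig is the one place where defaults are set.
func DefaultHTTPServerConfig() HTTPServerConfig {
	return HTTPServerConfig{
		ResponderConfig: ResponderConfig{
			MaxBlocksPerRound: 100,
			BloomCapacity:     1024,
		},
		Address: ":8080",
	}
}

func main() {
	cfg := DefaultHTTPServerConfig()
	// Embedded fields are promoted, so callers override them directly.
	cfg.MaxBlocksPerRound = 50
	fmt.Printf("%s %d %d\n", cfg.Address, cfg.MaxBlocksPerRound, cfg.BloomCapacity)
}
```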

fission-codes / go-car-mirror

Reduce cognitive overload #86