Closed hannahhoward closed 3 years ago
I'm keen for more information to help us make a decision here - right now I'm cautiously in favour. In particular I'd like more info about concrete expected benefits, and our understanding of the state of completeness and test thoroughness for go-ipld-prime
.
Since we are already depending on it transitively through graphsync selectors, switching to it wholesale will reduce the total amount of code.
I note that, in informal discussion elsewhere, the idea of using a different, single-purpose CBOR-encoding library has been raised, with the intention of further reducing code surface and improving performance of the state-tree-update-persistence hot path. This hasn't been investigated deeply, though.
Tiny sitrep (specifically in regard to the mention of going for other single-purpose encoding libraries): I'm confirmed for-real-hands-on-the-keyboard-It's-Happening putting cycles into codegen now. Still early so any estimates I might have for delivery dates would have several sigma on them, but it's moving. This means we can anticipate getting codegen'd -- therefore nonreflective, therefore potentially much faster -- marshal and unmarshal code paths. So I'd advocate to double-down and spend resources on this over breaking any new ground in a separate parallel direction.
It's likely we'll need either codegen -or- some work to make go-ipld-prime
Node
interfaces that are backed by refmt'ian atlas
es before filecoin can fully use go-ipld-prime
. Both of these tactics would result in reusable features in go-ipld-prime
. A third option is to write some manual functions which use go-ipld-prime
Node
directly and manually flip the data into filecoin's native golang structures.
I think @whyrusleeping has more opinions on the costs of the latter two than I do. (ISTM that the last time he toured me around the filecoin codebase, there were way more custom "transform" funcs than I expected, which means much of the work of an option 3 has already been... more or less done, to my surprise. So those funcs would need massaging, but would be roughly topologically similar to the existing ones..?) Also worth noting I'm a tad wary of the choice between options 1 and 2 because they take substantially different development paths and almost totally disjunct sets of mental context (one is very reflect
-heavy; the other is very not), so at least for my own brain-seconds budget, it's definitely a "pick one" scenario.
In general, if there's a way to spike things out in a relatively noncommittal way (ideally, that's also benchmarkable without exercising a full filecoin application?), that might be particularly constructive.
I'm definitely around for a more synchronous discussion about this :+1:
@hannahhoward if you want to champion this, I think the next step is a design doc laying out the potential benefits and costs, and work involved.
Description
For some time, @warpfork and the IPLD team have been working on a replacement library - https://github.com/ipld/go-ipld-prime - for
go-ipld-cbor
. This library is the place for most futuregolang
IPLD development, and the place important features like IPLD Selectors and encoding/decoding improvements are likely to land. We have already started to introduce go-ipld-prime in the chainsync process through GraphSync, which relies on go-ipld-prime to implement selectors. Replacinggo-ipld-cbor
withgo-ipld-prime
would at minimum allow us to significantly reduce the number of serialization / deserializations that occur during a Graphsync fetch during chainsync. Further,go-ipld-prime
could allow us to make several optimizations to the serialization / deserialization process in general, which might result in several speed improvements throughout the codebase.Acceptance criteria
This issue serves a simply a starting point for discussions of what a replacement of
go-ipld-cbor
withgo-ipld-prime
might look like. While the ultimate acceptance criteria is probably thatgo-ipld-cbor
is not a dependency ofgo-filecoin
, the transition would likely happen in stagesRisks + pitfalls
go-ipld-prime
is a new, in development library that is not fully mature. @warpfork can probably shed more light on what is done vs not done.performance characteristics of
go-ipld-prime
are not well known.go-ipld-cbor
is heavily integrated throughout the code base -- there may be lots of hidden dependencies that aren't fully revealed until we attempt to transition.Where to begin
Perhaps simply a spike attempting to replace as many uses of
go-ipld-cbor
would be a good starting point.Synchronous discussion with @warpfork to discuss
go-ipld-prime
and what an integration might look like.