bitcoin / bitcoin

Bitcoin Core integration/staging tree

Libbitcoinkernel Project Tracking #27587

Open TheCharlatan opened 1 year ago

TheCharlatan commented 1 year ago

Project Board: https://github.com/orgs/bitcoin/projects/3

This is the tracking issue for the libbitcoinkernel project. The original tracking issue can be found at https://github.com/bitcoin/bitcoin/issues/24303.

The libbitcoinkernel project is a new attempt at extracting Bitcoin Core's consensus engine. The "kernel" part of the name highlights one of the key functional differences from the deprecated libbitcoinconsensus and, in fact, from most libraries: it is a stateful library that can spawn threads, do caching, do I/O, and many other things that one may not normally expect from a library.

This statefulness is necessary for libbitcoinkernel's decidedly incremental approach to extracting our consensus engine. This approach favors:

  1. Reusing existing code ...which allows us to be continually integrated with Bitcoin Core and benefit from our extensive test suite

  2. Incremental decoupling instead of building from scratch ...which allows us to avoid having to prematurely optimize for a "perfect" boundary or API (tends to be highly subjective, non-obvious, may lead to unproductive bike-shedding before we've even done anything meaningful)

The work of extracting the validation engine into a library and making the API ergonomic is likely to be a multi-release project involving multiple contributors. The incremental approach takes this into account and respects the sheer amount of work (both in writing code and getting it through review) that needs to be undertaken.

PRs

Please see the project board: https://github.com/orgs/bitcoin/projects/3

The Game Plan

Stage 1: Extracting out a usable libbitcoinkernel.{so,dylib,dll}

The first stage of this project can be considered complete. The bitcoin-chainstate executable uses the validation engine, and its build system code reveals the minimal set of files that must be linked in to use the consensus engine as-is. Over time, these files were further pruned to include only functionality that is strictly required for validation. Any future coupling of validation code with non-validation modules will result in linker errors.

The mempool is not decoupled, because some users of libbitcoinkernel may want to have an embedded mempool with Bitcoin Core's policies. Instead, this kind of "Bitcoin Core specific" functionality (which also includes assumevalid, assumeutxo, and checkpoints) should be completely optional and configurable by the user.
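One way to picture "completely optional and configurable" is an options struct in which every Bitcoin Core specific feature is disabled unless the user opts in. The following is a minimal sketch only: `KernelOptions`, its fields, and `IsValidationOnly` are hypothetical names invented for illustration and do not correspond to the actual kernel interface.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical options struct, NOT the real libbitcoinkernel interface.
// Every Bitcoin Core specific feature is opt-in and defaults to "disabled".
struct KernelOptions {
    // Hex-encoded hash of the assumevalid block; unset disables assumevalid.
    std::optional<std::string> assumevalid_block_hash;
    // Whether loading an assumeutxo snapshot is permitted.
    bool allow_assumeutxo{false};
    // Whether hard-coded checkpoints are enforced.
    bool use_checkpoints{false};
    // Size limit in bytes for an embedded mempool; unset means no mempool.
    std::optional<std::uint64_t> mempool_max_bytes;
};

// Returns true if no Bitcoin Core specific feature is enabled, i.e. the
// library would be used for validation only.
bool IsValidationOnly(const KernelOptions& opts)
{
    return !opts.assumevalid_block_hash.has_value() &&
           !opts.allow_assumeutxo &&
           !opts.use_checkpoints &&
           !opts.mempool_max_bytes.has_value();
}
```

A default-constructed `KernelOptions` describes a validation-only setup; a user who wants an embedded mempool would set `mempool_max_bytes` explicitly.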

The current design of the mempool within the validation code requires the boost headers to be exported. This is not ideal, since it forces users of the kernel library to include the boost headers too if they wish to use this functionality. Removing them was attempted in https://github.com/bitcoin/bitcoin/pull/28335, but the approach taken received mixed reviews. In the future, this might be solved by only exporting a C header to users, which would internalize any exported boost symbols.
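How a C header can keep implementation types internal is easiest to see with an opaque handle. The sketch below is an illustration under assumptions, not the project's design: the `kernel_mempool_*` names are hypothetical, and `std::multiset` merely stands in for the boost container used by the real mempool.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <new>
#include <set>

// ---- What the exported C header would contain (hypothetical names) ----
extern "C" {
// Opaque handle: users only see a pointer to an incomplete type, so no
// container headers are needed to compile against this API.
typedef struct kernel_mempool kernel_mempool;

kernel_mempool* kernel_mempool_create(void);
int kernel_mempool_add_fee(kernel_mempool* pool, int64_t fee);
size_t kernel_mempool_size(const kernel_mempool* pool);
void kernel_mempool_destroy(kernel_mempool* pool);
}

// ---- Implementation, compiled into the library only ----
struct kernel_mempool {
    std::multiset<int64_t> fees; // stand-in for the real mempool container
};

extern "C" kernel_mempool* kernel_mempool_create(void)
{
    return new (std::nothrow) kernel_mempool{};
}

// Returns 0 on success and -1 on failure.
extern "C" int kernel_mempool_add_fee(kernel_mempool* pool, int64_t fee)
{
    if (pool == nullptr) return -1;
    try {
        pool->fees.insert(fee);
        return 0;
    } catch (...) {
        return -1; // never let a C++ exception cross the C boundary
    }
}

extern "C" size_t kernel_mempool_size(const kernel_mempool* pool)
{
    return pool == nullptr ? 0 : pool->fees.size();
}

extern "C" void kernel_mempool_destroy(kernel_mempool* pool)
{
    delete pool; // deleting a null pointer is a no-op
}
```

Because the struct definition lives only in the library's own translation unit, swapping the container for a different one would not change anything a user compiles against.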

libbitcoinkernel could also be used as an internal library for libbitcoin_node. The desired library organization is shown in doc/design/libraries.md. This is attempted in #28690.

Stage 2: Ship an external kernel API

Now that an internal kernel library with a limited feature set exists, expose it to external users with a C header. Start with a reduced header, exposing just enough functionality to build a utility tool. Ideas for such a utility tool could be a rewrite of bitcoin-chainstate using the C header's API, or a reindexer tool. These tools could also be written in other languages, such as Rust or Python. Care should be taken that the API is as consistent as possible in all respects, manages its own memory to the extent possible, minimizes type conversions between C and C++, and keeps versioning in mind.

Ideally, users looking to integrate with libbitcoinkernel will provide input on which library features would be desirable to expose. The API will initially have a very idiosyncratic, Bitcoin Core-specific interface. Continual polishing over multiple versions will incrementally make the libbitcoinkernel API more ergonomic for users outside of Bitcoin Core. Note, though, that there is a possible tension here between optimizing the library interface for external users and optimizing it for internal use within Bitcoin Core.

Another goal of this final stage of the project could be to ship an "IO-less" version of the library. This would mean that the block store and coins database would have to be abstracted so that users can provide their own implementations. Optionally, it should also avoid threads, atomics, and other platform-dependent features. This would allow the library to target bare-metal environments, such as those targeted by riscv-unknown-elf-g++.
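The "user provides their own implementation" idea can be sketched as a table of callbacks that the library calls instead of touching the disk itself. This is only an illustration: `kernel_block_store`, `MemWrite`, `MemRead`, and `StoreAndVerify` are hypothetical names, and a real interface would need to cover far more than reading and writing blocks by height.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <cstring>
#include <map>
#include <vector>

// Hypothetical callback table a user fills in to supply block storage.
extern "C" {
typedef struct kernel_block_store {
    void* user_data;
    // Store `len` bytes for the block at `height`; return 0 on success.
    int (*write_block)(void* user_data, uint32_t height, const unsigned char* data, size_t len);
    // Copy up to `cap` bytes of the block at `height` into `out`.
    // Returns the stored size, or -1 if no such block exists.
    long (*read_block)(void* user_data, uint32_t height, unsigned char* out, size_t cap);
} kernel_block_store;
}

// ---- User side: an in-memory store, no files involved ----
using MemoryBlocks = std::map<uint32_t, std::vector<unsigned char>>;

extern "C" int MemWrite(void* user_data, uint32_t height, const unsigned char* data, size_t len)
{
    auto* blocks = static_cast<MemoryBlocks*>(user_data);
    try {
        (*blocks)[height].assign(data, data + len);
        return 0;
    } catch (...) {
        return -1; // report allocation failure instead of throwing
    }
}

extern "C" long MemRead(void* user_data, uint32_t height, unsigned char* out, size_t cap)
{
    auto* blocks = static_cast<MemoryBlocks*>(user_data);
    const auto it = blocks->find(height);
    if (it == blocks->end()) return -1;
    const size_t n = it->second.size() < cap ? it->second.size() : cap;
    if (n > 0) std::memcpy(out, it->second.data(), n);
    return static_cast<long>(it->second.size());
}

// ---- Library side: only ever talks to the callback table ----
// Writes a block and reads it back through the user-supplied store.
bool StoreAndVerify(const kernel_block_store& store, uint32_t height,
                    const std::vector<unsigned char>& block)
{
    if (store.write_block(store.user_data, height, block.data(), block.size()) != 0) return false;
    std::vector<unsigned char> copy(block.size());
    const long n = store.read_block(store.user_data, height, copy.data(), copy.size());
    return n == static_cast<long>(block.size()) && copy == block;
}
```

Passing a `user_data` pointer alongside plain function pointers keeps the table usable from C and from language bindings, at the cost of the type safety that a C++ virtual interface would give.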

Getting libbitcoinkernel Through Review

The project will touch a lot of code that has not been touched in some time, which might resurface older issues. Any outstanding review comments not pertinent to the main thrust of a PR will be noted in the TODO section below and then either triaged in an issue or addressed in a separate PR.

Action Items

  1. If you have any questions, please post them below!
  2. If you have any ideas for the future direction of "Stage 2: Polishing the API / Continual De-coupling", please leave a comment below, I'd love to talk!

Project-wide TODOs

These are suggestions for further cleanup and improvements that came up during review:

Other various items that arose during review and should be tracked

willcl-ark commented 1 year ago

Would it make sense to update the project page to https://github.com/orgs/bitcoin/projects/3/views/1 (which if I'm not mistaken seems to be the current one)?

Could keep a link to the old project https://github.com/bitcoin/bitcoin/projects/18 for reference?

TheCharlatan commented 1 year ago

Re https://github.com/bitcoin/bitcoin/issues/27587#issuecomment-1568419077

> Would it make sense to update the project page to https://github.com/orgs/bitcoin/projects/3/views/1 (which if I'm not mistaken seems to be the current one)?

Thanks for the notice, updated to the new one.

> Could keep a link to the old project https://github.com/bitcoin/bitcoin/projects/18 for reference?

There is no useful extra information on the old board that is not in the new one, so I don't think this is necessary.

darosior commented 1 year ago

@TheCharlatan sometimes mentions sourcetrail as a tool to visually inspect the currently required kernel headers. I've set it up and figured it could be worth sharing with anyone who's interested in having a quick look at the trail of headers without having to set up the tool themselves.

The list of headers included in libbitcoinkernel as of current master d2ccca253f4294b8f480e1d192913f4985a1af08: bitcoin-chainstate_sourcetrail

fanquake commented 1 year ago

Nice. Played around with this. If you merge a couple of things, and turn on C++20, so we can use <bit> (see Cory's branch), bitcoin-config.h disappears:

less_headers

TheCharlatan commented 5 months ago

Updated with initial description of stage 2.