dr-orlovsky opened this issue 4 years ago
LOL, just wanted to open a new issue since I missed this one, but got a nice heads up from GitHub. Here's what I wanted to write:
I was thinking for some time that it might be useful to standardize at least a minimal RPC protocol for the most basic operations:
The current situation, where each LN implementation has its own RPC, is quite terrible: it causes a lot of code duplication, and leads to things like only Eclair supporting Turbo channels while RTL doesn't support Eclair, so I can't have both.
I'm not sure whether it's better to take one of the existing protocols and specify minimum features, or to create a new set of highly-formalized protocols. (I was thinking of modeling it on Rust's strong type system with serde.)
End of what I intended to write.
I like your summary! Will think about it. One thing that comes to my mind is: you seem to want to avoid code generation for a good reason. Code generation has some nice benefits. Can we get the benefits of codegen AND security of less codegen? First idea that comes to my mind:
Let's not forget that codegen also provides security: the long history of manually-written parsers with various vulnerabilities or annoying compatibility/logic bugs should be a sufficient argument. :)
Orthogonal issue: how to connect the various services together easily? On the same machine, I did the interface files proposal. I'm thinking about how to enable remote communication as well, ideally without requiring people to configure each service separately. (That means connecting to my remote `electrs` and `eclair` from a laptop in a single step.)
> One thing that comes to my mind is: you seem to want to avoid code generation for a good reason. Code generation has some nice benefits. Can we get the benefits of codegen AND security of less codegen?
It's not only me, it's most of the dev community in the sphere of consensus-critical protocols. I raised this question some time ago with Bitcoin Core, and after that with other parts of the community. Sometimes it goes all the way back to a Satoshi quote: https://bitcointalk.org/index.php?topic=632.msg7090#msg7090. But all agree to avoid codegen in everything related to consensus-critical parts (including P2P protocols/APIs). One of the discussions you may find here: https://github.com/rgb-org/spec/issues/84. Another one is here: https://t.me/rgbtelegram/1470 all the way up to here: https://t.me/rgbtelegram/1522
However, it still can be used in any client-facing APIs without any problems! But that would be vendor-specific (which does not preclude it from being standardized across the industry, like another LNP/BP standard). And I am up for the work to do it!
In this regard, your points are a good working ground. Let's try to experiment with that.
> Orthogonal issue: how to connect the various services together easily? On same machine,
I am already thinking about, and experimenting a bit with, a ZMQ DSL for IPC in Rust. Maybe it can be done with simple derives, without codegen. I would be pleased to join forces in that effort. Here is my current take on it: https://github.com/LNP-BP/lnp-node/tree/master/src/msgbus
Here is a sample of how data structure definition based on ZMQ can look like: https://github.com/LNP-BP/lnp-node/blob/8a95459a898f595fd1087e6a338d33e64090bd0b/src/msgbus/proc/connect.rs#L22-L53
And here is the RPC part:
It is without `derive`s yet, but they can be quite simply added; @elichai did a great crate, derive-wrapper (https://github.com/elichai/derive-wrapper), which he gladly can extend to cover those many `From`s that are required.
Good points about consensus, I agree completely. I guess maybe if there was a special tool for that it might be feasible, but quite likely not worth the effort.
My interest in "doing something with an API" is primarily about client RPC, not P2P. I think codegen should be preferred there, at least for one additional reason: automated codegen across different languages.
I'm not really sure it's worth reinventing the wheel here, unless all existing solutions are very bad. gRPC seems to be the leader, but there are other interesting projects. I quite like Cap'n Proto, but haven't tried it out myself yet. I also like the simplicity of just deriving serde, but that could be sub-optimal, since it can't be translated to other languages.
Another interesting option is Swagger, but I'm not sure if forcing HTTP is a good idea. TBH, I'm not a huge fan of HTTP for RPC, since it adds overhead without significant value. Yes, browsers can use it, but browsers can easily use websockets translated to whatever other transport is used. I even made a tool for unix sockets: https://github.com/Kixunil/ws-unix-framed-bridge
Regarding the orthogonal issue, I didn't mean the communication protocol, but the "configuration protocol". I'm currently thinking about using interface files with environment variables, something like `INTERFACE_LNP_BP=/etc/interfaces/lnp_bp`, but that also needs a meta file so that we know what interfaces the application can work with.
Agree on API thing.
A few comments (in general alignment with what you've said):
In general, after a decade of REST popularity, RPC is striking back because of ZMQ/microservice architectures in the enterprise and Websocket/push/publication models on the Web. The Bitcoin and Lightning models are also not that resource-oriented, rather procedural, so let's stick to RPC-like solutions, leaving REST for edge cases like blockchain explorers etc.
What's exactly the problem with hashes and public keys? gRPC has bytes data type, so that should work, or am I missing something?
Yeah, let's avoid REST if possible.
It's preferred to have a fixed-length serialization for them to avoid in-flight data modifications/attacks. Also, endianness sometimes plays tricks with them.
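To illustrate the point, a minimal sketch (the function names are hypothetical): with a fixed-length, explicitly big-endian encoding, the decoder can never mis-slice the stream or misread the byte order.

```rust
// Minimal illustration of fixed-length, explicit-endianness serialization.
// A block height is always exactly 4 bytes, always big-endian on the wire,
// so there is no ambiguity a man-in-the-middle or a buggy peer can exploit.
fn encode_height(height: u32) -> [u8; 4] {
    height.to_be_bytes()
}

fn decode_height(bytes: [u8; 4]) -> u32 {
    u32::from_be_bytes(bytes)
}
```

A variable-length or platform-endian encoding would make both the length and the interpretation of those bytes depend on context, which is exactly what the comment above warns against.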
In general, very good talk on data serialization design in APIs can be found here: https://developer.apple.com/videos/play/wwdc2018/222/
@Kixunil In implementing bp-node and lnp-node (bitcoin and lightning nodes implementing things required for RGB & many other L2/L3 stuff, including DLCs, PTLCs etc) I found that priority number 1 for me (in terms of APIs) is a serialization of common data structures (primitives + bitcoin/lightning-specific) for ZMQ. I will be working on it from next week. Right now I am contemplating writing a custom Serde binary serializer as the simplest and fastest option. Do you have any considerations/other suggestions?
My original take was just to implement `From`s for `zmq::Message` <-> data types (I gave some samples in the comments above); but this does not work smoothly with primitives and bitcoin/lightning data types, since they are external, and in Rust you can't `impl` an external trait (like `From` or `TryFrom`) for an external type (these are defined by the rust-bitcoin and rust-lightning libs). You can do it with a wrapper, but it creates so much boilerplate code that I assume usage of Serde (which can already do all that boilerplate with `derive`s) is the best way forward.
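A tiny sketch of the orphan-rule problem and the newtype workaround described above (`Txid` and `Message` here are hypothetical stand-ins for the rust-bitcoin and zmq types):

```rust
// Simulating an external crate; in reality this would be e.g. rust-bitcoin.
mod external {
    pub struct Txid(pub [u8; 32]);
}

// Stand-in for zmq::Message (also external in real code).
pub struct Message(pub Vec<u8>);

// If both `From` and the two types were external, this would be rejected
// with E0117 (the orphan rule):
//     impl From<external::Txid> for zmq::Message { ... }
//
// The workaround: a local newtype wrapper, legal because the wrapper is
// ours -- at the cost of boilerplate at every call site.
pub struct TxidWrapper(pub external::Txid);

impl From<TxidWrapper> for Message {
    fn from(wrapped: TxidWrapper) -> Message {
        Message((wrapped.0).0.to_vec())
    }
}
```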
Actually, got a better idea than Serde. Will post solution soon.
In general, very good talk on data serialization design in APIs can be found here: https://developer.apple.com/videos/play/wwdc2018/222/
Yeah, I do pretty much exactly what he's talking about, just with `Result`s for error handling (thank God for it, it's much more elegant than "set error and return nil").
I think it shouldn't be hard to persuade other friendly libs to accept (feature-gated) serialization PRs. Worst case, serde has `deserialize_with`.
Anyway, looking forward to your idea!
Sometimes the Rust compiler gives so much pain with generics, unlike C++...
Here is my current state of experimentation (not working yet): https://github.com/LNP-BP/rust-lnpbp/commit/cf2317b72d04195e8ca588c2c6dcbe7deee0cac4
Spent the whole day... The bottom line is: any two blanket implementations of a trait `A`, generic over two distinct traits `B` and `C`, always fail, even if all three traits are local: the compiler either says "upstream may implement this trait" (if some of them are not local) or "downstream may implement this trait" if they are all local. So it just does not work. I'm talking about this:
https://github.com/LNP-BP/rust-lnpbp/commit/cf2317b72d04195e8ca588c2c6dcbe7deee0cac4#diff-1f71c41ac92514987c842bb92e7a92cfR52-R80
I.e. this will always fail:

```rust
trait A { }
trait B { }
trait C { }
impl<T> A for T where T: B { }
impl<T> A for T where T: C { }
```
Even negative traits are not working as promised: `impl<T> !B for T where T: C { }` gives a compiler error; however (surprisingly!) `impl !B for dyn C { }` compiles without any problem, but does nothing!
And this is the case when you have marker traits that allow you to separate distinct types; otherwise it's a simple orphan trait impl problem...
Serde dealt with that using plenty of macros, including derives, basically generating wrappers for any type.
BTW, `#[repr(transparent)]` is not working either: neither for generic wrappers, nor for simple ones...
> It is without `derive`s yet; but they can be quite simply added; @elichai did a great crate derive-wrapper (https://github.com/elichai/derive-wrapper) which he gladly can extend to cover those many `From`s that are required.
Actually, they are already covered: https://github.com/elichai/derive-wrapper/issues/2
I think I understand what you're trying to do, but I'm not 100% sure. There's a technique I invented using a marker trait. It's used in `embedded-hal`, and I use a modified version of it in `parse_arg` as well.
I wrote a trivial demonstration of the marker trait idea for this case.
An annoying thing about it is that you can only define the implementation of your trait in terms of one other trait (e.g. I can't provide a `ParseArgUsingTryFrom` marker that'd defer to `TryFrom`), but I think I have a workaround for that. Going to try it out.
Good news, I've figured it out!
It needs a few marker types, but nothing terrible. The advantages of that approach:

- You can impl `Encode` for any type you'd like if that type impls some other interesting trait already.
- It is `no_std`-compatible.
- Different types can implement `Encode` using the same trait slightly differently. For instance, you could somehow transform error types in one impl. (But in the end, the type still has to have a single implementation of `Encode`.)

In case my last point (in parentheses) bothers you, another alternative is to just parametrize `Encode` by a `Strategy` with a default. Then it's possible to have a wrapper for forwarding. I did that in `fast_fmt`. But it's probably not nice in situations in which wrappers are not nice either (no experience with that). Maybe it's possible to somehow combine the two ideas, but it sounds very wild. :rofl:
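The "parametrize `Encode` by a `Strategy` with a default" alternative can be sketched like this (all names are made up for illustration; `fast_fmt` does something similar in spirit):

```rust
// A trait parametrized by a strategy marker, with a default strategy,
// so a single type may carry several Encode impls at once.
struct DefaultStrategy;
struct DisplayStrategy;

trait Encode<S = DefaultStrategy> {
    fn encode(&self) -> Vec<u8>;
}

// u32 encoded as raw little-endian bytes (the default strategy)...
impl Encode for u32 {
    fn encode(&self) -> Vec<u8> {
        self.to_le_bytes().to_vec()
    }
}

// ...and the very same u32 encoded as decimal text under another strategy.
impl Encode<DisplayStrategy> for u32 {
    fn encode(&self) -> Vec<u8> {
        self.to_string().into_bytes()
    }
}
```

The call site picks the strategy explicitly, e.g. `<u32 as Encode<DisplayStrategy>>::encode(&5)`; the default kicks in when no parameter is given.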
Very good concept indeed! Thank you! Will try to apply it.
What I am trying to do is to get the ability to pass data structures between processes, or through client-server APIs, with binary ZMQ (both REQ/REP and PUB/SUB) at the lowest coding cost. In the case of bitcoin/lightning-related protocols (and RGB) this means that I can just "inherit" already-existing serialization methods for bitcoin data structures (blocks, transactions and related stuff) from the bitcoin wire protocol, plus BOLT-related serializations and RGB-related client-validated data serializations. So you are right: I am trying to gather several binary serializers (already implemented elsewhere) under the same hood, and I am sure that each of the data structures I am using has one and only one serializer available. With these serializers, I just construct a single ZMQ message per data structure, and for complex requests I assemble them into multipart packets (a feature of ZMQ).
At the end of the day, I hope it will end up with a simplified concept for RPC API definition: just a Rust struct with a few `derive`s to generate the required `From`s. This Rust code can be used as a DSL to generate code for other languages as well, in an automated fashion.
I'm very happy to hear that! According to your description, the solution looks really good.
I will need to look at ZMQ better (I have some small prior experience) to see if there's more that can be done about it.
Using Rust struct as DSL is something that I was thinking about too, but so far it feels like it'd be quite hacky. I'm still willing to give it a second look.
> Good news, I've figured it out!
I started with generics in 1995, when Borland was just doing their first versions of Turbo C++ supporting generics ("templates"), and I have followed generic concept development through these decades... But your code scares the shit out of me :) I am trying to comprehend it
> I will need to look at ZMQ better (I have some small prior experience) to see if there's more that can be done about it.
Actually, ZMQ is so damn simple that there is nothing to look at. What it does happens almost entirely under the hood and does not affect data structures in any way: the ZMQ lib manages to make network communications reliable with message queues. This has no implications for the code: you are just using usual binary sockets, which simply do not fail if the remote is not there, and which are 100% async. They also support all the flavors of many-to-many communication without you noticing.
So you may think of ZMQ as a really async TCP or UDP (working over IPC sockets/file streams as well) where your messages can be multipart (consisting of a number of packets), and you always know that you get the whole (multipart) message, not a part of it.
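A sketch of the "one frame per data structure, one multipart message per request" idea, with frames modeled as plain byte vectors (the command name and layout are hypothetical; real code would hand these frames to the `zmq` crate, which delivers all frames of a multipart message atomically):

```rust
// A frame is one ZMQ message part; a request is a sequence of frames.
type Frame = Vec<u8>;

// First frame names the procedure; each remaining frame carries one
// independently encoded data structure.
fn assemble_request(command: &str, payloads: &[&[u8]]) -> Vec<Frame> {
    let mut frames = vec![command.as_bytes().to_vec()];
    frames.extend(payloads.iter().map(|p| p.to_vec()));
    frames
}

// The receiver always sees the complete multipart message, never a prefix,
// so parsing can assume all frames are present.
fn parse_request(frames: &[Frame]) -> Option<(String, Vec<Frame>)> {
    let (cmd, rest) = frames.split_first()?;
    Some((String::from_utf8(cmd.clone()).ok()?, rest.to_vec()))
}
```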
When thinking about Rust generics, you just need to think in terms of logic. The math keywords are already there: `for`, `where`. `impl` basically means "there exists exactly one". Another way to look at it is that they are type-level functions (but I personally don't hold this view very naturally in my head).
The trick with `Helper` is a workaround for Rust not being able to recognize that these impls:

```rust
impl<T> Trait1 for T where T: Trait2<Assoc=A> {}
impl<T> Trait1 for T where T: Trait2<Assoc=B> {}
```

do not overlap. We know that, because `T` can have only one impl of `Trait2`, a different `Assoc` implies a different `T`; but rustc currently doesn't understand this. Relevant issue: https://github.com/rust-lang/rust/issues/20400
If rustc could understand that, there'd be no helper and we would just write:

```rust
impl<T> Encode for T
where
    T: Into<Message> + EncodeUsingOtherTrait<Strategy = IntoStrategy>,
{
    // ...
}
```
I hope the thing above is significantly clearer. :)
Fortunately, rustc knows that `Helper<T, A>` is different from `Helper<T, B>`, so we implement `Encode` for the `Helper`, and then we implement `Encode` for all types that can use a `Helper`, with the strategy defined by the `EncodeUsingOtherTrait` associated type.
Hope this helps, let me know if you need more help understanding something.
One more thing worth noting: coherence requires us to have `Encode`, `EncodeUsingOtherTrait` and the blanket impl in the same crate.
Another thing that I consider important: it'd be possible to just use a tuple instead of `Helper`, but I think it'd be confusing for people, and may be problematic if you wanted to implement `Encode` for actual tuples.
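Putting the pieces above together, here is a minimal self-contained sketch of the technique. All concrete names (`Message`, `Ping`, `Height`, the strategy markers) are hypothetical stand-ins; real code would target `zmq::Message` and bitcoin/lightning types:

```rust
use std::fmt;
use std::marker::PhantomData;

// Stand-in for the wire message type.
#[derive(Debug, PartialEq)]
pub struct Message(pub Vec<u8>);

// The trait we want many (often external) types to implement.
pub trait Encode {
    fn encode(&self) -> Message;
}

// Marker types naming the available encoding strategies.
pub struct IntoStrategy;
pub struct DisplayStrategy;

// Each type opts into exactly one strategy via an associated type.
pub trait EncodeUsingOtherTrait {
    type Strategy;
}

// Helper<T, A> and Helper<T, B> are distinct types, so the two Encode
// impls below can never overlap -- this is the whole trick.
pub struct Helper<T, S>(T, PhantomData<S>);

// Strategy 1: encode by converting into Message.
impl<T> Encode for Helper<T, IntoStrategy>
where
    T: Clone + Into<Message>,
{
    fn encode(&self) -> Message {
        self.0.clone().into()
    }
}

// Strategy 2: encode via Display.
impl<T> Encode for Helper<T, DisplayStrategy>
where
    T: fmt::Display,
{
    fn encode(&self) -> Message {
        Message(self.0.to_string().into_bytes())
    }
}

// The single blanket impl: defer to the Helper picked by the marker.
impl<T> Encode for T
where
    T: EncodeUsingOtherTrait + Clone,
    Helper<T, T::Strategy>: Encode,
{
    fn encode(&self) -> Message {
        let helper: Helper<T, T::Strategy> = Helper(self.clone(), PhantomData);
        helper.encode()
    }
}

// Two example types choosing different strategies:
#[derive(Clone)]
pub struct Ping;
impl From<Ping> for Message {
    fn from(_: Ping) -> Message {
        Message(vec![1])
    }
}
impl EncodeUsingOtherTrait for Ping {
    type Strategy = IntoStrategy;
}

#[derive(Clone)]
pub struct Height(pub u32);
impl fmt::Display for Height {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}
impl EncodeUsingOtherTrait for Height {
    type Strategy = DisplayStrategy;
}
```

Note that this only compiles because `Encode`, `EncodeUsingOtherTrait`, `Helper` and the blanket impl all live in the same crate, as the coherence note above points out.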
Yes, this gives a very good intuition into the matter; let me meditate on it overnight
@Kixunil It has worked! It's a kind of magic 💯 Thank you very much for finding a way of implementing such features!
API design considerations
Background
There are multiple different types of API that can be used by the software stack related to LNP/BP. Here we analyze the criteria for choosing the proper API technologies and serialization standards for different cases.
In general, software might require API for:
Today, many different API description languages, serialization formats and transport layers exist that may be used in the mentioned scenarios. However, in most cases the choice of particular formats is nearly arbitrary or due to historical reasons. Here I'd like to systematize criteria for API technique selection in LNP/BP for future apps, which may allow us to avoid many bad practices of the past.
Overview
API components
The classical API consists of three main components:
Many existing API automation frameworks (see below) cover more than a single API component.
API Protocols and Frameworks
Here we provide information only about modern and most recently used frameworks:
IPC for Microservices
The requirements for this are:
Much less important for the protocols:
ZeroMQ seems to be the tool of choice for the transport layer, which has to be combined with a custom RPC API DSL and serialization protocol.
Client-server (non-web)
ZeroMQ seems to be the tool of choice here as well.
Web-based REST
OpenAPI seems to be the tool of choice.
Web-based RPC
WAMP seems to be the tool of choice for apps that require live updates (Websockets).
Another alternative to consider is GraphQL; however, it should be noted that it usually has poor performance and is not suited for Websocket apps.
End notes
Protocol Buffers or Apache Thrift serialization can't be used in all of the cases due to:
Original: https://github.com/dr-orlovsky/notes/blob/master/api_design.md