KomodoPlatform / komodo-defi-framework

This is the official Komodo DeFi Framework repository
https://komodoplatform.com/en/docs/komodo-defi-framework/

p2p for ordermatching, plans #373

Closed ArtemGr closed 4 years ago

ArtemGr commented 5 years ago

I've learned from Artem that in ordermatching we're using a feed of orders. This feed is traditional to exchanges and is transparent to users; we show it in the GUIs. The orders that are injected and later updated (?) on that feed are what drives the exchange.

Now, the decentralized key-value abstraction we use around the peer-to-peer implementations (presently libtorrent) can readily be used for directed communication between two or more peers, but it is less practical for implementing a shared feed of orders (particularly due to delays and races in a large DHT). I'd like us to further evaluate whether there is a solution for decentralized ordermatching within the existing frame of a decentralized key-value abstraction, or whether we ought to look for something different.

cipig commented 5 years ago

We have a key-value store in blockchain:

komodo-cli help kvupdate
kvupdate key "value" days passphrase

Store a key value. This feature is only available for asset chains.

Arguments:
1. key                      (string, required) key
2. "value"                (string, required) value
3. days                     (numeric, required) amount of days(1440 blocks/day) before the key expires. Minimum 1 day
4. passphrase               (string, optional) passphrase required to update this key

Result:
{
  "coin": "xxxxx",          (string) chain the key is stored on
  "height": xxxxx,            (numeric) height the key was stored at
  "expiration": xxxxx,        (numeric) height the key will expire
  "flags": x,                 (numeric) amount of days the key will be stored
  "key": "xxxxx",           (string) stored key
  "keylen": xxxxx,            (numeric) length of the key
  "value": "xxxxx",         (string) stored value
  "valuesize": xxxxx,         (numeric) length of the stored value
  "fee": xxxxx,               (numeric) transaction fee paid to store the key
  "txid": "xxxxx"           (string) transaction id
}

Examples:
> komodo-cli kvupdate examplekey "examplevalue" 2 examplepassphrase
> curl --user myusername --data-binary '{"jsonrpc": "1.0", "id":"curltest", "method": "kvupdate", "params": ["examplekey","examplevalue","2","examplepassphrase"] }' -H 'content-type: text/plain;' http://127.0.0.1:7771/

and

komodo-cli help kvsearch
kvsearch key

Search for a key stored via the kvupdate command. This feature is only available for asset chains.

Arguments:
1. key                      (string, required) search the chain for this key

Result:
{
  "coin": "xxxxx",          (string) chain the key is stored on
  "currentheight": xxxxx,     (numeric) current height of the chain
  "key": "xxxxx",           (string) key
  "keylen": xxxxx,            (numeric) length of the key
  "owner": "xxxxx",         (string) hex string representing the owner of the key
  "height": xxxxx,            (numeric) height the key was stored at
  "expiration": xxxxx,        (numeric) height the key will expire
  "flags": x,                 (numeric) 1 if the key was created with a password; 0 otherwise
  "value": "xxxxx",         (string) stored value
  "valuesize": xxxxx          (numeric) length of the stored value
}

Examples:
> komodo-cli kvsearch examplekey
> curl --user myusername --data-binary '{"jsonrpc": "1.0", "id":"curltest", "method": "kvsearch", "params": ["examplekey"] }' -H 'content-type: text/plain;' http://127.0.0.1:7771/

and there is an asset chain called KV; we also run Electrum servers for this chain

artemii235 commented 5 years ago

@cipig Hi, thanks for pointing this out, it could be a perfect match for our needs, but I have questions:

  1. There should be no fee to store the data; is this possible? We could implement an automatic faucet request for KV (or another chain that we use for this purpose), but based on our experience with ETOMIC it would add a single point of failure and unwanted maintenance activity.
  2. Can we set an infinite expiration, e.g. to store swap data permanently for statistics purposes?

ArtemGr commented 5 years ago

@cipig This is kind of cool! A plain key-value store might be suboptimal for ordermatching though, as we need to consolidate concurrent updates from different nodes (e.g. several nodes adding orders at the same time). I'm considering adding server-side CRDT support to the "HTTP fallback" key-value store; this should allow us to maintain the concurrently updated ordermatching lists in that store with minimal synchronization overhead (e.g. one and only one store round-trip). Any thoughts on how one would address the issue of concurrent updates with the blockchain-based store? The way I see it, a blockchain client would have to wait quite a bit in order to verify that her version of the update is now indeed part of the longest chain (and even then we can't really know), retrying the store operation if it didn't take?

P.S. A prefix key search, or tagged keys with a tag search, are store features that would also allow for concurrent ordermatching updates.
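
The CRDT idea above can be sketched as a last-write-wins map keyed by order id. This is only an illustration of the merge semantics under assumed names (`Order`, `merge`, a logical `updated_at` timestamp), not the actual mm2 API:

```rust
use std::collections::HashMap;

// Hypothetical order record; fields are illustrative assumptions.
#[derive(Clone, Debug, PartialEq)]
struct Order {
    price: u64,      // price in smallest units
    volume: u64,
    updated_at: u64, // logical timestamp, assumed unique per order id
    cancelled: bool,
}

/// Merge two replicas of the order map: for each order id keep the
/// entry with the higher timestamp. The merge is commutative,
/// associative and idempotent (given unique timestamps), so replicas
/// converge regardless of the order in which updates arrive.
/// Equal timestamps would need a deterministic tie-break (omitted here).
fn merge(a: &HashMap<String, Order>, b: &HashMap<String, Order>) -> HashMap<String, Order> {
    let mut out = a.clone();
    for (id, order) in b {
        let newer = match out.get(id) {
            Some(existing) => order.updated_at > existing.updated_at,
            None => true,
        };
        if newer {
            out.insert(id.clone(), order.clone());
        }
    }
    out
}
```

Because the merge commutes, a store-side implementation never has to pick a "winner" round-trip order between concurrent writers, which is exactly what makes the single round-trip goal plausible.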

cipig commented 5 years ago

Unfortunately I don't know much about KV, just that it exists. I hope @jl777 or @pbca26 can tell us more about the status of KV.

ArtemGr commented 4 years ago

There are two reasons why the DHT is not practical for the orderbook: speed and size. The (BitTorrent) DHT is tailored to support only small entries.

https://github.com/KomodoPlatform/atomicDEX-API/blob/20cda8ccc3e32413b6a813e1c6d03be136e88e76/mm2src/peers/peers.rs#L671

The orderbook, on the other hand, can be large.

The two are related: the size limit means we need to store multiple entries, which further reduces speed and reliability.

One way to tackle the size problem is to use the webtorrent protocol (the file sharing swarm) to download the orderbook.
Peer P¹ gets torrents Tⁿ from the DHT, getting an (eventually or partially consistent) version of the orderbook by downloading the Oⁿ orderbooks via these torrents and joining them (CRDT might be useful in implementing the joining logic).
Peer P¹ then performs maintenance on the orderbook, removing expired entries, sharing gossip received from other peers about the remaining orders, etc.
It then adds its own orders to the orderbook and publishes it as torrent Tᶻ.
This should allow a new peer P² to have a good chance of joining the order matching faster, by getting at least one T from the DHT and downloading it via webtorrent.
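
The joining/maintenance step above can be sketched as a set union with expiry filtering. The `OrderEntry` type and field names are assumptions for illustration; in practice the join could use CRDT semantics as suggested:

```rust
use std::collections::HashSet;

// Hypothetical orderbook entry as downloaded via a torrent Tⁿ.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct OrderEntry {
    id: String,
    price_sat: u64,  // price in smallest units
    expires_at: u64, // unix seconds
}

/// Union several downloaded orderbooks Oⁿ, dropping expired entries,
/// producing the orderbook the peer would republish as torrent Tᶻ.
fn join_orderbooks(books: &[Vec<OrderEntry>], now: u64) -> Vec<OrderEntry> {
    let mut set: HashSet<OrderEntry> = HashSet::new();
    for book in books {
        for entry in book {
            if entry.expires_at > now {
                set.insert(entry.clone());
            }
        }
    }
    // Deterministic order so that identical inputs republish identical torrents.
    let mut out: Vec<OrderEntry> = set.into_iter().collect();
    out.sort_by_key(|e| e.price_sat);
    out
}
```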

It's interesting that there is a BitTorrent extension (BEP 38) that would have allowed sharing parts of the orderbook between torrents (as discussed in issue 4012 of arvidn's libtorrent), though I doubt it is implemented in the JS implementation of webtorrent.

jl777 commented 4 years ago

could the DHT be used for nodes to publish a price vector? that would be relatively small and then other nodes could reconstruct partial orderbooks from the DHT entries they have seen.

orderbooks would only need to store prices that are close to the best available, ie. top of the orderbook. that would reduce its size a lot. though you would need to deal with orderbook stuffing, where a lot of small orders at lower prices make the big orders invisible. i guess some sort of volume-weighted "top" of orderbook might work.

maybe each node can manage a set of magnet urls that point to its latest price vector?
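
The volume-weighted "top" suggested above can be sketched as: keep the best-priced orders until their cumulative volume covers a threshold, so dust orders can't crowd out big ones. Names and the threshold are illustrative assumptions:

```rust
// Hypothetical ask entry; real entries would carry pubkey etc.
#[derive(Clone, Debug, PartialEq)]
struct Ask {
    price: f64,
    volume: f64,
}

/// Return the cheapest asks whose cumulative volume reaches `min_volume`.
/// A wall of tiny cheap orders can no longer hide the big orders:
/// it simply doesn't satisfy the volume threshold on its own.
fn volume_weighted_top(mut asks: Vec<Ask>, min_volume: f64) -> Vec<Ask> {
    asks.sort_by(|a, b| a.price.partial_cmp(&b.price).unwrap());
    let mut total = 0.0;
    let mut top = Vec::new();
    for ask in asks {
        total += ask.volume;
        top.push(ask);
        if total >= min_volume {
            break;
        }
    }
    top
}
```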

ArtemGr commented 4 years ago

Yeah, that's also a cool idea, to implement the orderbook as a loosely ordered DHT skiplist-like index that a new peer navigates to find the chunks of the interesting price range. Instead of downloading the entire orderbook the peer will only be getting the orders it is interested in (in an ideal situation - just a single order). TY, I'll think about it.

artemii235 commented 4 years ago

Along with the price we still need to store at least the volume, the order creator's pubkey and a signature, which is already over 100 bytes per order (pubkey: 33 bytes, signature: 64 bytes). That means we can store about 10 orders in one DHT entry. Even if we don't store the signature it will be about 50 bytes (or much more), since price and volume are either BigDecimal or BigRational and actually have arbitrary size.

If we don't store the node's (or other) identifying information, we will need an additional mechanism for retrieving such information by the price stored in the DHT, which adds another level of complexity, and we will still need to store that information somewhere.

Also, as already mentioned, the DHT can be slow and suffer from race conditions. That might create a lot of problems if we have a high-frequency trading network where orders are created, matched and cancelled very fast. One of MM2's goals was to provide UX close to a CEX; on a CEX we typically have an (almost) immediate orderbook update when changes are made. I guess the DHT limitations just won't allow us to reach that goal.
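
A back-of-the-envelope check of the figures above, assuming the ~1000-byte BEP 44 mutable-item limit and fixed-width price/volume encodings (the fixed widths are an assumption; with BigDecimal/BigRational the per-order size grows and the count drops further):

```rust
// Sizes taken from the comment above; encodings are assumptions.
const DHT_ENTRY_LIMIT: usize = 1000; // approx. BEP 44 mutable-item value limit
const PUBKEY_LEN: usize = 33;        // compressed secp256k1 pubkey
const SIG_LEN: usize = 64;           // compact signature
const PRICE_LEN: usize = 8;          // assumed fixed-width encoding
const VOLUME_LEN: usize = 8;         // assumed fixed-width encoding

/// How many full orders fit into one DHT entry under these assumptions.
fn orders_per_entry() -> usize {
    DHT_ENTRY_LIMIT / (PUBKEY_LEN + SIG_LEN + PRICE_LEN + VOLUME_LEN)
}
```

Under these assumptions one entry holds 8 orders, consistent with the "about 10 orders per entry" estimate.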

ArtemGr commented 4 years ago

Along with price we still need to store at least volume, order creator pubkey and signature which is already over 100 bytes per order

Not really, we can store the price and a small random value that allows a peer to find more information about that order in the DHT.

Even if we don't store the signature it will be about 50 bytes (or much more) since price and volume are either BigDecimal or BigRational and actually have arbitrary size

Chances are we don't need that precision for the individual prices of a chunk. We need precision only for the lowest and highest prices there.

If we don't store node or another identification information of node/order we will need additional mechanism of retrieving of such information by price stored on DHT which adds another level of complexity and we will still need to store such information somewhere.

Might be easier than supporting webtorrent.

Also as it already mentioned DHT can be slow and suffer from race conditions. It might create a lot of problems if we have high frequency trading network when orders are being created, matched or cancelled very fast.

The BitTorrent DHT network is slow primarily because it needs to traverse a huge number of peers to place a piece of data at a random location. A smaller DHT network will be much faster. If a network promotes certain peers to caretakers, which is a known trick and was part of the original p2p design of jl777, then a DHT network of such caretakers might even fit into the routing table, giving you one-trip lookups.

And DHT is a good abstraction as it allows us to design an algorithm that would leave us a certain freedom in picking current and future implementations.

So I'm optimistic about it. And a certain delay is less critical IMHO than the availability of a truly decentralized order matching, even as a backup strategy.

But it's cool if you have something better in mind!

artemii235 commented 4 years ago

Ok, size might not be a big issue since we might store an id or hash, but it still should have a reasonable size to make the collision chance negligible.

Chance is we don't need that precision for individual prices of a chunk. We need precision only for the lowest and highest prices there.

How are we going to handle it? What if some price is not the lowest, but other orders get matched and it becomes the lowest? So the node that broadcast it must track its state continuously and then update the entry in the price vector? We deal with financial software; precision is most important here. In general we shouldn't see different order prices in different circumstances.

Might be easier than supporting webtorrent.

We don't have a strict requirement to stick with libtorrent DHT or webtorrent.

Also, regarding DHT race conditions: do I understand correctly that to store a new value in such a vector we must retrieve it first, append the new value and then store the entire new vector back in the DHT? What happens if 2 nodes broadcast different prices at the same time?

artemii235 commented 4 years ago

So I'm optimistic about it. And a certain delay is less critical IMHO than the availability of a truly decentralized order matching, even as a backup strategy.

Yes, it can still be considered, maybe I'm just less optimistic about this :slightly_smiling_face:

But it's cool if you have something better in mind!

Right now I don't, but according to the initial comment of this issue you have already researched this topic and mentioned a few alternatives. Once we start working on the general networking refactoring I'd like us to consider all of them, setting the required performance metrics first, e.g.:

  1. After a node is up it should be able to retrieve the orderbooks for the desired pair in, say, 10 seconds.
  2. When a new order is created, updated or cancelled, the update should be propagated to all interested nodes in, for example, 5 seconds.
  3. The orderbook must be consistent: concurrent order broadcasting by different nodes must not affect each other. Both orders must be displayed on other nodes.

Of course this is just an example, but we can make a research on different choices and check what solution works better in terms of these metrics.

ArtemGr commented 4 years ago

Also regarding DHT race conditions: do I understand correct that to store new value in such vector we must retrieve it first, append new value and then store the entire new vector back in DHT?

Nope, "a loosely ordered DHT skiplist-like index" is not a simple array, it's more like an optimistic graph.

How are we about to handle it? What if some price is not lowest, but some other orders are matched and it becomes lowest?

This is an index that allows a peer to find the prices closest to price X. What matters in this index is the order of the prices within a chunk, not their precision. Precise prices will still be around in the value chunks referenced by the index chunk.
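
The lookup step can be sketched as follows: the index stores chunks keyed only by coarse [low, high] price bounds, and a peer filters the index to learn which value chunks to fetch for a target price, reading the precise prices from those chunks afterwards. All names here are illustrative assumptions:

```rust
// Hypothetical index chunk: coarse price bounds plus the DHT key of
// the value chunk holding the precise orders in that range.
#[derive(Clone, Debug)]
struct IndexChunk {
    low: u64,
    high: u64,
    dht_key: String,
}

/// Return the DHT keys of the chunks whose coarse range covers `price`.
/// Ranges may overlap, since the index is only loosely ordered;
/// the peer fetches every matching chunk and inspects precise prices there.
fn chunks_for_price(index: &[IndexChunk], price: u64) -> Vec<&str> {
    index
        .iter()
        .filter(|c| c.low <= price && price <= c.high)
        .map(|c| c.dht_key.as_str())
        .collect()
}
```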

Right now I don't but according to the initial comment of this issue you already had a research on this topic and mentioned few alternatives

These are alternative DHT implementations; the DHT itself isn't going away, it is the best abstraction we have so far.

jl777 commented 4 years ago

i imagine the ideal solution will be composed of a variety of methods, including timestamped top of orderbooks, pubkeys specific price vectors, maybe other info.

also full precision isnt needed, just to find the pubkey(s) that are most likely to have the current best price (or within a close range to it)

if we try to make all nodes always have a full orderbook, i think that wont work due to bandwidth. but if we try to make sure that most of the nodes, most of the time, have a price reasonably close to the best, that is feasible.

ideally, the design would get better and more complete orderbook state the longer a node is online, but maybe a very good price can be obtained quickly by querying a short list of pubkeys with recent decent prices. and if each node is always adding the best price it knows of, i think it will be very quick to get a good price

artemii235 commented 4 years ago

These are alternative DHT implementations, the DHT itself isn't going away, it is the best abstraction we have so far.

I'm not against the DHT, but according to my personal experience with libtorrent DHT (mostly as a user, since I'm implementing my code on top of it) I have the impression that it might not match our requirements. But of course, if we find out differently during further research, I will accept it.

if to try to make all nodes always have a full orderbook,

I don't propose that nodes store the full orderbook; the idea of fetching the best-price orders within some range is perfectly fine. So the proposed metrics may be changed to this:

  1. After a node is up it should be able to retrieve the best prices for the desired pair(s) in, say, 10 seconds.
  2. When a new order is created, updated or cancelled, the update should be propagated to all interested nodes in, for example, 5 seconds. (The key word here is interested: if the order is far away from the best price, other nodes might simply not have information about it, so they shouldn't receive any updates, but a best-price order should be updated ASAP.)
  3. The orderbook must be consistent, concurrent order broadcasting by different nodes must not affect each other. Both orders must be displayed on other nodes.

jl777 commented 4 years ago

i dont think total consistency is needed. since the nodes will be negotiating directly, maybe the final details on prices can be settled directly? it is more a list of the most reliable pubkeys with decent prices than an orderbook

artemii235 commented 4 years ago

The key point here is

concurrent order broadcasting by different nodes must not affect each other. Both orders must be displayed on other nodes.

So we first need to know that the price exists, but how do we know it's not missing from the vector because of a race condition?

ArtemGr commented 4 years ago

I'm not against the DHT, but according to my personal experience with libtorrent DHT (mostly as a user, since I'm implementing my code on top of it) I have the impression that it might not match our requirements.

libtorrent is a temporary step and is not an option for further development, but https://github.com/nazar-pc/webtorrent-dht, for example, has the same chunk size limit, and we might still want the ability to reuse existing networks.


There is value in having a set of metrics and axes, and in prioritizing them in order of importance.

But IMHO we should also avoid design paralysis (aka bike-shedding) and, in terms of PDIA, "encourage positive deviation" (aka set-based design).

Also, while designing a decentralized system, consistency should not be the first goal: these guarantees aren't free, and if we can't sacrifice some of them then we'll be sacrificing time and design space instead.

(I've heard somewhere that the CAP theorem was disproved, so in theory it might be false, but in practice I've yet to see a good solution that works around it.)

artemii235 commented 4 years ago

We're designing not just a decentralized system, we're designing an exchange, where people exchange real money and some of them generate profits on it. Consistency is important here. As a trader, I would like to know as fast as possible that there's a good-price order which I'd like to match immediately.

ArtemGr commented 4 years ago

I prefer to pick my battles and I won't be fighting the CAP theorem. If a trader wants precision, they can opt into seed-based trading, using a central seed node or a set of replicated seed nodes that stop working as soon as there is a partition.

jl777 commented 4 years ago

"concurrent order broadcasting by different nodes must not affect each other. Both orders must be displayed on other nodes."

i dont think all orders need to be on all nodes. the most important thing is that a node is able to ordermatch at a price that is close to the best. there is no need for consistent orderbooks, just consistent pricing

i think this relaxation of orderbook consistency will allow for much lower bandwidth usage. and as long as the absolute best price is found if you are connected long enough, but you get a decent price almost immediately, i think that solves this issue.

people wont care what pubkey they traded with, just the price. so pubkeys of equivalent reliability and price are essentially fungible with each other.

best to not fight the CAP theorem, i agree

jl777 commented 4 years ago

let us take a step back...

the goal here is to enable people to complete atomic swaps at very close to the best price possible, right?

if we can agree on that, then we should also be able to agree that nothing in that says anything about complete orderbooks, let alone a global consensus on it.

which means this is more about finding the online pubkeys with decent prices, preferably reliable nodes. so maybe instead of orderbook, it should be imagined as the best pubkeys to query for their current bid/ask. and if this is the case and the current bid/ask info also includes the best known bid/ask from other nodes, each request for bid/ask will propagate the best known price.

if you model this, you will find that very rapidly even a group of sparsely connected nodes will converge onto the current best price. issues to deal with are price quotes from unreliable nodes that crowd out the quotes from reliable nodes.

so i think this entire process ties into pubkey reliability also

artemii235 commented 4 years ago

https://en.wikipedia.org/wiki/CAP_theorem#Explanation

In the absence of network failure – that is, when the distributed system is running normally – both availability and consistency can be satisfied.

CAP is frequently misunderstood as if one has to choose to abandon one of the three guarantees at all times. In fact, the choice is really between consistency and availability only when a network partition or failure happens; at all other times, no trade-off has to be made

So even according to CAP, in good network conditions we can guarantee both availability and consistency. Bad network conditions are of course a typical case, so we will need to make a choice according to our requirements. I understand that there's always a trade-off.

issues to deal with are price quotes from unreliable nodes that crowd out the quotes from reliable nodes. so i think this entire process ties into pubkey reliability also

Pubkey reliability is more a topic for a reputation system. If some node with bad networking conditions (even unintentionally) somehow gets through, places an order, matches it and then fails the swap, it will be banned. The other question is: can we validate another node's networking conditions to check its reliability even before the order is matched, when the node doesn't have previous swaps?

jl777 commented 4 years ago

here is a possible solution:

  1. create dedicated gossip networks for specific trading pairs (or maybe for specific coins)
  2. design a gossip protocol that propagates a node's price vector, best known external price and reliability
  3. a new node coming online would connect to the coin-specific gossip network and immediately see the highest bid and lowest ask (as all packets in this network include that, along with the pubkey's price vector)
  4. when a swap is being started, run a pre-swap verification protocol with the other node to make sure they are using a compatible version, the price hasnt changed and the volumes are available.

A standard p2p gossip network can be used for this, and it will be very close to realtime as to the best-price info that propagates. This also allows centralized websites to efficiently connect to a specific trading pair to display the best available prices centrally.

The orderbook can be generated totally locally out of the price vectors that are cached. It is really more of a debug aid, as the ordermatch would use some sort of metric to balance price, volume and reliability. However, people are used to orderbooks, so it is something we should support creating, but just from locally available data. The advantage of this is that there is no large data set that needs to be propagated.
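
Step 3 above hinges on every packet piggybacking the best bid/ask the sender has seen. A minimal sketch of that fold, with assumed names and integer prices (real packets would also carry the pubkey's price vector and reliability info):

```rust
// Hypothetical summary carried by every packet on a coin-specific
// gossip network: the best prices the sender has seen so far.
#[derive(Clone, Copy, Debug, PartialEq)]
struct BestPrices {
    highest_bid: u64,
    lowest_ask: u64,
}

impl BestPrices {
    /// Fold another node's view into ours. Since every packet carries
    /// this summary, a freshly connected node learns the top of the
    /// market from the very first packet it receives.
    fn merge(self, other: BestPrices) -> BestPrices {
        BestPrices {
            highest_bid: self.highest_bid.max(other.highest_bid),
            lowest_ask: self.lowest_ask.min(other.lowest_ask),
        }
    }
}
```

The fold is commutative and idempotent, so it doesn't matter in which order (or how many times) gossip packets arrive.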

ArtemGr commented 4 years ago

where I'm coming from, yes, the reliability of order matching is much more important than the precision

BTW, I think that most currency exchanges don't work with orderbooks but rather place orders against a certain price margin, and I wonder if the way we currently match against particular orders is a future scalability issue. The user interface will not be able to handle the orderbooks when there are hundreds or thousands of orders.

the idea of matching a particular order will quickly become moot once there is decent order-matching contention. (cool, i picked the order, only five users picked it before me)

Orderbook can be generated totally locally out of the price vectors that are cached

so a virtual orderbook? neat :D

jl777 commented 4 years ago

Pubkey reliability is more topic of reputation system. If some node with bad networking conditions (even unintentional) somehow gets through and places the order, matches it and then fails the swap it will be banned. The other question is: may we validate other node networking conditions to check it's reliability even before the order is matched and node doesn't have previous swaps?

I guess i am saying that i feel at least part of the best-price propagation should include an aspect of reliability, as that is a critical component of ordermatching. Clearly, given two equivalent prices, a node would want to ordermatch with a pubkey that has thousands of swaps completed, versus a newbie one. So maybe it can just be a count of successful swaps, with the timestamp of the most recent success.

given a node that hasnt had a successful swap, or not had one in a long time, it seems prudent to do some checking to make sure that a swap will complete.

maybe there is a setting as to the elapsed time from last completed swap that would make a node similar to a newbie node. so we get three types of nodes: a) has completed swaps, and recently too b) has completed swaps, but not recently c) never completed swaps

and the ordermatch process takes into account the precedence based on the above. for b) and c) it would do what it can, before committing to the ordermatch, to ensure there is a very good chance it will work if a swap is started.

jl777 commented 4 years ago

the idea of matching a particular order will quickly become moot once there will be a decent order matching contention. (cool, i picked the order, only five users picked it before me)

ordermatching needs to be based on price and volume, with the ordermatch process filtering out unreliable ones. really, if the user says price X is ok, then any pubkey with a price of X or better is ok. of course, if we can find a much better price, then that can be used.

in mm1, what i did was start an auction process. so a node would announce (can be via the gossip network) "i want price X", then the nodes that are set to accept that price all respond back. the original node then waited for 10 seconds or so, extending the time a few seconds whenever it got a better offer.

This allowed each ordermatch to get the best available price automatically from active nodes, and it avoids the entire issue of having to synchronize orderbooks.
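
The mm1-style auction described above can be sketched with logical time instead of a real clock: collect responses for a base window of ~10 seconds, extend the deadline a few seconds whenever a better offer arrives, then take the best price seen. All names and parameters are illustrative assumptions:

```rust
// Hypothetical maker response in the auction.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Offer {
    price: u64, // price asked by the responding maker
    at: u64,    // arrival time, seconds since the auction started
}

/// `offers` must be sorted by arrival time. Returns the lowest price
/// among offers that arrived before the (possibly extended) deadline.
fn run_auction(offers: &[Offer], base_window: u64, extension: u64) -> Option<u64> {
    let mut deadline = base_window;
    let mut best: Option<u64> = None;
    for o in offers {
        if o.at > deadline {
            break; // the auction had already closed when this offer arrived
        }
        if best.map_or(true, |b| o.price < b) {
            best = Some(o.price);
            // a better offer extends the auction by a few seconds
            deadline = deadline.max(o.at + extension);
        }
    }
    best
}
```

Note how a better offer arriving near the deadline keeps the auction open long enough for an even better one to land, without any global orderbook state.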

ArtemGr commented 4 years ago

if the user says price X is ok, then any pubkey with price of X or better, is ok

cool, makes sense and changes the way we think about it. I'm glad we hashed this out today

jl777 commented 4 years ago

by using the gossip network to propagate the pre-ordermatch auction, then nodes can see if a better price is already submitted and avoid sending a response.

can you think of a more bandwidth efficient way to find the best currently available price?

it assumes there is a gossip network that only has nodes that care about a specific coin (or trading pair for high volume trading pairs). it seems pretty clear that the pubkeys of the nodes with the best prices, along with the best prices can be very quickly determined.

orderbooks can be created using locally cached data, but it is really only for informational purpose to give the user an idea of what the current prices are. it shouldnt be part of the actual ordermatch process.

then either an open auction (fully decentralized) or a request for quote to a specific set of the top pubkeys to find the node with the best price, volume, reliability match. i guess to sort it by some metric

then to iterate down that list in order, until an ordermatch starts.

bandwidth used for a specific ordermatch is only the number of nodes that have a better than the best seen price, plus some direct point to point traffic.

the overall gossip network is only broadcasting relatively small packets and only when there is a price change, or a quote times out. say one hour time limit. but if a new price isnt top of the orderbook for a node, then it wont even need to be propagated. still some edge cases to this, but i think well within debugging distance.

even with thousands of nodes for a specific trading pair, that would be tens of thousands of packets per hour, which might be too much bandwidth for phones on cellular networks, so i guess we need a mode where only the pubkeys are broadcast, which means price changes would only arrive via the extra data that is added to all the packets. as long as there is a way to get the current highbid/lowask, really, there is no need to know the pubkeys until a specific order is started.

and the auction process is very efficient in bandwidth, even with thousands of maker nodes, especially if they dont send a response unless they have a better than the best seen bid. even if they miss it and send it, the auction starting node will just filter it out

jl777 commented 4 years ago

OK, to summarize, orderbook is only for getting a sense of the current prices and not directly involved in the ordermatching. this avoids all the orderbook syncing issues.

then by isolating traffic, it eliminates the N² explosion of packets. we do need an efficient way of broadcasting a packet on a gossip network, but i think that is possible to do pretty efficiently, certainly not N²

given that and a packet design to add the best known bid/ask with each packet, instantly any node in the network will find the current prices. and that info can be used by the user to form a taker order. which then starts the auction process among the makers.

the above formulation, reduces the impossible problem of keeping all the orderbooks in sync across all nodes, to just knowing the current best price and a streamlined auction process. that also has the effect of making it fully decentralized.

so a phone would register for only the current price gossip stream, that is really all it cares about. locally it could generate a pretty complete orderbook if it wanted after enough time. but having the best price immediately is all that is critical. then the ordermatch of course directly involves that node, but during an idle state, maybe it doesnt even receive any packets. only when it wants to know the current price, does it query a "fullnode" to get the latest price.

"fullnodes" for a specific price pair would need to get a steady stream of pricing info, so that packet flow needs to be optimized, by deferring all the order specifics until the taker auction, i think it will allow it to be very streamlined.

artemii235 commented 4 years ago

the idea of matching a particular order will quickly become moot once there will be a decent order matching contention. (cool, i picked the order, only five users picked it before me)

If you pick the very best price on a CEX and it goes away and there are no orders with the same or better price, then depending on your settings your order will either be cancelled or stay on the orderbook as a maker order. MM2 converts the order to a maker one automatically by default now.

if the user says price X is ok, then any pubkey with price of X or better, is ok

in mm1, what i did was start an auction process. so a node would announce (can be via the gossip network) "i want price X", then the nodes that are set to accept that price all respond back. the original node then waited for 10 seconds or so, extending the time a few seconds whenever it got a better offer.

MM2 already partially works this way: it doesn't wait for 10 seconds as of now, but it can still match with a better price. And this better order is actually not guaranteed to have been shown on the orderbook previously. So actually, yes, we might not be eager to display all the available best-price orders.

OK, to summarize, orderbook is only for getting a sense of the current prices and not directly involved in the ordermatching. this avoids all the orderbook syncing issues.

Now we're back to technology selection for implementation :slightly_smiling_face:

ArtemGr commented 4 years ago

MM2 converts the order to a maker one automatically by default now

this doesn't scale IMHO

can you think of a more bandwidth efficient way to find the best currently available price?

the problem with MM1 was that it only worked in the presence of open ports and didn't account for routing anomalies between the peers. DHT reverses the networking (a peer as a client uses the DHT network as a set of servers; some of these servers will be available through open ports, limited NAT hole punching and routes not affected by anomalies). I'm not sure if mapping a gossip network protocol to DHT is the fastest design approach here; after a number of iterations it will likely produce good results, but one is often tempted to implement an algorithm designed for DHT from scratch.

but an option we haven't sufficiently bashed here yet =D is to reintroduce the notion of the caretaker nodes with open ports. in MM this notion was refactored out because it was mixed with maker/taker relationships, preventing two peers both lacking an open port from trading with each other. but for the order matching we'll likely need to reintroduce it and to use the elected caretaker nodes for the order matching. in the presence of a large amount of orders this will effectively shard the system.

artemii235 commented 4 years ago

this doesn't scale IMHO - then we should find a way to scale this since it's one of the core exchange features. The only thing we need to add is a user selection: either the order stays in the orderbook until explicitly cancelled by the user, or it is cancelled automatically if not matched.

ArtemGr commented 4 years ago

it's more of a workaround to matching explicit orders than a feature. when the order matching works against a price rather than against a specific order (see the idea of a virtual orderbook above) this workaround will be designed away as a bonus

jl777 commented 4 years ago

in the presence of a large amount of orders this will effectively shard the system.

as long as each shard has approximately the same highbid/lowask, it won't matter

having a traditional p2p network with no phones, just normal unix nodes that don't have connectivity issues: this can be the "fullnode" layer that gossips about the price quotes per pubkey and the best prices, and can be queried by lite nodes as to the current price.

not sure what to do about the case where a phone has the best price and it can't connect to the gossip network... maybe it pushes its price onto the gossip network via a fullnode and then it can be discovered by the taker node.

some sort of mapping of the taker auction to the DHT networking i guess is the unsolved problem

ArtemGr commented 4 years ago

not sure what to do about the case where a phone has the best price and it can't connect to the gossip network...

if a network is partitioned and peer A can't talk to peer B through the DHT (or caretakers) then there's nothing we can do about it, the peers should skip each other

jl777 commented 4 years ago

is there a way to do the taker auction using the DHT?

jl777 commented 4 years ago

i think we can assume that the gossip network won't be partitioned, as it would be composed of long-term maker nodes and not running on phones.

and since these are highly connectible nodes, i would think that we can require that, to join the gossip network, your node needs to be connectable from phones?

in that case, it seems we are one step away

artemii235 commented 4 years ago

@ArtemGr

it's more of a workaround to matching explicit orders than a feature. when the order matching works against a price rather than against a specific order (see the idea of a virtual orderbook above) this workaround will be designed away as a bonus

This is not a workaround, please try to log in to an exchange and create an order, you'll see something like this (Good till cancelled means that your order will be automatically converted to maker):

And also please recheck the messages above. MM2 is not matching against a specific order. It broadcasts the request to the network, may receive a few messages back and then match against the other order that has a better price. But if there are no matches at all it may be converted to a maker order and stay in the orderbook forever until matched, if the user chooses so.

is there a way to do the taker auction using the DHT? - I think yes, every match is already a unique entry in local MM2 RAM, which can be saved in the DHT under a unique key, and the taker and maker might exchange messages using this record.
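One way this could look: both sides derive the same DHT key from the match record, so taker and maker find the rendezvous point independently. The sketch below uses the non-cryptographic `DefaultHasher` purely for illustration; a real implementation would hash with something like SHA-256, and the function name is invented here.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical sketch: derive a rendezvous key for a match so that both
/// taker and maker compute the same DHT location independently.
/// DefaultHasher is for illustration only, not a cryptographic choice.
fn match_dht_key(match_uuid: &str, taker_pub: &[u8], maker_pub: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    match_uuid.hash(&mut h);
    taker_pub.hash(&mut h);
    maker_pub.hash(&mut h);
    h.finish()
}

fn main() {
    let taker = [0x02u8; 33];
    let maker = [0x03u8; 33];
    // Both sides derive the same key from the same match record.
    let k1 = match_dht_key("uuid-1234", &taker, &maker);
    let k2 = match_dht_key("uuid-1234", &taker, &maker);
    assert_eq!(k1, k2);
    // A different match lands on a different key.
    let k3 = match_dht_key("uuid-5678", &taker, &maker);
    assert_ne!(k1, k3);
    println!("rendezvous key: {:x}", k1);
}
```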

jl777 commented 4 years ago

ok, so at this point does it seem reasonable to make a decentralized and highly scalable ordermatch system where the price will be close to the best price available?

ArtemGr commented 4 years ago

is there a way to do the taker auction using the DHT?

we use a large DHT for the peers to communicate during the swap and it makes sense to keep it that way, but the order matching will be easier to implement on the caretaker nodes methinks

ok, so at this point does it seem reasonable to make a decentralized and highly scalable ordermatch system where the price will be close to the best price available?

we'll be designing and implementing one for the WASM port soon, that's why I resumed thinking on it today. we'll have a WASM version working in Flutter and using the new order matching. it remains to be seen how scalable it'll be; this is a secondary goal to me, but we'll try to factor it into the design.

p.s. experience shows that during the implementation some areas of the design space can become unavailable, while new ideas, opportunities and shortcuts might become available, so i'm not trying to do a waterfall here, but rather gathering ideas and context, and in this regard we did well

ArtemGr commented 4 years ago

MM2 is not matching against a specific order. It broadcasts the request to the network, may receive a few messages back and then match against the other order that has a better price

Good to know, Artem. With all the focus on precise matching lately it wasn't obvious to me that we up-match. But the conversion from Taker to Maker still seems like an implementation detail to me. Anyway, I'm actively using pair programming sessions now and I'll be sure to invite you when we come to the nitty-gritty of the WASM order matching design.

artemii235 commented 4 years ago

@ArtemGr Thank you! I was just too eager about precision and consistency, so I can understand where your impression comes from :slightly_smiling_face: When I was coding the current ordermatching I kept in mind that on a CEX I was always matched with a better price even if I manually set a non-best price, so I implemented the same in MM2, in line with our CEX-like UX goal.

artemii235 commented 4 years ago

But the conversion from Taker to Maker still seems like an implementation detail to me. - yes, actually it's an implementation detail since it's the only mode of MM2 operation now, but it's better to refactor it, allowing the user to choose and possibly even change the default mode.

artemii235 commented 4 years ago

I've been thinking about it and I guess there's a potential problem with 3rd-party DHT implementation usage: how may we prevent malicious nodes from filling the DHT price vector with completely invalid values? Even when we have a reputation system, MM2 nodes will know that a specific pubkey is banned, but how will the DHT network know that this pubkey is banned and must not have the ability to place orders at all? Is it realistic that such pubkey(s) will be constantly filling the price vector with invalid data so the entire orderbook gets stuck?

jl777 commented 4 years ago

packets need to be signed. pubkeys with no history are only used as a last resort, and the available coin balance can be checked. the DHT could be filled with invalid data, but the mm2 doesn't have to query it, does it?

the price vector i imagine is pubkey, prices[], signature, timestamp, bestknown price

maybe a few more fields, but probably oriented toward pubkey, so other nodes can find out this pubkey has a pricevector of [...]. then the issue is which pubkeys we care about, but that would be based on the bestknown price data
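The fields above can be written down as a struct to see the wire-size implications; all field names and sizes here are my illustrative assumptions, not an MM2 format. Notably, the fixed fields alone already exceed the 100-byte figure discussed later in this thread.

```rust
/// Hypothetical layout of the per-pubkey price vector described above.
/// Field names and sizes are guesses for illustration, not a spec.
#[allow(dead_code)]
struct PriceVector {
    pubkey: [u8; 33],           // compressed secp256k1 pubkey
    prices: Vec<(String, u64)>, // (pair, price) quotes from this pubkey
    timestamp: u64,
    best_known: u64,            // best price this node has seen
    signature: [u8; 64],        // signature over all the fields above
}

fn main() {
    // Rough wire-size estimate of just the fixed fields:
    // 33 (pubkey) + 64 (signature) + 8 (timestamp) + 8 (best_known).
    let fixed = 33 + 64 + 8 + 8;
    assert!(fixed > 100);
    println!("fixed overhead per vector: {} bytes", fixed);
}
```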

artemii235 commented 4 years ago

the price vector i imagine is pubkey, prices[], signature, timestamp, bestknown price - didn't we come to the conclusion that to save space, due to the very limited DHT chunk size, we should omit everything except the price itself and a short order identification for further lookup? So it's first something like [price, id], and only then do we look for the order info by this id? According to Artem's comment:

Along with price we still need to store at least volume, order creator pubkey and signature, which is already over 100 bytes per order

Not really, we can store the price and a small random value that allows a peer to find more information about that order in the DHT.
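The compact [price, id] form can be sketched as follows, assuming a chunk header carrying the minimum price and each order reduced to a small price delta plus a short lookup id. The encoding details (delta against the chunk minimum, 2-byte fields) are invented here for illustration.

```rust
use std::convert::TryInto;

/// Hedged sketch of the compact chunk idea: each entry is only 4 bytes,
/// with prices delta-encoded against the chunk's min price; full order
/// info would be fetched from the DHT by the short id.
fn encode_chunk(min_price: u64, entries: &[(u64, u16)]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&min_price.to_le_bytes()); // 8-byte header
    for &(price, id) in entries {
        let delta = (price - min_price) as u16; // assumes the spread fits u16
        out.extend_from_slice(&delta.to_le_bytes());
        out.extend_from_slice(&id.to_le_bytes());
    }
    out
}

fn decode_chunk(buf: &[u8]) -> (u64, Vec<(u64, u16)>) {
    let min_price = u64::from_le_bytes(buf[0..8].try_into().unwrap());
    let mut entries = Vec::new();
    for chunk in buf[8..].chunks_exact(4) {
        let delta = u16::from_le_bytes(chunk[0..2].try_into().unwrap()) as u64;
        let id = u16::from_le_bytes(chunk[2..4].try_into().unwrap());
        entries.push((min_price + delta, id));
    }
    (min_price, entries)
}

fn main() {
    let entries = [(100_000u64, 7u16), (100_250, 8), (101_000, 9)];
    let buf = encode_chunk(100_000, &entries);
    assert_eq!(buf.len(), 8 + 4 * entries.len()); // 4 bytes per order
    let (min, decoded) = decode_chunk(&buf);
    assert_eq!(min, 100_000);
    assert_eq!(decoded, entries.to_vec());
}
```

at 4 bytes per order this fits dozens of orders into a small DHT chunk, versus just a couple of orders at 100+ bytes each.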

ArtemGr commented 4 years ago

I've been thinking about it and I guess there's a potential problem with 3rd-party DHT implementation usage: how may we prevent malicious nodes from filling the DHT price vector with completely invalid values?

there is no theoretical difference whatsoever between a 3rd-party implementation and an ad-hoc implementation in terms of tampering; the same attack vectors can be used on both. in practical terms an ad-hoc implementation is more likely to have flaws and shortcuts that can be used against it, as compared to a public network that has survived for a dozen years in a hostile environment. bittorrent DHT uses encryption and a number of heuristics I haven't yet looked into to protect itself from malicious tampering, BTW.

thanks for bringing this issue up though, might be handy as a reminder

didn't we come to conclusion

there can be no conclusions in these discussions, it is not a proper medium for designing complex protocols. conclusion = working implementation. conclusion = implementation that survives the evolutionary process and the test of time. here we are sharing priorities, ideas and concerns


i wonder if here we have a good example of a virtual orderbook, from the user experience perspective: https://hitbtc.com/KMD-to-BTC

artemii235 commented 4 years ago

ad-hoc implementation is more likely to have flaws and shortcuts that can be used against it as compared to a public network that survived for a dozen years in a hostile environment.

I'm not talking about an attack on the DHT itself, I'm talking about the fact that the DHT doesn't know anything about our specific rules and just can't follow them by design. How will the DHT know that it must not accept an order from a malicious pubkey? This pubkey won't look malicious to the DHT, so the DHT will accept it, but it might broadcast data that is malicious for MM2 nodes.

artemii235 commented 4 years ago

there can be no conclusions in these discussions, it is not a proper medium for designing complex protocols. conclusion = working implementation. conclusion = implementation that survives the evolutionary process and the test of time. here we are sharing priorities, ideas and concerns

Ok, we came to the idea that we might store just the price and a short id in the price vector, but now we're coming back to the idea that it should possibly contain the signature and pubkey, which makes an order vector item over 100 bytes in size, which is impractical?

ArtemGr commented 4 years ago

DHT doesn't know anything about our specific rules

most of the time you don't need to change a hashmap implementation to add rules to it. you can use a wrapper class instead

and if you need to change a hashmap, you might be better off changing an existing one instead of reinventing the wheel
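A minimal sketch of the wrapper idea, assuming MM2 keeps its own ban list (all types invented for illustration): the underlying store accepts whatever it is given, and the application-level wrapper is where pubkey rules are enforced.

```rust
use std::collections::{HashMap, HashSet};

/// Sketch of the wrapper idea above: application rules (e.g. banned
/// pubkeys) live in a thin layer over the storage, not inside the DHT.
struct FilteredStore {
    inner: HashMap<String, Vec<u8>>, // stand-in for the DHT itself
    banned: HashSet<[u8; 33]>,       // MM2-level reputation data
}

impl FilteredStore {
    fn new() -> Self {
        FilteredStore { inner: HashMap::new(), banned: HashSet::new() }
    }

    fn ban(&mut self, pubkey: [u8; 33]) {
        self.banned.insert(pubkey);
    }

    /// The raw store would accept anything; the wrapper refuses entries
    /// from banned pubkeys, which is where MM2's rules apply.
    fn put(&mut self, key: String, from: [u8; 33], value: Vec<u8>) -> bool {
        if self.banned.contains(&from) {
            return false;
        }
        self.inner.insert(key, value);
        true
    }
}

fn main() {
    let mut store = FilteredStore::new();
    let good = [0x02u8; 33];
    let bad = [0x03u8; 33];
    store.ban(bad);
    assert!(store.put("order-1".into(), good, vec![1, 2, 3]));
    assert!(!store.put("order-2".into(), bad, vec![4, 5, 6]));
    println!("banned pubkey rejected by the wrapper, not the store");
}
```

the same filter can be applied on the read path, so even data a remote DHT node accepted is ignored locally.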

but that's just words. if you have a good from-scratch implementation (that works in webview and browsers via webrtc) then by all means, let's use it. if you have an idea for it, I'd like you to share it

Ok, we came to the idea that we might store just the price and a short id in the price vector, but now we're coming back to the idea that it should possibly contain the signature and pubkey, which makes an order vector item over 100 bytes in size, which is impractical?

not really.

option 1 (chunk was): unknown header, array of (full price, etc -> order)

option 2 (chunk was): unknown header, maximum and minimum prices, compressed array of (price, small random number)

note that i didn't delve into the specifics of such a compression

option 3 (chunk now is): signature and encryption stuff in the header, -ibid-