KomodoPlatform / komodo-defi-framework

This is the official Komodo DeFi Framework repository
https://komodoplatform.com/en/docs/komodo-defi-framework/
104 stars 94 forks source link

Request the transaction history from UTXO RPCs only when balance is changed. #612

Closed artemii235 closed 4 years ago

artemii235 commented 4 years ago

https://github.com/KomodoPlatform/atomicDEX-API/blob/9884f181d8f990f57208484f1fdb20e9ed0e0882/mm2src/coins/utxo.rs#L1485

The tx_history fetching loop starts as background thread to download address transaction history providing instant access from local database. It might consume a lot of traffic in Electrum case - blockchain-scripthash-get-history method always returns full history of address which might contain a lot of records. As of now this request is repeated every 30 seconds again and again even if there're no new records.

We can easily avoid it by checking balance - the result has way lower size, so in general it'll look like:

Always request history on first loop iteration, also request balance, remember it.
On all next iterations request balance first and then request history only if balance is changed
artemii235 commented 4 years ago

Would also nice to measure traffic usage using address with significant number of transactions before making optimizations. Then measure it again after optimization is done and provide results here. It might be done as internal metric collecting the data transferred by ElectrumClient, specific design decisions are up to you, we may discuss them in person when you start working on this task.

ArtemGr commented 4 years ago

In https://github.com/ca333/komodoDEX/issues/349 we're going to gather such metrics via decentralized gossip. Having the history traffic measurements available to us would be very useful (and will open the data for SQL analysis, such as comparison between coins and MM versions).

The easiest way to share these measurements with the gossiping UI code is probably to log them. The log subsystem has a tag format that is intended as a stable medium for UI to parse (and we're using it in AtomicDEX mobile UI to learn of the swap progress, for example)

https://github.com/KomodoPlatform/atomicDEX-API/blob/34dd28dbc82b2f0dd7fcd2d49d920c11b5be3028/mm2src/common/log.rs#L300

https://github.com/KomodoPlatform/atomicDEX-API/blob/34dd28dbc82b2f0dd7fcd2d49d920c11b5be3028/mm2src/common/log.rs#L591-L593

sergeyboyko0791 commented 4 years ago

There are several implementation questions:

What should be the emotion, the tags are right and are useful?

ArtemGr commented 4 years ago

So the history loading is something that has a beginning, works for some time and then has an end, right? I mean, once we loaded a history for a coin we don't need to do that anymore, or if we do, it's unlikely to take much traffic? If yes, then this sounds like a good fit for a dashboard entry. Dashboard will print the current version of the status periodically and then will print it last final time when the operation is marked as finished (IIRC). Like it does for the state of the swap.

And it was designed specifically to track ongoing continuous operations with minimal interference to the normal tracing / signal-oriented log.

Should have probably mentioned it earlier.

But I offer to record once a minute count of sent bytes sounds a lot like what dashboard does, except it groups these continous operations together, tells us how long an op was active, separates them from the signal-oriented log.

Should we calculate not only payload bytes but also IP headers?

Most probably not. If history loading takes a lot of traffic then we'll be thinking on how to reduce it but we won't be thinking about what IPs it uses. But we should probably add the coin's ticker to the tags.

Should only Electrum traffic measure be logged? What about Eth and Native?

Mobile UIs not using native, but we'll be needing the ETH stats just as much as the UTXO ones.

[traffic electrum send] 62

The parsable data should all be in tags, like [history coin=KMD tx=62 rx=105], and the human-readable part should be human-readable, so the full dashboard entry might look like [history coin=KMD tx=62 rx=105] Loading KMD history

cf. https://github.com/ca333/komodoDEX/blob/c2ab90a6c4d60df79f8437e966a703fd069ac9cd/lib/services/mm_service.dart#L453

P.S. I think that dashboarding the background operations in MM is a good idea even regardless of the traffic measurement, and to boot it saves us from doing extra periodic requests to MM, but I should probably mention that it's also feasible to implement this by adding the traffic fields to https://developers.atomicdex.io/basic-docs/atomicdex/atomicdex-api.html#my-tx-history ?

artemii235 commented 4 years ago

So the history loading is something that has a beginning, works for some time and then has an end, right? I mean, once we loaded a history for a coin we don't need to do that anymore, or if we do, it's unlikely to take much traffic?

History loading repeats over time, we need to fetch it again and again from coin RPC when balance is updated. The balance check is now missing so tx_history_loop fetches history every 30 seconds from Electrum which always returns full transactions history without paging support (that is already mentioned in the initial description of the issue).

How often we should record to log the measure? I added the log recording on each sent and received packet. I offer to record once a minute count of sent bytes. Else we can record on each specified send/received bytes (eg every 5 kibibytes)

As Artem mentioned you may create a status similar to https://github.com/KomodoPlatform/atomicDEX-API/blob/7d10c66c033ae4e6cfccaea318e2bddc3db7ea35/mm2src/lp_swap/maker_swap.rs#L1018 https://github.com/KomodoPlatform/atomicDEX-API/blob/7d10c66c033ae4e6cfccaea318e2bddc3db7ea35/mm2src/lp_swap/maker_swap.rs#L1036 So it will be periodically printed to log unless status handle is dropped.

But what I'm unsure of: we may have dozens of active coins, won't the log be overloaded with dozens of periodical messages since history fetching loop actually never ends? E.g.

[history coin=KMD tx=62 rx=105]
[history coin=RICK tx=62 rx=105]
[history coin=MORTY tx=62 rx=105]
...
[history coin=BTC tx=62 rx=105]

Should we calculate not only payload bytes but also IP headers?

IP headers size calculation is not mandatory, it's negligible overhead.

Should only Electrum traffic measure be logged? What about Eth and Native?

It will be also nice, we can extend this idea not only to coin RPC clients traffic measurement, we actually need other metrics, e.g. ordermatching protocol traffic usage, and these metrics might be not only network related. We possibly need an abstraction for such purpose, there're also metrics and more widely used prometheus crate. I think it might be nice feature to integrate a widely used OS solution for metrics collection, it already has pretty visualization capabilities. But we should be sure that our metrics collection works without the requirement to setup 3rd party solution server.

ArtemGr commented 4 years ago

History loading repeats over time, we need to fetch it again and again from coin RPC when balance is updated. The balance check is now missing so tx_history_loop fetches history every 30 seconds from Electrum which always returns full transactions history without paging support

I see. Nasty! Thanks for the reminder! Yes, printing this for all the coins every 30 seconds is not what I had in mind, but printing a sum [t-history tx=62 rx=105] Loading coin transaction history via dashboard might work and will inform the MM users of this recurring activity, its timing and network overhead.

The metrics crate is rad.

sergeyboyko0791 commented 4 years ago

As far as I understand the metrics crate suits better for our purposes than the prometheus. The first one doesn't require a 3rd party server and allows to easily implement a custom metrics exporter. Eg records the metrics via dashboard.

But what I'm unsure of: we may have dozens of active coins, won't the log be overloaded with dozens of periodical messages since history fetching loop actually never ends? E.g.

The best way I see is collect metrics and record it every specified time (eg 5 min or less for the tests). It'll allow record the history traffic value of each coin and to avoid the log overload.

[history coin=KMD tx=62 rx=105] Loading KMD history

Recording the log entry above may be little difficult, because the count of request and response bytes should be passed from corresponding connection loop to the RPC client API. But it may be useful later, not only for measuring the history request traffic. If it is planned in the future, the way is suitable for this. Else I can add other history metrics:

[history coin=KMD tx_requested=100 tx_detail_requested=8] Loading KMD history

where the tx_requested is size of tx history list (the list contains just tx hashes) received over the time, the tx_detail_requested is count of tx detail received over the time for the history purposes.

I also suggest to measure and record all RPC client traffic.

ArtemGr commented 4 years ago

A couple of ideas.

cc @cipig

cipig commented 4 years ago

I would like to avoid running additional stuff on the electrums, if that is possible. Have we already tried https://electrumx.readthedocs.io/en/latest/protocol-methods.html#blockchain-scripthash-subscribe ? Do we really need to query the tx_history all the time? It is not needed to do swaps, just to show txes in the app. Can we get this info from an explorer instead? The explorers are designed for such requests.

artemii235 commented 4 years ago

@cipig I agree, Electrum HTTP proxy is not mandatory.

Using HTTP means the communication will be more reliable on mobile operators (some of which aren't as friendly to non-HTTP traffic)

AFAIK we didn't get any reports that Electrum connectivity is getting blocked by any provider.

Can we get this info from an explorer instead? The explorers are designed for such requests.

Sure, we may integrate block explorers API too in the future (e.g. Insight).

Have we already tried https://electrumx.readthedocs.io/en/latest/protocol-methods.html#blockchain-scripthash-subscribe ? Do we really need to query the tx_history all the time?

As I can see from docs blockchain-scripthash-subscribe notifies us about status update which is actually sha256 of scripthash transactions list, so we will need to request tx_history anyway to understand what exactly has changed.

And in summary the idea of such tx history fetching from Electrum was to avoid dependency on 3rd party blockchain explorer services. So to have history you need to run Electrum open source server only which has standardized API. Whereas each block explorer may have it's own API, some of them are closed source, may limit requests to them etc.

ArtemGr commented 4 years ago

AFAIK we didn't get any reports that Electrum connectivity is getting blocked by any provider.

And we won't, until our network is larger. By which time redesigning this will be much harder.

We know such providers exist though. For example: "Claro is choking access to TCP/UDP ports because of paranoid policies on their NAT" - https://wiki.vuze.com/w/Bad_ISPs. The two-layered decentralized technology we're developing with caretakers is consequently designed to prefer HTTP, which also benefits future WASM ports and simplifies our mobile and web stacks.

I would like to avoid running additional stuff on the electrums, if that is possible.

I take it the tx history information is public and hence we can proxy it from any server, not necessarily from the electrum ones?

Can we get this info from an explorer instead? The explorers are designed for such requests.

If the transaction history happens to take much traffic then we'll have to redesign it per https://github.com/ca333/komodoDEX/issues/349. One of the possible routes there is to opt out of the MM history retrieval and switch to explorers or to caretaker proxies. Which explorers should we look at?

artemii235 commented 4 years ago

One of the possible routes there is to opt out of the MM history retrieval and switch to explorers or to caretaker proxies. Which explorers should we look at?

There was a request some time ago to use open-source Insight explorer API as 1 of UTXO coins operational mode, it's in backlog as of now, it will be implemented later - https://github.com/KomodoPlatform/atomicDEX-API/issues/618

AFAIK we didn't get any reports that Electrum connectivity is getting blocked by any provider. And we won't, until our network is larger. By which time redesigning this will be much harder. We know such providers exist though. For example: "Claro is choking access to TCP/UDP ports because of paranoid policies on their NAT" - https://wiki.vuze.com/w/Bad_ISPs. The two-layered decentralized technology we're developing with caretakers is consequently designed to prefer HTTP, which also benefits future WASM ports and simplifies our mobile and web stacks.

Electrum servers can use 80/443 ports, Electrums support WS protocol which is fine for pure browser environments. So even if all non-HTTP ports are blocked we can solve the problem by simple reconfiguration. Users also can use TOR proxy or VPNs to mitigate the problem. And I would like to deal with 1 thing at time, this issue is about history fetching optimization with Electrum operational mode and corresponding metrics collection, not about Electrum can be potentially blocked.

artemii235 commented 4 years ago

And what is also important: we're not only persons using Electrum. Electrum based wallets and server is popular software that is used by millions people to manage their coins, I guess if there were any massive blocking attempts there would be some news about that, however I'm not able to find such reports by searching isp block electrum servers or isp blocks electrum. Some people report of course that their wallet won't connect but there's no proof that such connection is blocked by their ISP.

ArtemGr commented 4 years ago

I cede you the point, Artem. In the realm of cryptocurrency, Electrum is among the more used technologies, it is Firewall friendly, using the 80/443 ports, and we might very well follow its pattern.

(Speaking of which, is there already an issue to use Electrum ports 80/443 in MM?)

And it follows that we don't need to proxy Electrum in its entirety.

But Electrum pagination is a known problem which brings us back to the issue at hand. Any thoughts on proxying Electrum tx history in order to reverse, paginate and compress it?

P.S.

I guess if there were any massive blocking attempts there would be some news about that

Human behavior follows certain patterns or established social habits ("rules of fair play" per Jordan Peterson). One of such patterns is that people talk about providers in the forums that discuss the providers. It is exceedingly rare to see provider gossip overflowing into subject of specific technologies and their implementation. Then again, as software designers we might need to be a bit smarter than that.

Note also, that we're not talking so much about blocking but about traffic prioritization, which is actually hard to detect, sneaky.

And if something does exists, it does not follow that you (or any of us) have heard about it. That would be a false premise and I sincerely advise you to use a better logic.

artemii235 commented 4 years ago

(Speaking of which, is there already an issue to use Electrum ports 80/443 in MM?)

Not yet, it requires additional DNS configuration, current electrum servers addresses maintained by cipi follow the electrum1-3.cipig.net:PORT pattern, it should be changed to e.g. kmd.electrum1.cipig.net:80 etc. @cipig is it possible to add such DNS for electrums keeping old addresses available too?

And if something does exists, it does not follow that you (or any of us) have heard about it. That would be a false premise and I sincerely advise you to use a better logic.

Ì'm not thinking that if I didn't heard about something it doesn't exist, I'm making an assumption that if there were any massive (which is very important word here) block it's highly probable that someone would report this. But of course, "Bad ISP DB" reporting some providers tend to block any non-HTTP ports is sufficient argument to prepare for this and use 80/443 ports.

It is exceedingly rare to see provider gossip overflowing into subject of specific technologies and their implementation.

Vuse "bad ISPs DB" is example of such overflowing 🙂 And there's similar community supported list by TOR. Maybe something similar already exists in context of blockchain-related projects. Or maybe we will create something by ourselves sometime 🙂

cipig commented 4 years ago

Not yet, it requires additional DNS configuration, current electrum servers addresses maintained by cipi follow the electrum1-3.cipig.net:PORT pattern, it should be changed to e.g. kmd.electrum1.cipig.net:80 etc. @cipig is it possible to add such DNS for electrums keeping old addresses available too?

That will not work, since all coins are running on the same server (or 3 of them), and each server has only one port 80.

artemii235 commented 4 years ago

But Electrum pagination is a known problem which brings us back to the issue at hand. Any thoughts on proxying Electrum tx history in order to reverse, paginate and compress it?

@ArtemGr I think It's not a big issue if we fetch the history only when balance is changed. Also there's a protocol idea to add method that will allow to fetch history from specific block height: https://electrumx.readthedocs.io/en/latest/protocol-ideas.html#blockchain-scripthash-history, so when this is implemented we will save the block number for which we have the history already and request only new transactions afterwards. Even if this method is not implemented we can add it by ourselves to at least cipi ElectrumX fork, update our electrums and then submit a PR to the base repo (hoping it will be accepted so more coins will support it). But before taking such actions I would like to start collecting metrics first and then base our decisions on them.

ArtemGr commented 4 years ago

Good idea on collecting the metrics! Having them will be a nice addition to the "order liveliness" package. I think we can get them already by parsing the logs, such as

00:00:16.760 mm_service:421] mm2] 19 21:00:16, rpc_clients:1261] Electrum client connected to electrum3.cipig.net:10001
00:01:16.764 mm_service:421] mm2] 19 21:01:16, rpc_clients:1296] "electrum3.cipig.net:10001" error "rpc_clients:1218] Didn\'t receive any data since 1587330016. Shutting down the connection."
00:01:16.807 mm_service:421] mm2] 19 21:01:16, rpc_clients:1261] Electrum client connected to electrum3.cipig.net:10001

added to my list.

sergeyboyko0791 commented 4 years ago

@ArtemGr I want to do sort the tags while converting &[&TagParam] to Vec<Tag>. https://discordapp.com/channels/@me/687147433003974657/705371359307497513

This comment is two years old. Can you say is it important to keep tags unordering? if I added tags sorting, UI applications would work correctly?

ArtemGr commented 4 years ago

Might affect https://github.com/ca333/komodoDEX/blob/e3a1d2ab39edb7719ee58f829d11a910277f1705/lib/services/mm_service.dart#L453, but it's NP as currently the UI is bundled with a specific version of MM, so we'll just update the UI code if the ordering of swap tags is flipped.

artemii235 commented 4 years ago

Implemented, closing.

ArtemGr commented 4 years ago

Any hints on how we can use the new measurement functionality to measure MM traffic?

artemii235 commented 4 years ago

The documentation has been published recently, I think recommended way for GUIs is to call API and parse the metrics from JSON: https://developers.komodoplatform.com/basic-docs/atomicdex/atomicdex-tutorials/atomicdex-metrics.html#api-calling

ArtemGr commented 4 years ago

Awesome, thanks!

sergeyboyko0791 commented 4 years ago

@ArtemGr, I decided not to reorder the dashboard tags. I hope you haven't changed GUI's code.

ArtemGr commented 4 years ago

@sergeyboyko0791, Nope, everything's intact. Appreciate the update. P.S. Looked at the PRs, but there was too much for me to parse D