bisq-network / bisq-pricenode

Exchange rate provider ceased operation on one pricenode. #33

Open ghost opened 1 year ago

ghost commented 1 year ago

Problem description

The poloniexTs timestamp from devinpndv... fell far behind all other timestamps, e.g.:

"poloniexTs" : 1696048031673,
"poloniexCount" : 8,
"coingeckoTs" : 1697130011586,
"coingeckoCount" : 46,
Epoch date (s)   Human-readable date (GMT)
1696048031       2023-09-30 04:27:11
1697130011       2023-10-12 17:00:11
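
For reference, the conversion above can be reproduced with a minimal Java sketch. The class and method names (`TimestampGap`, `toGmt`, `gap`) are hypothetical and not part of pricenode; only the two millisecond timestamps come from the output quoted above.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of bisq-pricenode.
class TimestampGap {
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

    // Formats an epoch-millisecond timestamp as a GMT/UTC date string.
    static String toGmt(long epochMillis) {
        return FMT.format(Instant.ofEpochMilli(epochMillis));
    }

    // Time elapsed between an older and a newer epoch-millisecond timestamp.
    static Duration gap(long olderMillis, long newerMillis) {
        return Duration.ofMillis(newerMillis - olderMillis);
    }

    public static void main(String[] args) {
        long poloniexTs = 1696048031673L;   // value from the pricenode output above
        long coingeckoTs = 1697130011586L;  // value from the pricenode output above
        System.out.println("poloniexTs  = " + toGmt(poloniexTs));
        System.out.println("coingeckoTs = " + toGmt(coingeckoTs));
        System.out.println("gap (whole days) = " + gap(poloniexTs, coingeckoTs).toDays());
    }
}
```

The gap between the two timestamps works out to a little over 12 days.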

I checked the logs, and the poloniex provider was not being called. It was entirely absent from the logs, whereas all other providers were logging their calls once per minute. The logs did not go back far enough to cover the point at which pricenode stopped invoking the poloniex provider, so there is no information about why it stopped.


Suggestions

  1. When pricenode detects that a timestamp is stale, it should not use the stale prices. I thought this had already been handled in https://github.com/bisq-network/bisq-pricenode/pull/19, but that PR covers the case where the provider's query returns no rates. Here the provider is not being called at all, so the last known rates remain in use.
  2. Restart pricenode once per 24h, similar to what is done in seednode, or
  3. Have pricenode detect that a timestamp has gone stale, and exit the process to start a new one.
  4. There is too much logging of outlier-filtering data. It fills up the logs too quickly and crowds out other useful log entries. The volume correlates with client requests. Change that logging so that it is output at a pace similar to the provider updates, i.e. once per minute.

I'll work on a PR to address this.
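
As a rough illustration of suggestion 1, staleness could be decided from each provider's last-update timestamp instead of from whether the last query returned rates. This is only a sketch under stated assumptions: the class `StaleProviderFilter`, the method `freshOnly`, and the 10-minute threshold are all hypothetical and do not reflect the actual pricenode code.

```java
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of bisq-pricenode.
class StaleProviderFilter {

    // Returns only the providers whose last update is no older than maxAge.
    // Keys are provider names, values are last-update times in epoch milliseconds.
    static Map<String, Long> freshOnly(Map<String, Long> lastUpdateMillis,
                                       long nowMillis,
                                       Duration maxAge) {
        Map<String, Long> fresh = new LinkedHashMap<>();
        for (Map.Entry<String, Long> entry : lastUpdateMillis.entrySet()) {
            long ageMillis = nowMillis - entry.getValue();
            if (ageMillis <= maxAge.toMillis()) {
                fresh.put(entry.getKey(), entry.getValue());
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        Map<String, Long> timestamps = new LinkedHashMap<>();
        timestamps.put("poloniex", 1696048031673L);   // stale value from the report above
        timestamps.put("coingecko", 1697130011586L);  // recent value from the report above

        long now = 1697130011586L + 1_000L;           // assume "now" is 1 s after the latest update
        Duration maxAge = Duration.ofMinutes(10);     // assumed threshold, chosen for illustration

        // Only providers within the threshold would contribute prices.
        System.out.println(freshOnly(timestamps, now, maxAge).keySet());
    }
}
```

With the timestamps from this report, only coingecko would pass such a check. The same age test could also drive suggestion 3, i.e. exiting the process when any provider exceeds the threshold.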

ghost commented 1 year ago

[UPDATE] Looks like this issue also happened on emzy node, this time the Binance feed ceased operation for approx 20 hours. At the time it stopped, the following error was reported:

Oct 28 23:55:47 pricenode bisq-pricenode[681]: #033[0;39m#033[31mOct-2823:55:47.197 [Timer-15] WARN  b.p.s.p.Binance: refresh failed java.lang.StackOverflowError: null
Oct 28 23:55:47 pricenode bisq-pricenode[681]: #011at java.base/java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1045)
Oct 28 23:55:47 pricenode bisq-pricenode[681]: message repeated 1023 times: [ #011at java.base/java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1045)]
Oct 28 23:55:47 pricenode bisq-pricenode[681]: #033[0;39mjava.lang.StackOverflowError: null
Oct 28 23:55:47 pricenode bisq-pricenode[681]: #011at java.base/java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1045)
Oct 28 23:55:47 pricenode bisq-pricenode[681]: message repeated 1023 times: [ #011at java.base/java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1045)]
Oct 28 23:55:47 pricenode bisq-pricenode[681]: #033[34mOct-28 23:55:47.228 [Timer-21] INFO  b.p.m.p.MempoolFeeRateProvider$First: Retrieved estimated mining fee of 10 sat/vB and economyFee of 4 sat/vB from mempool.space

After 20 hours it started polling Binance rates again as if nothing had happened. The outage caused the Binance rates to remain stale and resulted in some price discrepancies against other nodes.

alvasw commented 11 months ago

The JVM throws a StackOverflowError when recursive calls get too deep and the stack is exhausted. Here it seems that the refresh() calls deadlock or get stuck, and the Timer calls refresh() many times until the stack is exhausted. I created a PR that doesn't queue new refresh() calls while the previous one hasn't finished.

This doesn't fix the root cause! We should investigate why refresh() gets stuck.
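
The general idea of not starting a new refresh() while the previous one is still in progress can be sketched as follows. This is not the code from the PR; the wrapper class `NonOverlappingRefresh` and its `skippedCount()` method are hypothetical names used only for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper, not part of bisq-pricenode.
class NonOverlappingRefresh implements Runnable {
    private final AtomicBoolean running = new AtomicBoolean(false);
    private final AtomicInteger skipped = new AtomicInteger();
    private final Runnable refresh;

    NonOverlappingRefresh(Runnable refresh) {
        this.refresh = refresh;
    }

    @Override
    public void run() {
        // Claim the "running" flag atomically; if it is already set,
        // a previous refresh has not finished, so this call is skipped.
        if (!running.compareAndSet(false, true)) {
            skipped.incrementAndGet();
            return;
        }
        try {
            refresh.run();
        } finally {
            // Release the flag even if refresh throws (including Errors).
            running.set(false);
        }
    }

    int skippedCount() {
        return skipped.get();
    }

    public static void main(String[] args) {
        NonOverlappingRefresh[] holder = new NonOverlappingRefresh[1];
        // The wrapped task re-enters the guard to simulate an overlapping call.
        holder[0] = new NonOverlappingRefresh(() -> holder[0].run());
        holder[0].run();
        System.out.println("skipped calls: " + holder[0].skippedCount());
    }
}
```

A guard like this only prevents overlapping or nested invocations. It does not explain why a refresh gets stuck, and a refresh that never returns would leave the flag set and block all later refreshes.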

alvasw commented 11 months ago

PR https://github.com/bisq-network/bisq-pricenode/pull/43 potentially fixes the stale prices problem.