KomodoPlatform / komodo-defi-framework

This is the official Komodo DeFi Framework repository
https://komodoplatform.com/en/docs/komodo-defi-framework/
103 stars 94 forks source link

pricesattempt to subtract with overflow #315

Closed ArtemGr closed 1 year ago

ArtemGr commented 5 years ago

Stack trace: https://gist.github.com/Kiruel/00e52d08328ca8f097111a62977b32ab. Discord original. The source code is probably at bb94f1b548c68e6b9fead3f1505c8f963878dee1 (in the mm2-cross branch).

artemii235 commented 5 years ago

Are there 2 panic messages that overlap each other? :slightly_smiling_face: thread 'COREthread '' panicked at 'pricesattempt to subtract with overflow' panicked at '', attempt to subtract with overflowmm2src/coins/utxo/rpc_clients.rs', :mm2src/portfolio/portfolio.rs:7231177:24 There 2 stack traces that are actually related to time difference calculation: portfolio.rs - if now_ms() - last_price_broadcast > 10000 rpc_clients.rs - if now_ms() / 1000 - last as u64 > ELECTRUM_TIMEOUT I expect that this can happen only if system time went backwards for some reason, is it?

ArtemGr commented 5 years ago

cc @Kiruel

> I expect that this can happen only if system time went backwards for some reason, is it?

Looks like this might be possible.

There's an account of this code being too complex to be reliable:

"I worked on the kernel clock functions for a while. The way it's supposed to work is that the MONOTONIC clock never goes backwards. That's sort of by definition. In practice, MONOTONIC is implemented as an offset from REALTIME. Any time the realtime clock is adjusted, or the system sleeps, this offset is adjusted. Under the wrong circumstances, the offset could be miscalculated, resulting in a jump -- even a backwards jump -- of the MONOTONIC clock. The built-in realtime clock in most hardware only provides time to the nearest second, while Linux keeps time internally to the nearest microsecond or nanosecond. Linux tries to resolve these issues when going to sleep and waking up, but the logic is quite convoluted and I don't understand it. Add in adjustments to REALTIME caused by NTP adjustments, user time adjustments, and who knows what else, and you have a recipe for chaos. Bottom line is that I'm not really surprised that MONTONIC sometimes goes backwards." – Edward Falk, https://stackoverflow.com/questions/3657289/linux-clock-gettimeclock-monotonic-strange-non-monotonic-behavior

And there's at least one known issue: https://stackoverflow.com/a/3657433/257568. There might be kernels having it in the wild.

And we're not even using the MONOTONIC clock, we're using the normal UNIX time which has no guarantees, buggy or otherwise, of not going backwards.

We should probably add a monotonic clock function to common, for algorithms requiring the monotonic time. Simply checking against a global atomic integer might be simpler and more reliable than using the kernel's MONOTONIC clock. Thoughts?

artemii235 commented 5 years ago

@ArtemGr Adding MONOTONIC clock function to common will help us with our own code that we have good knowledge and full control of. But can we be sure that all dependencies that we rely on would work well if there is gap in system time for any reason? What if MM2 is started with incorrect system time, but it changes to correct then, does it mean that our monotonic clock will return incorrect time in such case? Moreover checking time correctness might be even more important, bad things can happen if time is not correct from the beginning and MM keeps using it. Summary:

  1. The most important is to check time correctness if possible (using external sources maybe, but there are also concerns about trusting a 3rd party). If time is not correct on MM2 start we should not allow it to run.
  2. Monotonic clock can be useful too, but it should start from correct point. Our code should work fine then, but would 3rd party work?
ArtemGr commented 5 years ago

Some distributed systems depend on clock time synchronization, while others use their own virtual clocks (1, 2, 3, ...) to determine the ordering of the distributed operations. The latter ones would see no breakage from a diverging clock time. I'm curious, do we use anything specific that requires clock time synchronization? Electron? Blockchains?

Anyhow, loading our own reliable time when mm2 starts is a nice idea. But we should probably track it as a separate issue. Specifically, incorrect clock time doesn't trigger the overflow panic with which we are concerned here (the time might be wrong but monotonic). And we can have a correct clock time (withing the reasonable +1/-1 minute marging) on a phone and yet still see this bug triggered due to kernel changing the time in a non-monotonic fashion.

Our code should work fine then, but would 3rd party work?

I don't think we have a lot of choice here apart from removing such failing dependencies or fixing them to use a monotonic time function or into being lax towards the time going backwards.