bisq-network / projects

@bisq-network project management
https://bisq.wiki/Project_management

Refactor send-message business logic in Connections #28

Closed freimair closed 3 years ago

freimair commented 4 years ago

_This is a Bisq Network project. Please familiarize yourself with the project management process._

Description

The business logic for sending messages needs some love. During https://github.com/bisq-network/bisq/pull/4047, it became apparent that the logic might fail to send some messages entirely. Main takeaways from the project:

Rationale

Why is it important?

IMO:

Why should it be done now? What will happen if we don't do it or delay doing it?

however,

Criteria for delivery

Measures of success

Risks

Tasks

Estimates

Hard to say, as the project will only show its true face once we are knee-deep in it.

| Task | Amount [USD] |
| --- | --- |
| create test suite | 1800,00 |
| message queue | 900,00 |
| remove "external" scheduling | 1200,00 |
| testing | 700,00 |
| other | 500,00 |
| total | 5100,00 |

Notes

chimp1984 commented 4 years ago

> short term: maybe get rid of spurious message loss (during trade or mediation)

I doubt that it has much of an impact on that. At shutdown of headless nodes there is/was no graceful shutdown, but they don't send crucial messages (trade, dispute). GUI clients do a graceful shutdown, and the only case where it might be critical is when a crucial message was sent and the user immediately shuts down the app (or kills it hard). Even with a graceful shutdown there should be enough time to deliver the messages. The messages queued up are usually not many (batching did not work as expected, and most of the time there is no batching).

freimair commented 4 years ago

> Even with a graceful shutdown there should be enough time to deliver the messages.

Actually, before https://github.com/bisq-network/bisq/pull/4047, even with a "graceful shutdown", Tor was terminated before all messages had been flushed out. No RemoveOfferMsg, no CloseConnectionMsg. In addition, there is no central queue, and that can have severe consequences. Here is the scenario:

  1. A critical message might be "queued" in a UserThread.runAfter(>0, connection.sendMessage(.))
  2. Thus, the connection does not know about the message yet
  3. The business logic triggering the UserThread.runAfter assumes the message has been sent (and, depending on the message, may also memorize that information)

Now, if the client is shut down before the message is sent, the business logic has no way of knowing that the message hasn't been sent (so there is no resend). Thus, we have a "lost" message.
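To make the scenario concrete, here is a minimal Java sketch. It is an illustration under assumptions, not Bisq's actual code: `Connection`, `DelayedScheduler`, `enqueue`, `flush` and `shutDown` are hypothetical stand-ins, and the scheduler only mimics the idea of `UserThread.runAfter` by running tasks when `tick()` is called. It contrasts "external" scheduling, where the connection never learns about the pending message, with a central queue owned by the connection that can be flushed on shutdown.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class SendQueueSketch {

    /** Hypothetical stand-in for a connection; records what actually went out. */
    static class Connection {
        final List<String> delivered = new ArrayList<>();
        private final Queue<String> outbox = new ArrayDeque<>();
        private boolean open = true;

        // Sends immediately; silently drops the message once the connection is closed.
        void sendMessage(String msg) {
            if (open) delivered.add(msg);
        }

        // Central queue: the connection knows about the message right away.
        void enqueue(String msg) {
            outbox.add(msg);
        }

        void flush() {
            while (!outbox.isEmpty()) sendMessage(outbox.poll());
        }

        // Graceful shutdown: flush everything the connection knows about, then close.
        void shutDown() {
            flush();
            open = false;
        }
    }

    /** Hypothetical stand-in for delayed scheduling; tasks run only when tick() is called. */
    static class DelayedScheduler {
        private final List<Runnable> pending = new ArrayList<>();

        void runAfter(Runnable task) {
            pending.add(task);
        }

        void tick() {
            pending.forEach(Runnable::run);
            pending.clear();
        }
    }

    // The message waits in the scheduler, so the connection cannot flush it on shutdown.
    static List<String> externalScheduling() {
        Connection connection = new Connection();
        DelayedScheduler scheduler = new DelayedScheduler();
        scheduler.runAfter(() -> connection.sendMessage("RemoveOfferMsg"));
        // At this point the business logic already assumes the message was sent.
        connection.shutDown(); // graceful, but the outbox is empty
        scheduler.tick();      // the delay expires after shutdown: message is dropped
        return connection.delivered;
    }

    // The message is handed to the connection immediately and flushed on shutdown.
    static List<String> centralQueue() {
        Connection connection = new Connection();
        connection.enqueue("RemoveOfferMsg");
        connection.shutDown();
        return connection.delivered;
    }

    public static void main(String[] args) {
        System.out.println("external scheduling delivered: " + externalScheduling());
        System.out.println("central queue delivered: " + centralQueue());
    }
}
```

In this sketch the externally scheduled message never reaches the connection, while the queued one is delivered during shutdown. A real implementation would also need to handle threading, timeouts and send failures, which are left out here.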

> The messages queued up are usually not many

Give it enough time and trials and it will happen.

> At shutdown of headless nodes there is/was no graceful shutdown

The issue we have been facing here is that the data store files frequently got corrupted. A graceful shutdown helped somewhat. However, #25 and #29 will complement this very issue.

chimp1984 commented 3 years ago

@ripcurlx @cbeams Can we close that project?

cbeams commented 3 years ago

Closing as rejected.