bisq-network / projects

@bisq-network project management
https://bisq.wiki/Project_management
9 stars 2 forks source link

Migrate datastores to real database #29

Closed freimair closed 3 years ago

freimair commented 4 years ago

_This is a Bisq Network project. Please familiarize yourself with the project management process._

Description

This project is about replacing the storage infrastructure with a real database we can query using SQL.

Rationale

Why is it important?

Until now, data stores have been persisted by serializing data to Protobuffer and then writing that to disk. Because of that and as the data stores grow, we see

This project is about replacing the storage infrastructure with a real database we can query using SQL.

Why should it be done now?

Needs to be streamlined and fixed before growth is hugely successful.

Aside from that it will address the following issues in one way or the other

probably

probably more

What will happen if we don't do it or delay doing it?

if we do not do it, CPU, RAM and IO issues will grow until Bisq no longer works. If we delay it, the user experience is affected even more.

Criteria for delivery

Measures of success

Risks

As always with changes in the P2P part, the risks are substantial.

Tasks

Estimates

Task Amount [USD]
create test suit 1900,00
select db technology 200,00
proof-of-concept implementation ~900,00~ 3400,00
make production ready ~2300,00~ 3800,00
testing ~700,00~ 5400,00
other 500,00
total ~6500,00~ 15200,00

EDIT: adjusted the numbers by factoring in recent developments and difficulties.

Notes

followup to https://github.com/bisq-network/projects/issues/25

wiz commented 4 years ago

As some background information, in case the reason why this is a critical bugfix is not directly obvious, is because recently we are seeing support cases from users with either invalid / corrupted payout TX, or more recently, the payout TX being deleted or lost by bisq internally somehow. We believe it is due to the datastore issues

freimair commented 4 years ago

well, new to me, but bring it on. the more bugs we can fight off with this the better.

freimair commented 4 years ago

@cbeams, @ripcurlx: Its been 3 weeks and we have 3 :+1: and I believe a decision is due. What is there to wait for?

Additionally, just to make it clearer, I would aim to allow for different SQL-compatible backends. Especially,

ripcurlx commented 4 years ago

@freimair Do you see any additional security risk if we are using a DB based storage?

freimair commented 4 years ago

@freimair Do you see any additional security risk if we are using a DB based storage?

chimp1984 commented 4 years ago

@freimair Do you see any additional security risk if we are using a DB based storage?

Any DB solution will add a huge dependency tree to bisq. So that is definitely a security risk. We should look which DB solutions are used in other Bitcoin projects (like Bitcoin Core, LN,....) - though they might have very different requirements.

ghost commented 4 years ago

It is being claimed that using a database will fix 6 critical bugs to do with high memory usage/exhaustion. From the little experimentation I and others have done it seems that JavaFX is the culprit when it comes to the memory use/leak issues. I don't see how using a database will solve those issues.

freimair commented 4 years ago

It is being claimed that using a database will fix 6 critical bugs to do with high memory usage/exhaustion.

That is not entirely correct. The claim is "will address the following issues in one way or the other". It will not fix these issues once and for all, however, it will contribute to lower resource requirements.

And yes, JavaFX is a major player when it comes to memory - needs fixing as well.

@chimp1984 agreed. However, if we do not use Hibernate or alike, we might just keep the dependencies to a minimum. Needs to be thoroughly evaluated of course.

wiz commented 4 years ago

I agree that we should avoid new deps for the security risk. Bitcoin uses BDB and it would be perfect for us too if there is a Java equivalent.

chimp1984 commented 4 years ago

Migrating to any DB solution will be a pretty large task. I recommend to take the more low hanging fruits first, specially as those might remove many of the main reasons for the need of a DB. E.g. We don't need to keep 15 MB of trade statistics (goes back to 4 years old data). Loading historical data only on demand would reduce that issue by 95%.

DB will not solve magically all problems. If you do not keep tyhe data in RAM you will move to a async model which can become very complex when it comes to the DAO handling of the DaoState, which will become a major issue as it is alreayd 75 MB. But solving that is a pretty difficult and high risk task. Maybe a DB will help but maybe it will require a asynchronous (multithreaded) setup which adds a lot of risks and complexity....

Just rough thoughts, I am not against it, just want to keep expectations in reasonable limits and point out possible problems.

Also what was discussed above regarding lost payout tx data: That is very likely not related at all, as pending trades data are small. It might be that we do not update/write the data at the moment when the payout tx is set, and if the user shut down quickly or has a crash the data might be lost. Is likely a one-liner to fix if one investigate (maybe I will do if I find time).

chimp1984 commented 4 years ago

Just looked at trade persistence: it is persisted with a 50 ms delay. Does not sound like the issue but as the write is happing in a new thread the real delay can be significantly longer if high CPU load happens.

freimair commented 4 years ago

@chimp1984 please keep in mind that not everyone has a powerful dev machine to use and trade on. The current storage solution reaches is limits now and our requirements grow faster as technology can keep up.

This project is a massive one, that I agree on. High risk and everything. However, it also addresses a lot of issue we see regularly nowadays. We should at least get started. One step at a time.

chimp1984 commented 3 years ago

@ripcurlx @cbeams Can we close that project? We have rewritten and improved persistence and as we only read once at startup there is no use for a database solution. Also it would add a huge dependeny tail and with it increase the security attack surface.

cbeams commented 3 years ago

I did not track this project, but given the inactivity and your comment above, I'll err on the side of closing it as rejected.