anyproto / any-sync-coordinator

Implementation of coordinator node from any-sync protocol
https://anytype.io
MIT License

loose coupling MongoDB #80

Open hellodword opened 3 weeks ago

hellodword commented 3 weeks ago

Have you read a contributing guide?

Clear and concise description of the problem

https://github.com/anyproto/any-sync-coordinator/blob/1aec721048911476eb07f96e5451b6dc2fdb9161/db/db.go#L19-L20

mongo is used directly in many parts of the project's source code outside the db package:

https://github.com/search?q=repo%3Aanyproto%2Fany-sync-coordinator+%2Fmongo%2F+language%3AGo+NOT+path%3A%2F%5Edb%5C%2F%2F&type=code

Suggested solution

The consensusnode handles this much better: https://github.com/anyproto/any-sync-consensusnode/blob/40bce1dbff2598befc37ad0c5a8fc16094f6a68d/db/db.go#L27-L40

I believe it is better to extend the interface and abstract mongo within the database layer. This would make it easier for both users and developers to add and use other databases.

For example, in my expected use case, a personal self-hosted environment, mongo is too heavy. It would be great if we could add a lightweight or embedded database instead.

$ docker compose stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}"
NAME                                                 MEM USAGE
any-sync-dockercompose-netcheck-1                    19.05MiB
any-sync-dockercompose-any-sync-node-1-1             31.32MiB
any-sync-dockercompose-any-sync-filenode-1           43.5MiB
any-sync-dockercompose-any-sync-consensusnode-1      33.27MiB
any-sync-dockercompose-any-sync-coordinator-1        35.17MiB
any-sync-dockercompose-generateconfig-processing-1   20.54MiB
any-sync-dockercompose-generateconfig-anyconf-1      25.71MiB
any-sync-dockercompose-any-sync-node-3-1             31.77MiB
any-sync-dockercompose-any-sync-node-2-1             32MiB
any-sync-dockercompose-minio-1                       183.1MiB
any-sync-dockercompose-mongo-1-1                     308.5MiB
any-sync-dockercompose-redis-1                       26.3MiB
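To put the numbers above in proportion, here is a small sketch that sums the listed memory figures and computes each container's share. The values are copied from the output above; the `containerMem` type and the helper names are made up for this example, and every listed row is counted, including the one-shot generateconfig containers.

```go
package main

import "fmt"

// containerMem holds one row of the stats output (memory in MiB).
type containerMem struct {
	name string
	mib  float64
}

// usage is copied from the `docker compose stats` output above.
var usage = []containerMem{
	{"any-sync-dockercompose-netcheck-1", 19.05},
	{"any-sync-dockercompose-any-sync-node-1-1", 31.32},
	{"any-sync-dockercompose-any-sync-filenode-1", 43.5},
	{"any-sync-dockercompose-any-sync-consensusnode-1", 33.27},
	{"any-sync-dockercompose-any-sync-coordinator-1", 35.17},
	{"any-sync-dockercompose-generateconfig-processing-1", 20.54},
	{"any-sync-dockercompose-generateconfig-anyconf-1", 25.71},
	{"any-sync-dockercompose-any-sync-node-3-1", 31.77},
	{"any-sync-dockercompose-any-sync-node-2-1", 32},
	{"any-sync-dockercompose-minio-1", 183.1},
	{"any-sync-dockercompose-mongo-1-1", 308.5},
	{"any-sync-dockercompose-redis-1", 26.3},
}

// totalMiB returns the summed memory usage of all containers.
func totalMiB(cs []containerMem) float64 {
	var sum float64
	for _, c := range cs {
		sum += c.mib
	}
	return sum
}

// shareOf returns the fraction (0..1) of total memory used by the named
// container, or 0 if the name is not present or the total is zero.
func shareOf(cs []containerMem, name string) float64 {
	total := totalMiB(cs)
	if total == 0 {
		return 0
	}
	for _, c := range cs {
		if c.name == name {
			return c.mib / total
		}
	}
	return 0
}

func main() {
	fmt.Printf("total: %.2f MiB\n", totalMiB(usage))
	for _, name := range []string{
		"any-sync-dockercompose-mongo-1-1",
		"any-sync-dockercompose-minio-1",
	} {
		fmt.Printf("%s: %.1f%%\n", name, 100*shareOf(usage, name))
	}
}
```

On these figures, mongo alone accounts for roughly 39% of the memory used by the whole Compose setup, and mongo plus minio for a bit over 60%.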

Alternative

No response

Additional context

No response

hellodword commented 2 weeks ago

I noticed that anyproto is working on a MongoDB alternative: https://github.com/anyproto/any-store

> Any Store is a document-oriented database with a MongoDB-like query language but uses JSON instead of BSON. It is built on top of SQLite and fastjson. The database supports transactions and indexes.

I'm not sure about the specific goals, but I think we can take a look at FerretDB. It's a MongoDB alternative that uses PostgreSQL or SQLite as a database engine and is licensed under Apache-2.0. Additionally, there is lungo, a MongoDB-compatible embeddable database.

FerretDB is also embeddable: https://github.com/FerretDB/embedded-example

cheggaaa commented 2 weeks ago

https://github.com/anyproto/any-store is not a replacement for MongoDB; it's planned to be mostly a client-side DB.

hellodword commented 1 week ago

> https://github.com/anyproto/any-store is not a replacement for MongoDB; it's planned to be mostly a client-side DB.

Cool, what do you think about https://github.com/FerretDB/embedded-example?

It's embeddable, built on top of SQLite, MIT licensed, well tested, and has had stable APIs for more than a year: https://github.com/FerretDB/FerretDB/releases/tag/v1.0.0

almereyda commented 1 week ago

I was thinking of test-driving FerretDB as a drop-in replacement for MongoDB in the full Compose setup. I will report back on the outcome of this experiment.

hellodword commented 1 week ago

> I was thinking of test-driving FerretDB as a drop-in replacement for MongoDB in the full Compose setup. I will report back on the outcome of this experiment.

Actually, I tried it with any-sync*, and some mongo APIs were missing. But I hope I was just using it the wrong way. Good luck!

And lungo works great: https://github.com/hellodword/anytype-all/blob/master/patches/any-sync-coordinator-v0.3.25.patch

cheggaaa commented 1 week ago

> > https://github.com/anyproto/any-store is not a replacement for MongoDB; it's planned to be mostly a client-side DB.
>
> Cool, what do you think about https://github.com/FerretDB/embedded-example?
>
> It's embeddable, built on top of SQLite, MIT licensed, well tested, and has had stable APIs for more than a year: https://github.com/FerretDB/FerretDB/releases/tag/v1.0.0

FerretDB seems like a good choice if your server side is mostly based on PostgreSQL but you need to adopt some parts that require MongoDB. The Consensus and Coordinator nodes were written and tested with MongoDB as the database. The consensus node uses MongoDB replication and some atomic features, and it has been well tested, including with fuzz tests.

In my opinion, the best solution is to write a single self-hosted node as a separate project, without dependencies like S3 and MongoDB. For personal usage, strong replication or scalability is not necessary; what's important is small resource consumption and simplicity. This isn't planned yet, but we're considering it, something like an all-in-one "light node".

Regarding any-store versus FerretDB on the client side: Any-store is not an embedded MongoDB; it just has a similar query language. It supports the custom sorting and filtering needed for Anytype, and offers more powerful transactions. It's faster because it doesn't go through a heavy chain of mongo-driver -> ferret converter -> golang SQL driver -> sqlite3. Above all, it's about customization. We need fast and reliable storage that can work with many features of Anytype. MongoDB functionality alone is not enough.

hellodword commented 1 week ago

> all-in-one "light node"

Love this idea! Please let me know if you go ahead with that plan. I was thinking about this too, and it's what prompted this issue. Currently the default any-sync* repos are a bit too heavy for personal self-hosting.

> Any-store is not an embedded MongoDB it's about customization

Got that, thanks.