sloki-project / sloki

NodeJS in-memory documents database, based on lokijs
MIT License

progression status #1

Open botzill opened 5 years ago

botzill commented 5 years ago

Hi.

Sorry to open this as an issue, but I want to ask what the status of the project is, and when you think it will be ready for some production tests?

franck34 commented 5 years ago

Hi !

Thank you !

First ticket opened, you win a beer :D ! I'll keep this ticket open for people who want to ask the same question ;)

So, right now, it's "IN PROGRESS" as you can read in the README.md. Everything can change in the code.

I'm not able to give you a timeline. But we are not talking about months.

We are talking about days or weeks, because I NEED this layer on top of LokiJS, ASAP.

Quickly, what I have in mind:

  1. Right now, I need to separate "commands" into another npm package, so we can have a standalone client library. I'll start this ASAP. In a few hours. Now.

  2. At the same time, I'll probably rename this project to something like sloki (as in Server for LokiJS). See the next point to know why.

  3. Split the current code into 3 npm packages:

    • sloki for the server,
    • sloki-shared for the common library between client and server
    • sloki-client-nodejs for a first client library (in the future, we can imagine a golang client, python, whatever ..., sloki-client-go, sloki-client-python ...)
  4. In terms of compatibility with LokiJS features, everything is easy to implement EXCEPT functions which require callbacks (filters ...); perhaps the current approach has a limit that prevents implementing these features.

  5. I need pubsub stuff to monitor changes (only TCP/TLS clients will have this, or perhaps websocket for HTTP(S) clients). I need to go deeper into this.

In the background, when 3 is done, I'll try to speak with @techfort (creator of LokiJS) to hear what he thinks about all that.

PRs are not suggested at this time, it's too early.

The goal is to have a fully implemented

Not sure about an HTTP(S) server and client, for performance reasons. But in fact it depends on whether people need it. Thanks to the Jayson module (JSONRPC), transport layers are not too hard to implement. The main blocker is callbacks (see point 4).
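To illustrate the callback limitation mentioned in point 4, here is a minimal sketch in plain Node.js. It only shows that a function does not survive JSON serialization, while a declarative query object does. The `method` names used on the wire are hypothetical and are not taken from sloki or jayson.

```javascript
// A callback-based filter: the function is lost when the request is serialized.
const callbackRequest = { method: 'where', params: [(doc) => doc.age > 30] };
const callbackWire = JSON.stringify(callbackRequest);
// On the wire, the function has been replaced by null.

// A declarative filter: plain data, so it survives the round trip intact.
const queryRequest = { method: 'find', params: [{ age: { $gt: 30 } }] };
const queryWire = JSON.stringify(queryRequest);
const received = JSON.parse(queryWire);

console.log(callbackWire); // the filter function is gone
console.log(received);     // same shape as queryRequest
```

This suggests why query-object features are straightforward to expose over a JSON-based transport, whereas anything that takes a user function needs a different design.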

franck34 commented 5 years ago

@botzill

botzill commented 5 years ago

Hi @franck34! Thx for clarifications.

Yes, I'm a Python and DevOps guy. The reason I'm asking is that we want to use this in our project (which will be hosted on Kubernetes). I was thinking about how we can achieve this, and basically we need the layer that you want to implement.

Can you also elaborate a little on why you need this and what your use case is? Do you also need to make a cluster?

Thx and good luck!

btzsoft commented 5 years ago

Hi @franck34

It's amazing what you are working on. sloki 💯 🍺 🚀

I'm the JS guy who works with Loki. @botzill and I are working on a platform where we want to handle hundreds of thousands of query search operations per second within the Loki database.

As @botzill says, we want to scale Loki using Kubernetes.

Can I ask you if you have some benchmarks? Will it be possible to scale to multiple instances?

Thank you!

franck34 commented 5 years ago

Hi @btzsoft !

My problem, at first, is horizontal scalability. This is why I'm working on this project right now.

Vertical scalability is another problem, but it's not mine for now. I think we can talk about it if and only if the horizontal scalability problem of LokiJS is solved, because it implies sync between multiple LokiJS server instances: atomic, realtime ... it's another level and I'm not the right person to think about that. It's like implementing Redis clustering. Wow. Let's look at that later.

Can I ask you if you have some benchmarks?

I don't have any yet. As you can see in the README.md, benchmarks are marked as a TODO. But the code needs to be mature enough before doing benchmarks, and that's not the case yet. See "my mind" in this ticket's thread to understand what is currently happening in "my mind" (lol, I feel alone :D )


Will it be possible to scale to multiple instances?

I may be wrong, but perhaps you are speaking about vertical scalability. The idea behind LokiJS-Server (or sloki) is what you can read in the README.

My goal is to be able to scale processes that need LokiJS data with PM2 cluster, without having a LokiJS DB in memory for every process and fighting with sync between local/remote instances. (I'm so sorry for my bad English.)

One LokiJS-Server instance (Docker instance, lots of memory/CPU), and many clients (nodejs at first). In other words, web applications (horizontally/vertically scalable) should be clients of one LokiJS server, at first, again. But with pubsub over TCP/TLS, every instance of the web application server can be informed about changes in the Loki DB, so all instances of the same web app stay aware of them.

                                          JSONRPC (jayson)
                                         TCP|TLS

        +----------------------------+                         +----------------------------------+
        |                            |                         |          LokiJS-Server           |
        |       NodeJS Daemon        |<----------------------->|        (Local or Remote)         |
        |                            |                         |                                  |
        +----------------------------+                         |    +------------------------+    |
                                                               |    |                        |    |
        +----------------------------+                         |    |                        |    |
        |                            |                         |    |                        |    |
        |       NodeJS Daemon        |<----------------------->|    |         LokiJS         |    |
        |                            |                         |    |       (database)       |    |
        +----------------------------+                         |    |                        |    |
                                                               |    |                        |    |
        +----------------------------+                         |    +------------------------+    |
        |                            |                         |                                  |
        |           CLI              |<----------------------->|                                  |
        |                            |                         |                                  |
        +----------------------------+                         +----------------------------------+

In this ASCII chart, you can see multiple nodejs daemons (e.g. PM2 scale or whatever) and only one LokiJS-Server instance.

Perhaps I'm not clear with my English, but it's clear in my head.

Any help is welcome after step 3 (see above).
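The "one server, many clients, everyone informed about changes" idea can be sketched with Node's built-in `EventEmitter`. This is an in-process stand-in only: the class name `ChangeFeed` and the event name `change` are hypothetical, and a real setup would push these events over a TCP/TLS connection rather than call listeners directly.

```javascript
const { EventEmitter } = require('events');

// Stand-in for the single server instance that owns the data.
class ChangeFeed extends EventEmitter {
  constructor() {
    super();
    this.docs = new Map();
    this.nextId = 1;
  }

  // Store the document, then notify every subscriber about the change.
  insert(doc) {
    const stored = { ...doc, id: this.nextId++ };
    this.docs.set(stored.id, stored);
    this.emit('change', { op: 'insert', doc: stored });
    return stored;
  }
}

const server = new ChangeFeed();

// Two "clients" (e.g. two PM2 workers) subscribe to the same feed.
const seenByA = [];
const seenByB = [];
server.on('change', (event) => seenByA.push(event));
server.on('change', (event) => seenByB.push(event));

server.insert({ name: 'device-1' });
console.log(seenByA.length, seenByB.length); // both clients saw the insert
```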

franck34 commented 5 years ago

@botzill

Can you also elaborate a little on why you need this and what your use case is?

I need a super fast in-memory DB because I have been working for 20 years on Web Application Firewalls (security) and I have many ideas for an in-memory DB like LokiJS. Redis is a good key/value DB with TTL, but has no native SSL/TLS and no advanced queries. MongoDB/Elasticsearch/RethinkDB/... have SSL/TLS, but no key/value with TTL. Yes, LokiJS doesn't have key/value/TTL stuff right now, but it's so easy to implement in LokiJS that I'll probably open a PR. I like JSON files I can read, I like the FS adapters implemented in Loki, I like the concept in general, and I'm SURE that in-memory is the fastest way to handle plenty of needs (on my side).
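As a rough idea of what "key/value with TTL" means in memory, here is a minimal sketch in plain JavaScript. It does not use LokiJS or Redis, and the `TtlStore` class and its methods are hypothetical names for illustration. The clock is injectable so expiry can be demonstrated without waiting.

```javascript
// Minimal in-memory key/value store with per-key expiry (lazy eviction on read).
class TtlStore {
  constructor(now = Date.now) {
    this.now = now;            // clock function, injectable for tests
    this.entries = new Map();  // key -> { value, expiresAt }
  }

  set(key, value, ttlMs) {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: evict and report a miss
      return undefined;
    }
    return entry.value;
  }
}

// Usage with a fake clock: the entry is visible, then expires.
let fakeTime = 0;
const store = new TtlStore(() => fakeTime);
store.set('blocked-ip', '203.0.113.7', 1000);
const beforeExpiry = store.get('blocked-ip');
fakeTime = 1500;
const afterExpiry = store.get('blocked-ip');
console.log(beforeExpiry, afterExpiry);
```

Lazy eviction keeps the sketch short; a long-running server would probably also sweep expired keys periodically so unread entries do not accumulate.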

Do you also need to make a cluster?

Definitely YES.

franck34 commented 5 years ago

Starting refactor.

New URL will be https://github.com/franck34/sloki

franck34 commented 5 years ago

Hi guys

Next step: implement insert/get and run some benchmarks

franck34 commented 5 years ago

Btw I removed the HTTP transport for the moment. Not sure it makes sense, for performance reasons. Only a websocket API makes sense.

techfort commented 5 years ago

Not a constructive comment to the discussion but I just want to say I really love where this is going. Server-side was my initial vision of LokiJS, even though it became a lot more popular for in-browser and mobile usage. Kudos.

franck34 commented 5 years ago

Hello @techfort ! I'm so happy you love it :) You did a really great job with lokijs, I hope Sloki will live up to LokiJS. I dreamed about a kind of slokiMyAdmin last night lol.

franck34 commented 5 years ago

insert and get implemented. Next step is benchmarks (sloki vs lokijs vs redis vs mongo) :

botzill commented 5 years ago

Hi @franck34 great work!

I just want to let you know that we will test this soon as well. Let us know when you are ready. Plus, I want to deploy this on k8s, so I may also create a simple Helm chart to test it.

Thx again and let's make this a great tool.

franck34 commented 5 years ago

@botzill okay! On your side, can you prepare the list of lokijs functions you are currently using? (i.e. insert/get/find etc.)

btzsoft commented 5 years ago

Hi @franck34

I apologize for the delay, I was too busy to open any additional browser tabs. vscode was my browser :)

Finally, I've got what's on your mind, it's amazing. I think @botzill found you at the right moment in the right GitHub repo 🚀

I'll describe in a few words what we are working on and where Sloki is on the map. We have a desktop Electron app where all computers are connected via WebSocket to a socketcluster cluster. On the other side there is a broker which finds devices and sends data between them, so we need something very performant for querying and filtering those devices using a lot of filters, e.g. get devices from X country, sort by X, with X CPU, with X RAM, etc. Every device has its own entry in Loki. We don't need something like Redis, Elasticsearch, PostgreSQL, etc., because persisting the devices' state is not our goal. If something goes wrong, the lost device state is recoverable at reconnection. We just want very fast querying.

For us, the goal is to scale the broker, because after filtering there will be some operations which at some point will require scaling. Having a Sloki server with ~200GB RAM will be more than enough :)

I have some priority tasks right now for a couple of days, but after that I will continue the development part where we use Loki, so I'm very excited to start adapting to Sloki.

Thank you for your effort! I'll keep in touch.
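The kind of device query described above (country, minimum CPU, minimum RAM, sorted) can be sketched in plain JavaScript. This deliberately avoids any database library; the field names (`country`, `cpu`, `ramGb`) and the `findDevices` helper are assumptions made up for the example, not the actual schema.

```javascript
// Hypothetical device entries, one per connected machine.
const devices = [
  { id: 'a', country: 'FR', cpu: 8, ramGb: 16 },
  { id: 'b', country: 'FR', cpu: 2, ramGb: 4 },
  { id: 'c', country: 'MD', cpu: 16, ramGb: 64 },
  { id: 'd', country: 'FR', cpu: 4, ramGb: 32 },
];

// Filter by country and minimum resources, then sort by RAM (largest first).
function findDevices(list, { country, minCpu = 0, minRamGb = 0 }) {
  return list
    .filter((d) => d.country === country && d.cpu >= minCpu && d.ramGb >= minRamGb)
    .sort((x, y) => y.ramGb - x.ramGb);
}

const matches = findDevices(devices, { country: 'FR', minCpu: 4 });
console.log(matches.map((d) => d.id));
```

A linear scan like this is fine for a handful of devices; at larger scale an indexed in-memory store is what makes such queries fast.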

Sloki 🚀 🥇 🔍 ⏩

franck34 commented 5 years ago

Thank you @btzsoft

Do you have any idea of the Transactions (or Operations) Per Second you need (maximum)? I mean, in your current implementation with LokiJS. Are we talking about 10, 100, 1000, or 10000 operations per second? On the same machine (tcp://localhost), if you are below 5000 operations per second, the current implementation should be OK.

Is it possible for you to split TPS or OPS by lokijs call (i.e. insert a document, update, delete, get, find, filter ...)?

I've started some benchmarks and I'm disappointed (for the moment only, I mean ... perhaps my tests are not relevant). I'm playing with 500 000 inserts. Not sure the jayson layer (JSONRPC) will stay in sloki if perf stays as "bad" as now (i.e. 50% lower than redis in a basic INSERT test).
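For readers who want to reproduce an operations-per-second figure, here is a minimal timing sketch using Node's `process.hrtime.bigint()`. It measures a local array push only, as a placeholder workload; it is not the benchmark used in this thread, and the resulting number says nothing about sloki, redis, or LokiJS.

```javascript
// Run fn `iterations` times and return the measured operations per second.
function measureOps(fn, iterations) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) {
    fn(i);
  }
  // Guard against a zero elapsed time on very coarse clocks.
  const elapsedNs = Math.max(Number(process.hrtime.bigint() - start), 1);
  return iterations / (elapsedNs / 1e9);
}

// Placeholder workload: 500 000 in-process "inserts" into an array.
const inserted = [];
const opsPerSecond = measureOps((i) => inserted.push({ foo: i }), 500000);
console.log(`${Math.round(opsPerSecond)} ops/sec (in-process, no network)`);
```

Swapping the placeholder for a real client call would also require waiting for each response, which is where network and serialization costs show up.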

franck34 commented 5 years ago

Invitation to join the gitter chat to continue this conversation there rather than in a ticket

https://gitter.im/sloki-server/community

btzsoft commented 5 years ago

Hi @franck34

For now, we are OK with 5k/s, but in the coming months we will need to increase this to a much higher number of devices, about 50k.

Currently, we have only 20 devices, that's roughly (1 device =~ 10 OPS) * 20 devices = 200 OPS. We use insert when a device connects and delete when it disconnects, which happens rarely. For the remaining operations, we use find and update.

I will join the gitter right now and continue this discussion there.
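The estimate above can be written as a small calculation. It assumes that "about 50k" refers to 50 000 devices and that the rate stays at roughly 10 operations per second per device; both are assumptions, not figures confirmed in the thread.

```javascript
const OPS_PER_DEVICE = 10; // rough per-device rate quoted above

// Required throughput grows linearly with the number of connected devices.
function requiredOps(deviceCount, opsPerDevice = OPS_PER_DEVICE) {
  return deviceCount * opsPerDevice;
}

const currentLoad = requiredOps(20);     // today's 20 devices
const projectedLoad = requiredOps(50000); // assumed future fleet size
console.log({ currentLoad, projectedLoad });
```

Under those assumptions the projected load is far above the 5k/s mentioned as sufficient for now, which is why the per-operation breakdown matters.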

Thanks!

franck34 commented 5 years ago

@btzsoft Thanks.

With some optimisations (uuid/v4 was really slow in the tcp client, I've switched to https://github.com/mcollina/hyperid ), it becomes interesting.

I'm now 50% slower than redis. Before this little optimisation, I was 90% slower than redis (!)

I'm talking about INSERTing 500 000 items in parallel using only one TCP client, on localhost.

Let's say right now sloki supports 12K OPS.

Using standalone lokijs, it's something like 200K ops/sec (!!), so the loss of performance is related to the JSONRPC layer (jayson), then TCP compression (none currently, need to patch jayson), then data size (currently, JSONRPC adds let's say 20 chars per request and we can reduce that to 3 chars).

In a few hours, I'll release updated packages.

For your needs, I need to implement delete/find/update, a matter of hours. I can release that this weekend; perhaps you can start benchmarks on your side next week.
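One common way to make request ids cheap is to generate a random prefix once and append a counter, instead of generating a fresh random UUID per request. The sketch below shows that general idea with Node's built-in `crypto.randomUUID()` (available in recent Node versions). It is an illustration of the technique, not the code of hyperid or of the sloki client, and `createIdGenerator` is a hypothetical name.

```javascript
const { randomUUID } = require('crypto');

// Pay the cost of randomness once, then only increment a counter per id.
function createIdGenerator() {
  const prefix = randomUUID();
  let counter = 0;
  return () => `${prefix}/${counter++}`;
}

const nextId = createIdGenerator();
const ids = [nextId(), nextId(), nextId()];
console.log(ids);
```

Ids produced this way are unique per generator but predictable, so they suit request correlation rather than anything security-sensitive.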
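To make the data-size point concrete, this sketch compares the byte length of a JSON-RPC 2.0 request envelope with a shorter array-based framing for the same `{foo:1}` insert. The compact format (`['i', id, doc]`) is purely hypothetical and is not the format sloki uses; the exact savings depend on the method names and ids actually sent.

```javascript
const doc = { foo: 1 };

// Standard JSON-RPC 2.0 request fields: jsonrpc, method, params, id.
const jsonRpcRequest = JSON.stringify({
  jsonrpc: '2.0',
  method: 'insert',
  params: [doc],
  id: 1,
});

// Hypothetical compact framing: [short method code, id, payload].
const compactRequest = JSON.stringify(['i', 1, doc]);

const overheadBytes =
  Buffer.byteLength(jsonRpcRequest) - Buffer.byteLength(compactRequest);

console.log(jsonRpcRequest, Buffer.byteLength(jsonRpcRequest));
console.log(compactRequest, Buffer.byteLength(compactRequest));
console.log(`envelope overhead: ${overheadBytes} bytes per request`);
```

For a tiny document the envelope can be larger than the payload itself, which is why it matters when inserting hundreds of thousands of small items.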

See you on gitter ;)

franck34 commented 5 years ago

Both server and client 0.0.6 published. The client now supports promises AND callbacks.
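A client method that accepts either a callback or returns a promise is usually built with a small wrapper like the one below. This is a generic sketch of that pattern, not the sloki client's code; `supportBoth` and `fakeInsert` are hypothetical names, and `fakeInsert` stands in for a real network call.

```javascript
// Wrap a Node-style function (args..., callback) so callers may omit the
// callback and get a promise instead.
function supportBoth(fn) {
  return (...args) => {
    const last = args[args.length - 1];
    if (typeof last === 'function') {
      fn(...args); // callback style: pass everything through unchanged
      return undefined;
    }
    return new Promise((resolve, reject) => {
      fn(...args, (err, result) => (err ? reject(err) : resolve(result)));
    });
  };
}

// Stand-in for a network call: echoes the document back with an id.
const fakeInsert = (doc, callback) => callback(null, { ...doc, id: 1 });
const insert = supportBoth(fakeInsert);

// Callback style.
let callbackResult;
insert({ foo: 1 }, (err, result) => {
  callbackResult = result;
});

// Promise style.
const promiseResult = insert({ foo: 1 });
promiseResult.then((result) => console.log('promise:', result));
console.log('callback:', callbackResult);
```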

franck34 commented 5 years ago

Both server and client 0.0.9 published.

Protocols supported:

Transports:

Lots of work regarding performance. See https://github.com/sloki-project/sloki#benchmarks insert {foo:1} to compare different protocol/transport performance. I tried to do my best.

Next step:

Not usable yet.

franck34 commented 5 years ago

Docker image in progress