Closed KesleyDavid closed 2 years ago
Hi Kesley, good evening at this side of the globe!
Using multiple CPUs is definitely possible using Node.js clustering or pm2, but you shouldn't have to if you add indexes. With one or more indexes, you'll be able to get query results in milliseconds because the database won't have to plow through all records to check if they match your criteria. I'm happy to help you with any troubles you run into using multiple threads, but I think you should check out indexing first. Info about indexes can be found here. Let me know if you need more info, or if you use indexes already.
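To make the indexing suggestion concrete, here is a minimal sketch of creating an index and querying on the indexed field. It assumes the `db.indexes.create(path, key)` and `db.query(path).filter(...).get()` calls as described in the AceBase documentation; the `orders` collection, the `status` field, and the helper names `ensureOrderIndex` and `findOrdersByStatus` are hypothetical and only there for illustration.

```javascript
// Sketch only: `db` is expected to be an AceBase or AceBaseClient instance.
// The 'orders' collection and 'status' field are hypothetical names.

// Create an index on the `status` property of every child of 'orders', so
// queries filtering on `status` can use the index instead of checking
// every record in the collection.
async function ensureOrderIndex(db) {
  return db.indexes.create('orders', 'status');
}

// Query the collection on the indexed field and return plain values.
async function findOrdersByStatus(db, status) {
  const snapshots = await db
    .query('orders')
    .filter('status', '==', status)
    .get();
  return snapshots.map((snap) => snap.val());
}
```

Existing queries keep their shape with this approach; the index only has to exist for the fields they filter or sort on.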
Good morning, thanks for the information. Creating indices would indeed be the most suitable approach, but we are migrating from Firebase to AceBase, and due to time constraints I just "converted" the system's searches to AceBase. To create the indices, I will need to refactor all queries and change the way the data is grouped in the system.
I'm thinking about creating the cluster first, so that the system works. Over time, I'll refactor each piece of code and add indices to gain performance.
I'm getting an error when creating the cluster with pm2; the configuration, code and logs follow. Do you know what it could be?
module.exports = {
  apps: [{
    name: 'server-database',
    script: 'server-database.js',
    watch: true,
    exec_mode: 'cluster',
    instances: '2',
    ignore_watch: ["node_modules"],
  }],
};
const { AceBaseServer } = require('acebase-server');
const server = new AceBaseServer('dbREX', {
  host: 'localhost',
  port: 5757,
  authentication: {
    enabled: true,
    allowUserSignup: false,
    defaultAccessRule: 'auth',
  }
});
server.on("ready", () => {
  console.log("SERVER ready");
});
Code:
pm2 start ecosystem.config.js
LOGs:
CLIENT:
const { AceBaseClient } = require('acebase-client');
const ACEBASE = new AceBaseClient({ host: 'localhost', port: 5757, dbname: 'dbREX', https: false });
ACEBASE.ready(() => {
  console.log('Connected successfully');
});
I did some additional research: you can't use pm2 for clustering at the moment. In a pm2 cluster there is no master process (pm2 itself is the master), so the IPC between the processes doesn't work. The different processes have to be able to "talk" to each other because they read and write to the same database file, and accessing the data independently would cause corruption. You can use Node.js's native clustering functionality, but you'll have to fork the child processes yourself.
I strongly encourage you to take a look at indexing first. I'm working on a solution for pm2.
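As a rough illustration of forking the child processes yourself, here is a minimal sketch built on Node.js's built-in `cluster` module. The `runInCluster` helper is hypothetical, and the AceBase-specific part is left to the `startWorker` callback. Whether the primary process also has to open the database for the workers' IPC to function is an assumption to verify against the AceBase documentation.

```javascript
const cluster = require('cluster');
const os = require('os');

// Forks `workerCount` child processes from the primary process and runs
// `startWorker` in each child. Returns 'primary' or 'worker' so the caller
// can tell which role the current process has.
function runInCluster(startWorker, workerCount = os.cpus().length) {
  // `isPrimary` exists on newer Node.js versions, `isMaster` on older ones.
  const isPrimary = cluster.isPrimary ?? cluster.isMaster;
  if (isPrimary) {
    for (let i = 0; i < workerCount; i++) {
      cluster.fork();
    }
    // Log worker exits; restart logic is deliberately left out of the sketch.
    cluster.on('exit', (worker, code) => {
      console.log(`worker ${worker.process.pid} exited with code ${code}`);
    });
    return 'primary';
  }
  startWorker();
  return 'worker';
}

// Usage (not executed here): each worker starts its own server instance,
// for example the AceBaseServer setup from the post above:
//   runInCluster(() => {
//     const { AceBaseServer } = require('acebase-server');
//     new AceBaseServer('dbREX', { host: 'localhost', port: 5757 });
//   }, 2);
```

Because the forking is done by your own primary process rather than by pm2, the workers have a parent they can exchange IPC messages with, which is the part that is missing in a pm2 cluster.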
Thanks for the explanations. I started to create some indexes in the database, and the performance gain was really high. However, on the dashboard screen, for example, the entire server still crashes and other users cannot access data, because the main process is fully occupying the CPU. I believe a cluster with pm2 would be the most suitable in the current situation.
I understand your need for multiple processes, the only way to do that now is by bypassing pm2 and forking the process yourself. I'm working on an IPC implementation that will use an external server for communication between isolated (pm2) processes, but that obviously will take some time.
The very best you can do now is investigate what is causing that extreme load on your server process. Is your dashboard requesting too much data, too frequently? For example, if you are using `value` events on large data trees, please keep in mind that every change anywhere in the tree will require the database to load all data being monitored in order to provide new/previous value pairs for the events to fire. For example, if you have a `db.ref('orders').on('value', callback)`, each tiny mutation to any order (such as `db.ref('orders/order553/shipped').set(true)`) will cause ALL orders to be loaded from the db, previous/new value pairs to be created for them, and the result to be sent over the network. For monitoring data changes in large collections it's better to use `notify_value`, `mutated` or `mutations` events. Let me know if you think the load has other or unexpected causes; I'm happy to investigate and help.
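A minimal sketch of the alternative described above: subscribing to `mutated` instead of `value` on a large collection. It assumes, based on the AceBase documentation, that the `mutated` callback receives a snapshot of the exact node that changed, exposing `ref.path`, `val()` and `previous()`. The `watchOrderChanges` helper name is hypothetical.

```javascript
// Sketch only: `db` is expected to be an AceBase or AceBaseClient instance.
// Subscribes to 'mutated' on the orders collection, so each callback carries
// only the node that changed instead of the full collection.
function watchOrderChanges(db, onChange) {
  return db.ref('orders').on('mutated', (snap) => {
    onChange({
      path: snap.ref.path,       // e.g. 'orders/order553/shipped'
      value: snap.val(),         // new value of the mutated node
      previous: snap.previous(), // value before the change
    });
  });
}
```

With this subscription, a change such as `db.ref('orders/order553/shipped').set(true)` should produce a single small notification for that property rather than a reload of every order.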
I've been working on the clustering functionality, and it is now possible to create pm2 and cloud-based AceBaseServer clusters! Check out the new AceBase IPC server repository; its documentation describes in detail how to set up a cluster of AceBase servers. All of it is obviously brand spanking new and might have issues, so I would really appreciate it if you'd want to help with testing!
@appy-one Good morning, very good, and thank you for the initiative. I'm going to start testing, and I hope to be able to contribute to the project as well.
@appy-one, I am encountering some issues when testing the cluster. Even with these errors, I can query the data in the database, but apparently something is wrong with the cluster. Do you know what it could be? I created a repository with the entire environment for us to test.
Repository: https://github.com/KesleyDavid/test-db-acebase-cluster
Error:
Ah, I see. I didn't test the websocket connection and assumed that would just work. (🙈) I googled and apparently socket.io is not able to connect to a pm2 clustered server using long-polling. You could disable long-polling on the client so it'll only try connecting through websockets, but then there might be other state-related issues. The websocket between client and server is mainly used for event notifications, and that might become an issue in such a cluster. I'll dive into it!
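For reference, disabling long-polling on a plain socket.io client is done with the `transports` option. The sketch below takes the `io` function exported by `socket.io-client` as a parameter so it stays self-contained; the `connectWebsocketOnly` helper name and the URL are hypothetical. How AceBaseClient configures its own socket internally is not shown here.

```javascript
// `io` is the function exported by the socket.io-client package; it is
// passed in so this sketch has no hard dependency on the package.
function connectWebsocketOnly(io, url) {
  // Restricting transports to 'websocket' skips HTTP long-polling, which
  // relies on successive requests reaching the same server process.
  return io(url, { transports: ['websocket'] });
}

// Usage (not executed here):
//   const { io } = require('socket.io-client');
//   const socket = connectWebsocketOnly(io, 'http://localhost:5757');
```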
I get it now, it's really too much detail to get the database into a working cluster.
I'm also researching and doing some tests
I updated acebase-client (v1.10.1), it now only uses websocket transport to connect to the server (disabled long-polling). This should fix the connection issue, and as far as I can see will not have any negative side-effects.
A websocket connection from client to server will stay connected to the same server process so there should be no issues with state. Data updates and retrieval use regular http requests so they will be load balanced in your cluster; event notifications are sent over the websocket connection, and their subscriptions are state-managed within the AceBase cluster through IPC so this should also work ok.
Let me know if this fixes it!
Thank you, I will perform the tests.
Good night, the workload is being balanced evenly by request to the database. I'll run more tests over the weekend and update the results here.
The cluster apparently works perfectly; however, I have a problem creating indices in the database. I run the command as an admin user of the database, but nothing happens and the index is not created.
I tested a table with several items and it didn't work. Then I tested it on a table with only one item, and even so the code stops when I try to add an index
Thanks, I'll test the indexing when running in a cluster
I published the indexing issue fix in acebase v1.12.3
Good morning, I'm implementing a system with the acebase database, but we're facing a problem where the system makes several general listing queries, and some of these collections have 26 thousand records.
We put it on a server with 10 vCPUs; however, we observed that the database only uses 1 vCPU at a time. We are using pm2. We tried to put it in cluster mode, but without success.
Is there any solution or alternative for us to take advantage of all the server's vCPU? Because with only 1vCPU, when we make two or three large simultaneous requests, the database crashes for a long time.