Open corporatepiyush opened 5 years ago
This kind of improvement would really help with scalability, a widely known challenge with Meteor apps; I've opened another issue along similar lines: https://github.com/meteor/meteor/issues/10677
please reopen this... stale bot is a pain...
Hopefully putting the issue in a milestone will keep it from being marked as stale…
Any news about this feature ?
👍
Hi @nathanschwarz, I believe the first discussion here should be about which part of Meteor we should try this feature in first.
Meteor is a huge code base, so we would need to adopt Worker Threads in baby steps.
Maybe a good criterion would be which part could benefit most from this feature.
@filipenevola Fully agree. I believe that Meteor is one of the best platforms for microservices, and it would be good to start with multi-core support such as worker threads. The meteorhack:cluster package's strategy was good and it worked. But I think this should be treated as base support because of its importance.
@kadasais meteorhack:cluster is based on the native Node.js cluster module, I believe. It's basically forking (multi-core but single-threaded), not worker threads.
meteorhack:cluster also works on the client using web workers.
I finished implementing multi-core on the server using cluster a few days ago. It's quite straightforward.
The main downside right now is that we can't start Meteor as a serverless process.
But you can still use a single port to communicate between the processes when forking.
I've also made a serverless fork of Meteor that uses an environment flag to prevent the HTTP server from starting up.
I'm using 2 types of multi-core processes right now, which are backed by a simple MongoDB job queue:
I'm planning to make a third one for automated DB backups on an external ftp server.
@filipenevola obviously, for me, the best place to start would be on the server.
Multi-core on the server would be "relatively simple" to build with the cluster module.
We could also leverage the native worker_threads module (with shared memory built in), but it's less straightforward because it would require including the worker code in the build phase and replacing the file path in the master:
we would need new Worker('...worker_path.js') to become new Worker('...worker_path_after_build.js')
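For reference, here is a minimal sketch of the worker_threads API mentioned above. To stay self-contained it passes the worker source as a string with eval: true instead of a file path, which sidesteps the build-path question rather than solving it. The names sumWorkerSource and runInWorker, and the summing task itself, are hypothetical and only there for illustration.

```javascript
const { Worker } = require('worker_threads');

// Worker source inlined as a string, so no file path has to survive the build.
// In a real Meteor app this would more likely live in its own file.
const sumWorkerSource = `
  const { parentPort, workerData } = require('worker_threads');
  // CPU-bound stand-in: sum the integers from 1 to workerData.n.
  let sum = 0;
  for (let i = 1; i <= workerData.n; i++) sum += i;
  parentPort.postMessage(sum);
`;

// Hypothetical helper: run the inline worker once and resolve with its result.
function runInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(sumWorkerSource, { eval: true, workerData: { n } });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
    });
  });
}

runInWorker(1000).then((sum) => console.log('sum from worker:', sum));
```

The main thread stays free while the loop runs in the worker; only the final number crosses the thread boundary through postMessage.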
@nathanschwarz Could you explain a bit more what you mean by "can't start Meteor as a serverless process"? I think forking is enough to build a multi-core server, and the key is that each process works independently on a specific service. What should we do, or what would you suggest? Thanks.
Forking should be enough for a start.
Well, depending on your usage and the implementation, the workers don't need to be built with an HTTP server. Right now, forking a Meteor process starts an HTTP server each time (because Meteor starts an HTTP server). With my implementation I don't need the children to communicate with each other, so it's basically a waste of resources. That's what I meant by "serverless process".
Concerning how we should build multi-core, I think the best way to go is by implementing a worker pool:
- […] taskType: String, data: Object, priority: Number, onGoing: Boolean, createdAt: Date.
- taskType, priority, onGoing, and createdAt should be indexed.
- […] onGoing is false.
Then you can have 2 types of routine:
Either:
- […] onGoing: false, sorted by priority and createdAt.
- […] TaskMap[taskType].
- […] findOneAndUpdate to set onGoing: true.
- […] process.exit(0).
Or:
- […] WORKING || IDLE.
- […] IDLE worker, it pulls a task with onGoing: false sorted by priority and createdAt, updates it with onGoing: true, sends it to the worker, sets its status to WORKING.
- […] TaskMap[taskType].
- […] IDLE.
The first is faster to implement, but because of Meteor's heavy startup routine it will be slower and take more resources.
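The second routine can be sketched roughly as follows. This is only an illustration under several assumptions: the MongoDB collection is replaced by an in-memory array, the handlers in TaskMap are made up, a higher priority number is assumed to run first, finished tasks are assumed to be removed from the queue, and the handler runs inline where a real pool would send the task to a forked process.

```javascript
// In-memory stand-in for the MongoDB task collection.
const tasks = [];

// Hypothetical handlers keyed by taskType (the TaskMap from the description).
const TaskMap = {
  sendEmail: (data) => `email sent to ${data.to}`,
  resize: (data) => `resized ${data.file}`,
};

function addTask(taskType, data, priority = 0) {
  tasks.push({ taskType, data, priority, onGoing: false, createdAt: new Date() });
}

// Mimics findOneAndUpdate: pick the next pending task and flag it as onGoing.
// Assumption: a higher priority number runs first; ties go to the oldest task.
function pullNextTask() {
  const next = tasks
    .filter((t) => !t.onGoing)
    .sort((a, b) => b.priority - a.priority || a.createdAt - b.createdAt)[0];
  if (next) next.onGoing = true;
  return next || null;
}

const workers = [
  { id: 1, status: 'IDLE' },
  { id: 2, status: 'IDLE' },
];

// One scheduling pass: hand one pending task to every IDLE worker.
function dispatch() {
  const results = [];
  for (const worker of workers) {
    if (worker.status !== 'IDLE') continue;
    const task = pullNextTask();
    if (!task) break;
    worker.status = 'WORKING';
    // A real pool would send the task to the child process; here it runs inline.
    results.push(TaskMap[task.taskType](task.data));
    tasks.splice(tasks.indexOf(task), 1); // assumption: finished tasks are removed
    worker.status = 'IDLE';
  }
  return results;
}

addTask('resize', { file: 'a.png' }, 1);
addTask('sendEmail', { to: 'x@example.com' }, 5);
console.log(dispatch());
```

Because the workers stay alive between tasks, the startup cost is paid once per worker rather than once per task, which is the trade-off raised above.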
@kakadais @filipenevola I made a working Worker Pool package here if you want to look at it. I can put a PR together eventually if you want. I still think the package needs some tweaks for modularity, logs, and tests anyway.
You can now add the package directly from Atmosphere: meteor add nschwarz:cluster
Hi @nathanschwarz, sorry for the delay, I was on vacation, I just read your code and it looks great.
What do you mean by a PR? Do you need any changes in the core?
Are you using your package in production already?
We could promote it in the Meteor community and start to have usage in production if that is not the case yet.
@filipenevola obviously for me the best place to start would be on the server.
I was thinking about Meteor core features and not between server and client.
Which features of the Meteor core runtime or builder could benefit most from multi-core support?
For example, I was talking with @renanccastro about how we could maybe take advantage of it in the tree-shaking build analysis, analyzing the sub-trees on different cores.
@filipenevola, no worries about the delay !
Yes, I'm using it in production and it's fully working. It still lacks a few minor tweaks:
but we could add these incrementally.
No, there's no change to add to the Core as it is now. I was talking about a minor change to the starting behavior on the core:
The main downside right now is that we can't start Meteor as a serverless process.
since this package doesn't require the workers to communicate together we could pass a flag to skip the http server to avoid the waste of resources (It's a few lines of code in the core, but it's not that important).
I can make an abstraction of the worker pool if you wish for the tree-shaking feature since it's bound to mongoDB via the TaskQueue right now.
@nathanschwarz great. Yes, these are nice to have but I believe your package is complete enough already. How could we promote it? A blog post in our official blog?
since this package doesn't require the workers to communicate together we could pass a flag to skip the http server to avoid the waste of resources
I'm ok having this flag to avoid http server starting up, feel free to start a PR.
I can make an abstraction of the worker pool if you wish for the tree-shaking feature since it's bound to mongoDB via the TaskQueue right now.
I believe this is a good idea. We could have a mode option in your package: in-memory and persistent (your current version with MongoDB). Is that your idea?
@filipenevola great !
A blog post would be nice 👍 .
I'm ok having this flag to avoid http server starting up, feel free to start a PR.
I'll work on a PR soon, I will tag you when it's done.
we could have a mode option in your package
I was thinking of adding an optional inMemory: Boolean field to the TaskQueue.addTask prototype (defaulting to false).
This way you can have both persistent and in-memory jobs with the same Cluster instance!
I'll work on it ASAP.
@filipenevola I've just updated nschwarz:cluster.
The in-memory jobs are working, and an inMemoryOnly field is settable in the Cluster options.
It should do the trick !
Tell me if you encounter any issue.
@filipenevola, my bad, no PR required.
I thought that the cluster module would provide a socket / file stream between the master and the children for the IPC, but that's not the case, so the HTTP server is still needed.
I've added eventListeners in 1.1.0, so you can handle the results in the master process if needed.
I'll try to find a solution to get a random free port number for the IPC to avoid potential conflicts between multiple apps running / building at the same time.
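One common way to get a free port in Node.js is to listen on port 0 and let the OS pick one. The sketch below shows that idea only; getFreePort is a hypothetical helper name, not something taken from the package.

```javascript
const net = require('net');

// Ask the OS for an unused TCP port by listening on port 0, then release it.
// Caveat: another process could grab the port between close() and its reuse.
function getFreePort() {
  return new Promise((resolve, reject) => {
    const server = net.createServer();
    server.once('error', reject);
    server.listen(0, '127.0.0.1', () => {
      const { port } = server.address();
      server.close(() => resolve(port));
    });
  });
}

getFreePort().then((port) => console.log('free port:', port));
```

If the small race window matters, an alternative is to keep the server that was bound to port 0 and use it directly instead of closing it and reopening the port later.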
That is a shared-nothing architecture at the OS level, if we are considering process instances instead of worker threads.
Hi @nathanschwarz, this is really great. Thank you.
Please ping me on Community Slack so we can work together on the blog post.
Node.js v12 supports worker_threads with SharedArrayBuffer.
A SharedArrayBuffer can serve as a common data cache across the worker_threads, so one instance can use all CPU cores of the machine instead of always relying on a single-core instance or on multiple Docker instances/pods.
Reference: https://nodejs.org/api/worker_threads.html#worker_threads_worker_threads
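As a rough illustration of that idea, the sketch below shares one counter between several worker threads through a SharedArrayBuffer and updates it with Atomics. The names counterWorkerSource and countInParallel are hypothetical, and the worker source is inlined with eval: true only to keep the example in one file.

```javascript
const { Worker } = require('worker_threads');

// Worker body, inlined with eval: true to keep the sketch in one file.
const counterWorkerSource = `
  const { workerData } = require('worker_threads');
  const counter = new Int32Array(workerData.shared);
  for (let i = 0; i < workerData.increments; i++) {
    Atomics.add(counter, 0, 1); // atomic, so concurrent workers do not lose updates
  }
`;

// Hypothetical helper: start workerCount threads that all increment one shared counter.
function countInParallel(workerCount, increments) {
  // The buffer is shared with the workers, not copied, when passed as workerData.
  const shared = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT);
  const counter = new Int32Array(shared);
  const done = Array.from({ length: workerCount }, () =>
    new Promise((resolve, reject) => {
      const worker = new Worker(counterWorkerSource, {
        eval: true,
        workerData: { shared, increments },
      });
      worker.once('error', reject);
      worker.once('exit', resolve);
    })
  );
  return Promise.all(done).then(() => Atomics.load(counter, 0));
}

countInParallel(4, 1000).then((total) => console.log('total:', total));
```

A real cache would need a layout for keys and values on top of the raw buffer; the counter only shows that all threads see the same memory.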