ryan-self / exec-plan

A Node.js module to run child process commands synchronously.
MIT License

Good for Server / Stream Output? #3

Closed chriscantu closed 11 years ago

chriscantu commented 11 years ago

Hello Ryan,

I was doing some googling and found your library. Nice work!

I had two quick questions for you about your project.

  1. Would it be good for using on long running processes in a server environment like expressjs?
  2. Is it possible to stream the stdout of a command as it executes?

ryan-self commented 11 years ago

Hi Chris,

Thanks for checking out the library!

I'll start with your second question: "Is it possible to stream the stdout of a command as it executes?"

Yes, it is possible. By default, the library writes a command's output to stdout as soon as it is ready, and writes to stderr when an error occurs. This is configurable if you want to make it silent. To see the configuration options, check out the "Exec Plan API > Configuration" section of this repo's homepage. At some point, I should start writing a wiki to make the documentation more accessible :)
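For illustration, here is a minimal sketch of what "streaming stdout as a command executes" looks like with node's core `child_process.spawn`. It does not use exec-plan's own API; `streamCommand` and `onChunk` are hypothetical names invented for this example.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper (not part of exec-plan): runs a command and hands each
// chunk of stdout to onChunk as soon as the child process produces it.
function streamCommand(command, args, onChunk) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);
    child.stdout.setEncoding('utf8');
    child.stdout.on('data', onChunk);
    // Forward the child's stderr to this process's stderr.
    child.stderr.on('data', (chunk) => process.stderr.write(chunk));
    child.on('error', reject);
    // 'close' fires after the child has exited and its stdio streams are closed.
    child.on('close', (code) => resolve(code));
  });
}

// Usage: run a small node script and echo its output as it arrives.
streamCommand(process.execPath, ['-e', 'console.log("hello")'], (chunk) => {
  process.stdout.write(chunk);
});
```

Because the callback receives chunks rather than lines, a consumer that needs whole lines would have to buffer and split them itself.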

Regarding your first question: "Would it be good for using on long running processes in a server environment like expressjs?"

This is a more involved question. The short answer is "it depends", but that may evolve into a solid "yes" over the coming years. Before I explain, please keep in mind that I am not yet an expert on node.js performance in a production environment, and I have been working mainly with Java over the past year on my main production-level projects. My node.js projects have mainly been fun side projects thus far.

My library only uses the non-blocking API exposed by node.js for spawning child processes. This means that processes you start using my library will not block the main request / response thread that expressjs does its routing on. As long as you don't block that main request thread, you can take advantage of node's ability to scale the number of simultaneous users automatically.
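As a rough sketch of the non-blocking behaviour described above, using node's core `child_process.spawn` rather than exec-plan itself: the call to `spawn` returns immediately, and the event loop keeps handling other work while the child runs. The `events` log and the zero-delay timer are stand-ins invented for this example.

```javascript
const { spawn } = require('child_process');

const events = [];

// Start a child process that takes about half a second to finish.
const child = spawn(process.execPath, ['-e', 'setTimeout(() => {}, 500)']);
events.push('spawn returned');

// This timer stands in for other incoming requests. It fires while the
// child is still running, because spawn did not block the event loop.
setTimeout(() => events.push('event loop handled other work'), 0);

const finished = new Promise((resolve) => {
  child.on('close', () => {
    events.push('child exited');
    resolve(events);
  });
});

// Print the order in which things happened.
finished.then((log) => console.log(log.join(' -> ')));
```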

An orthogonal issue is your general question about whether it's good for long-running processes in a server environment. My current understanding of node is that side issues come up when you run a lot of long-running processes. As I explained above, using my library, you won't block your main request thread; however, there may be memory concerns to take into account.

Traditional web containers (like Tomcat, Apache, etc.) keep a pool of threads available and give each request its own thread. If too many of those requests take seconds (rather than milliseconds) to complete, you get a cascading problem where a large queue of requests waits for the long-running requests to finish. With node, you don't allocate a thread per request. Instead, you take advantage of the fact that all IO is non-blocking, so you only focus on serving requests on the "request thread", and delegate any major IO-bound operations to other processes.

Getting back to the "memory" issue I mentioned above: in node you don't have the problem of needing to scale the number of simultaneous requests, but if a lot of those requests start long-running processes, you may consume a lot of memory very fast. Nothing will stop the requests from coming in (by default), and each request adds a memory footprint that may not be released for as long as the long-running process it started is still running.
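One common way to keep that memory growth in check is to cap how many long-running processes run at once and make the rest wait. The sketch below is only an illustration of that idea, not something exec-plan provides; `createLimiter` is a hypothetical helper, and each `task` is assumed to be a function that starts a process and returns a promise.

```javascript
// Hypothetical helper: runs at most `limit` async tasks at a time.
// Additional tasks wait in a first-in, first-out queue.
function createLimiter(limit) {
  let active = 0;
  const waiting = [];

  const next = () => {
    if (active >= limit || waiting.length === 0) return;
    active += 1;
    const { task, resolve, reject } = waiting.shift();
    // Promise.resolve().then(task) also catches tasks that throw synchronously.
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active -= 1;
        next(); // A slot is free again, so start the next waiting task.
      });
  };

  return (task) =>
    new Promise((resolve, reject) => {
      waiting.push({ task, resolve, reject });
      next();
    });
}

// Usage: at most two tasks run at the same time.
const limited = createLimiter(2);
const slowTask = () => new Promise((resolve) => setTimeout(() => resolve('done'), 50));
Promise.all([limited(slowTask), limited(slowTask), limited(slowTask)]).then((results) => {
  console.log(results.length + ' tasks finished');
});
```

Note that the waiting queue in this sketch is itself unbounded, so a server under heavy load would likely also want to reject new work once the queue gets too long.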

Please double-check everything I've said, because I haven't recently looked into how memory issues are being addressed for long-running processes in node. The research I did from 2010-2012 showed that node is amazing for a very large number of requests that have runtimes in the millisecond range (as opposed to seconds).

With all of that said, please keep in mind that if you are not starting up long-running processes frequently, this shouldn't affect you. If you are, you may need to consider a web container strategy other than node.js.

Please let me know if you have any further questions! :)

chriscantu commented 11 years ago

Hello Ryan,

Thank you for your detailed and thoughtful response. I too come from a Java background.

I am building a small build server for my team on nodejs, based on the Jenkins pipeline paradigm. We are using a CI/CD approach for the delivery of our projects. The primary reason for building our own is that the pipeline plugin is not very stable, and it seemed easier to write our own than to fix the pipeline plugin.

I do not anticipate a large number of requests since it will only be my team using it (4 - 5 people at a time).

As for piping out the output, we don't want to pipe it to the console but rather to a websocket to be displayed on the webpage. I took a quick look at how you implemented the logging logic. I will probably try forking your project and adding the ability to stream the stdout.
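A rough sketch of the idea, assuming node's core `child_process` and `stream` modules rather than exec-plan's internals: the child's stdout is piped into any `Writable`, and a small adapter turns a "send" function into such a stream. `runAndPipe`, `createSender`, and `send` are hypothetical names; `send` stands in for whatever function the websocket library in use provides.

```javascript
const { spawn } = require('child_process');
const { Writable } = require('stream');

// Hypothetical helper: runs a command and pipes its stdout into any Writable.
function runAndPipe(command, args, destination) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);
    // end: false keeps the destination open so later commands can reuse it.
    child.stdout.pipe(destination, { end: false });
    child.on('error', reject);
    child.on('close', (code) => resolve(code));
  });
}

// Hypothetical adapter: wraps a send(text) function in a Writable stream.
function createSender(send) {
  return new Writable({
    write(chunk, encoding, callback) {
      send(chunk.toString('utf8'));
      callback();
    },
  });
}

// Usage: collect output in an array instead of sending it over a socket.
const received = [];
runAndPipe(process.execPath, ['-e', 'console.log("build ok")'], createSender((text) => received.push(text)))
  .then(() => console.log(received.join('')));
```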

Thanks again for your response.

Chris

ryan-self commented 11 years ago

Thanks, Chris! I would be interested in merging your changes into this branch, because that sounds like a very useful feature (to support streaming to arbitrary streams) :)