Closed. coreation closed this issue 10 years ago.
I'm in favor of changing {joburi}/run into a command-line command.
So using an exec to jumpstart the EML? Basically the same result as with threading, only now you'll have to pass parameters with it and make a new script file that interprets these parameters and starts up the Input:
e.g. joburi = win-events: exec("php jumpstart.php win-events"); return ??? (this status code should probably be 200, but it perhaps needs more documentation, as we don't know whether the job will finish correctly).
In jumpstart.php there should be something like this (pseudocode):
<?php
// Pseudocode: the job name is the first CLI argument.
$job = fetchJob($argv[1]);
$input = new Input($job);
$input->execute();
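The trigger side of this idea could look roughly like the sketch below. It is only an illustration: `buildJumpstartCommand` and `launchJob` are hypothetical helper names, and the output redirection with a trailing `&` assumes a Unix-like host. The point is that `exec` returns right away instead of waiting for the job, so the caller only learns that the job was started, not that it finished.

```php
<?php
// Hypothetical helper: build the shell command that starts one job in the background.
function buildJumpstartCommand($jobName, $script = 'jumpstart.php')
{
    // escapeshellarg() keeps a job name taken from a URI from injecting shell syntax.
    return 'php ' . escapeshellarg($script) . ' ' . escapeshellarg($jobName)
        . ' > /dev/null 2>&1 &';
}

// Hypothetical helper: fire and forget (assumes a Unix-like shell).
function launchJob($jobName)
{
    // Redirecting output and appending "&" lets exec() return without waiting.
    exec(buildJumpstartCommand($jobName));
}

// Show the command that would be run for the example job from this thread.
echo buildJumpstartCommand('win-events'), PHP_EOL;
```

If the trigger stays behind HTTP, 202 Accepted is the status HTTP defines for a request that has been accepted but not yet completed, which may be a candidate for the ??? above.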
I'm in favor of both (threading or exec); I just got excited when I saw a fresh project blooming that provides PHP with threads. Last year these libraries weren't there.
I was thinking about just using the job-uri as identification but not as a trigger anymore. In TDTInput I'd delete the Controllers and I would just allow for a command line script to be able to launch it.
In the end, we have always launched it from the command line before. Running this job in the browser didn't make a lot of sense in the past, I believe.
Agreeing on the CLI-only part, just saying you have to pass the job identifier, by passing its name to the script, so that the script knows which ETML job it has to start. I disagree on removing the controllers, because how are you going to see which jobs are configured, and how? This also breaks with the notion that you GET what you PUT, does it not?
Only the controller to execute the job I meant.
That's the same controller.... ;) We just delete the part that says "when you pass /run or /test we run the job." Thoughts on the job now putting its logs into a file after every chunk, using the logging directory configured in general.json? I wouldn't pass the entire log to the CLI anymore; it makes you wait until everything is finished, making early error detection impossible and development cycles hell sometimes.
Yes ;)
Ok, this closes the discussion for now, I'll start implementing it on a different branch.
Present in the Blackwell branch, soon to be pushed to master.
Problem
Currently E(T)ML processes are run in one go along with the job call. Great, but not scalable: a cronjob will call the job URI from time to time to run its EML and wait for it to finish, meaning the Apache idle timeout has to be raised.
Solution
Threading (yes, threading, not asynchronous execs). Async execs were the obvious route because threading in PHP used to be quite a low-level DIY thing. However, https://github.com/krakjoe/pthreads apparently has the solution. So my solution is to make the Input class extend the Thread class and run it as a thread.
In a next stage we should have some sort of "status" page, or just add a status to the job that holds the state of the current job (e.g. running, sleeping, ...). Last but not least, this links perfectly with https://github.com/tdt/input/issues/53, where the logging can now be done to a file (for every chunk, for example), so that this logging not only exists permanently (or not, still open for debate) but is also user friendly (see issue 53 for more explanation).
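A minimal sketch of the per-chunk file logging mentioned above, assuming the logging directory has already been read from general.json (the key name is not shown in this thread, so the directory is simply passed in). `appendChunkLog` and the `<job>.log` file naming are hypothetical, not part of the existing code.

```php
<?php
// Hypothetical helper: append one chunk's log lines to <logDir>/<jobName>.log.
function appendChunkLog($logDir, $jobName, array $lines)
{
    $file = rtrim($logDir, '/') . '/' . $jobName . '.log';
    $text = '';
    foreach ($lines as $line) {
        // Prefix every line with an ISO 8601 timestamp.
        $text .= date('c') . ' ' . $line . PHP_EOL;
    }
    // FILE_APPEND keeps earlier chunks; LOCK_EX avoids interleaved writes.
    file_put_contents($file, $text, FILE_APPEND | LOCK_EX);
    return $file;
}

// Usage: write after every chunk so the file can be followed while the job runs.
$logDir = sys_get_temp_dir(); // assumption: stands in for the configured directory
appendChunkLog($logDir, 'win-events', array('chunk 1: 100 rows loaded'));
appendChunkLog($logDir, 'win-events', array('chunk 2: 1 row failed'));
```

Because each chunk is flushed to disk as soon as it is processed, an early failure shows up in the file straight away instead of only after the whole job has finished.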