Open giancarlobi opened 3 years ago
Hi, I have no internet but a few things I can see from my phone. About "delta > 3s : process stopped or hang": this is not accurate. There won't be any pings while an HOCR is processing. If the process is handling an image it can take at least 60 seconds. Loops try to run on the intervals we give them, but (when not doing a fork/child process) they depend on how long each process takes to return.
So you will have to be more conservative about the time to have a reliable chart, or we will end up killing things we need. Yes, we should remove the 720 seconds and maybe give each queue more than 60 seconds (per queue worker) to deal with things.
@DiegoPino I agree, and I have to work on that. In the meantime I made some changes and added more logic to the run, checkrunning and form of the hydroponics process. All in a branch on my git here: https://github.com/giancarlobi/strawberryfield/tree/ISSUE-133. This is working and can recover from the condition where you kill the process from the command line. A lot more checks and debugging are still needed, probably tomorrow. No rush, so take a look when you are free and .. connected.
@DiegoPino I think I have to analyze how to use ReactPHP child processes, as you already pointed out to me. In my first tests with ReactPHP I already tried it here: https://github.com/giancarlobi/strawberry_runners/blob/1ce8855d792b030535fae0cca5158298ebd94817/src/Scripts/mainLoop.php#L213. Probably this will be for a next release of Archipelago; I'm not sure I'll be able to solve it before RC1.
So you will have to be more conservative about the time to have a reliable chart
@DiegoPino I changed the code to be more conservative. I assume the heartbeat is alive if the delta is less than or equal to 60s + 5s. I also updated the error messages and made stop/reset more conservative, and changed the idle timer from 60s to 80s. The full code is on the same branch as above: https://github.com/giancarlobi/strawberryfield/tree/ISSUE-133 Feel free to check when you have some time, no rush.
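The "alive if delta <= 60s + 5s" rule above could be sketched as a small pure helper. This is a minimal illustration, not the actual branch code; the function name and the grace-window parameters are mine:

```php
<?php
// Hypothetical helper: decide whether the heartbeat is still "alive"
// using the conservative 60s interval + 5s grace window discussed above.
function isHeartbeatAlive(int $lastRunTime, int $currentTime, int $interval = 60, int $grace = 5): bool {
    $delta = $currentTime - $lastRunTime;
    // Alive only while the delta stays within the interval plus the grace.
    return $delta <= ($interval + $grace);
}
```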
@DiegoPino this is related to this PR https://github.com/esmero/strawberryfield/pull/138
@DiegoPino about this: "So this one archipelago:hydroponics and one named archipelago:hydroponics-multiprocess...": isn't it simpler to have a single command with a default of 1 process per queue, and if we want multiprocess we add a parameter with the number of processes?
@giancarlobi the reason I want to allow a few options there is that I see in the future a need for different ways of processing, and having alternatives would allow other modules to override it. A single big service is good, but having a small, simple one and a bigger one feels more sustainable, since we will surely have many small institutions using this that may want fewer moving pieces (or even invisible ones).
1 process per queue == good for your new service. The old one is one process for all the queues and, to be honest, it is also good for small repositories because it's very simple to debug and manage.
@DiegoPino I understand. So the idea could be: the current archipelago:hydroponics stays fixed at 1 process per queue, while we create a new archipelago:hydroponics-multiprocess where we pass the number of processes per queue as a parameter, or something like that. Right? When you wrote "is good for your new service", did you mean the current hydroponics implemented in my PR?
Or do you prefer that archipelago:hydroponics be 1 process for all queues? Sorry, it's not clear to me.
@DiegoPino to make the ideas clearer (mainly for me) along the whole chain of calls, in this PR the chain is:
1) The main loop DRUSH archipelago:hydroponics, started from here https://github.com/esmero/strawberryfield/blob/8ac5cca8f28dce19722f2742bc91ab06455d5b04/src/StrawberryfieldHydroponicsService.php#L211 , executes this code https://github.com/esmero/strawberryfield/blob/8ac5cca8f28dce19722f2742bc91ab06455d5b04/src/Commands/HydroponicsDrushCommands.php#L39
2) For every queue, and for each element (one at a time), it calls a child process (in the background) via DRUSH archipelago:hydroqueue here https://github.com/esmero/strawberryfield/blob/8ac5cca8f28dce19722f2742bc91ab06455d5b04/src/Commands/HydroponicsDrushCommands.php#L129 , which executes this https://github.com/esmero/strawberryfield/blob/8ac5cca8f28dce19722f2742bc91ab06455d5b04/src/Commands/HydroponicsQueueProcessDrushCommands.php#L33
3) The above drush command calls this function https://github.com/esmero/strawberryfield/blob/8ac5cca8f28dce19722f2742bc91ab06455d5b04/src/StrawberryfieldHydroponicsService.php#L107
4) That function calls the correct queue worker here https://github.com/esmero/strawberryfield/blob/8ac5cca8f28dce19722f2742bc91ab06455d5b04/src/StrawberryfieldHydroponicsService.php#L124
5) Finally the queue worker processes the data and returns
In the multiprocessing case, multiple (2) DRUSH archipelago:hydroqueue child processes are called.
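The spawning step in 2) could look roughly like the sketch below, using react/child-process as discussed earlier in the thread. This is only an illustration: the queue names, the `--queue=` argument syntax, and the error handling are assumptions, not the PR's actual code.

```php
<?php
// Sketch only: spawn one background drush child per queue on a ReactPHP loop,
// as the multiprocess variant would. Requires react/child-process via Composer.
require 'vendor/autoload.php';

use React\ChildProcess\Process;
use React\EventLoop\Factory;

$loop = Factory::create();

// Queue names here are purely illustrative.
foreach (['queue_one', 'queue_two'] as $queue) {
    $child = new Process("drush archipelago:hydroqueue --queue={$queue}");
    $child->start($loop);
    // Log when each child finishes, so the parent can track completion.
    $child->on('exit', function ($code) use ($queue) {
        echo "Child for {$queue} exited with code {$code}\n";
    });
}

$loop->run();
```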
@giancarlobi I think the current (RC1) hydroponics drush is OK for the small institutions (with small corrections). No children, simply devoting partial time to each queue. For the new one (yours) your code is perfect, but I want to check/ask you something first...
- For every queue and for each element (one at time) call a child process (in background) by DRUSH archipelago:hydroqueue here
Is this correct? Each queue can run more than a single item during its process. Can we let the queue decide how many it runs and only make the process be one per queue? From what I see, you are already doing it like that: https://github.com/esmero/strawberryfield/blob/8ac5cca8f28dce19722f2742bc91ab06455d5b04/src/Commands/HydroponicsQueueProcessDrushCommands.php#L43-L45. You are letting the queue decide how much it can do in 60 seconds, right? So it's about time, not number of elements, and in that case you need to change the message, because we know how many items are left from the return value.
https://github.com/esmero/strawberryfield/blob/8ac5cca8f28dce19722f2742bc91ab06455d5b04/src/StrawberryfieldHydroponicsService.php#L177 And we can ask how many there are beforehand, too.
- 4 and 5.
Perfect. But I still think the process needs to be clearly stated as 1 per queue (the code does that) for 60 seconds each time. And that number (60) could be a setting.
You are letting the queue decide how much it can do in 60 seconds right? So its about time, not number of elements and in this case you need to change the message because we know how many items are left in the return.
@DiegoPino Not sure about the question, but I think I process only one item at a time because I'm calling processQueue($queue, 60, TRUE); where the last parameter (which I added to the function) says "single element", so in the function we go here https://github.com/esmero/strawberryfield/blob/8ac5cca8f28dce19722f2742bc91ab06455d5b04/src/StrawberryfieldHydroponicsService.php#L146 where we claim only one element and the time is used as the lease time. If I understood correctly, that means if we don't delete the item from the queue before the lease time expires, the item is requeued. Sorry if I'm not clear or I didn't understand your question ... it's near sleeping time here, so ... really few neurons alive .. take care friend
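The claim/lease cycle described above follows Drupal's standard Queue API. A minimal sketch of that pattern (the queue name and the `$queue_worker` variable are illustrative, not taken from the PR):

```php
<?php
// Sketch of the single-item claim with a 60s lease, using Drupal's Queue API.
// If the item is not deleted before the lease expires, the queue hands it
// out again on a later claimItem() call.
$queue = \Drupal::queue('some_hydroponics_queue'); // queue name is illustrative

if ($item = $queue->claimItem(60)) {
  try {
    // $queue_worker is assumed to come from the QueueWorker plugin manager.
    $queue_worker->processItem($item->data);
    // Done: remove the item before the lease expires.
    $queue->deleteItem($item);
  }
  catch (\Exception $e) {
    // Put the item back so it can be retried.
    $queue->releaseItem($item);
  }
}
```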
@DiegoPino What about a setting in the hydroponics configuration page where the user is able to select the type of processing:
@giancarlobi yes! That is what I would like. For security reasons the last one needs to be set in settings.php or provided by a module; if not, people (bad admins) could make a mess. About the multiprocess option: it conflicts a little with the fact that it processes a single item per process. Do you think we could better deal with this as time-based? Why? Because starting and closing a drush script as a child takes time and resources (a full bootstrap).
So let me now test your work. I have only 3 live archipelagos left to update (I did 4 last night).
@DiegoPino to help my memory, as discussed in the call, the next steps will be:
After this, we can look deeper into exceptions, timeouts and more.
Feel free friend to edit anything here, please.
Perfect friend! Exactly that.
@DiegoPino some notes about pcntl. Reading this https://stackoverflow.com/questions/17050799/how-to-check-pcntl-module-exists : this is what happens with my server. From code, extension_loaded('pcntl') returns false, but from the CLI, function_exists("pcntl_fork") returns true. In addition I saw this https://github.com/reactphp/event-loop/pull/195 . Also, on my archipelago the react-php child processes seem to work, so ... it's not clear to me what we need (or not) and how we should check.
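Since `extension_loaded('pcntl')` and `function_exists('pcntl_fork')` can disagree between the web SAPI and the CLI (pcntl is CLI-only, and its functions can also be disabled via `disable_functions`), one cautious approach is to require both checks to pass. The helper name below is mine:

```php
<?php
// Hypothetical helper: pcntl is only usable if the extension is loaded
// AND pcntl_fork() has not been disabled (e.g. via disable_functions).
function pcntlAvailable(): bool {
    return extension_loaded('pcntl') && function_exists('pcntl_fork');
}

// Example: the answer depends on the SAPI this runs under.
// echo PHP_SAPI, ': ', pcntlAvailable() ? 'yes' : 'no', PHP_EOL;
```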
@DiegoPino About your post "To think about: some queues need to run in sequence. Some queues may allow parallel processing.": I was thinking about this. The starting point is that the processes are inserted into the queue in order of execution/dependency. What if we assign each process (child?) a level, starting from 0, then 1, 2, ...?
So the idea is:
What do you think? Is this something like what you were thinking about?
@giancarlobi I like the idea, but I feel it's too complex. The service is already quite complex. I feel it's easier if we simply mark queues as either "able to run in parallel" or sequential. We already do a lot of logic on queue assignment and calculating which element needs which one. Actually, sequential workers are the ones in charge of creating their own children. Bringing all that logic into the process manager feels like doing it twice and adding too much complexity.
What I would suggest is to add another queue/queue worker, one for "sequential heavy processing".
Basically, leaving ONLY the one that does the OCR right now as "parallelizable"; all the rest need to run in order, one item at a time.
How? I will think about it. Probably an annotation or a public method like ->allowsParallel() that returns false for each one except the one we have for the OCR.
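The ->allowsParallel() idea could be as small as the sketch below: sequential by default, with only the OCR worker opting in. All class and method names here are hypothetical, not from the codebase:

```php
<?php
// Hypothetical sketch of the ->allowsParallel() marker: queue workers are
// sequential by default, and only the OCR worker overrides it to true.
abstract class HydroponicsQueueWorkerBase {
    public function allowsParallel(): bool {
        // Sequential by default: one process per queue, one item at a time.
        return false;
    }
}

class OcrQueueWorker extends HydroponicsQueueWorkerBase {
    public function allowsParallel(): bool {
        // The only parallelizable queue for now.
        return true;
    }
}

class HeavySequentialQueueWorker extends HydroponicsQueueWorkerBase {
    // Inherits the sequential default.
}
```

The process manager would then only need to ask each worker `allowsParallel()` to decide whether it may spawn more than one child for that queue.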
@DiegoPino I agree we should avoid complex code. The only difference between a parallelizable queue and the other ones is that in the first case we can have more than 1 process running per queue, while for the second one we limit it to 1 process per queue, is that right?
Yes, that's the only difference, I guess. If an item fails we put it back 2 times and then give up (the queue worker does that).
@DiegoPino Below are the variables and conditions I analyzed to manage the hydroponics process. I'll try to edit the code (and make a PR if I reach a good point) based on:
Variables:
intval(\Drupal::state()->get('hydroponics.heartbeat'));
Every 3s the heartbeat timer updates the stored value. delta = ($currentTime - $lastRunTime). Values:
delta < 3s : process alive
delta > 3s : process stopped or hung
(int) \Drupal::state()->get('hydroponics.queurunner_last_pid', 0);
The PID of the hydroponics process. Values:
0 : never run
N > 0 : pid of a run or running process (does not imply the process is running)
N < 0 : idle timer executed (does not imply the process stopped, i.e. the software may hang); it was running with pid = -N
posix_kill($queuerunner_pid, 0); Values:
TRUE : a process with pid $queuerunner_pid is running
FALSE : no process with pid $queuerunner_pid is running
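The posix_kill($pid, 0) probe above delivers no signal; it only reports whether the process exists (and whether we may signal it). A guarded sketch of that check (the helper name is mine):

```php
<?php
// Hypothetical helper around the posix_kill($pid, 0) liveness probe:
// signal 0 sends nothing but reports whether the process exists.
function processIsRunning(int $pid): bool {
    // Guard: invalid pids, and environments without the posix extension.
    if ($pid <= 0 || !function_exists('posix_kill')) {
        return false;
    }
    return posix_kill($pid, 0);
}
```

Note that posix_kill() can also return false for a live process owned by another user (EPERM), so a false here does not always mean "stopped".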
Timers:
[heartbeat] every 3s updates 'hydroponics.heartbeat'
[queue] every 1s checks the queue. One timer for each active queue.
[idle] every 60s checks whether all queues have been empty for 3 cycles. If so, then: set pid = -pid, cancel the timers, cancel itself (which implies the process will be closed)
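The three timers above map naturally onto ReactPHP periodic timers. A rough sketch of the heartbeat and idle timers (the idle bookkeeping is simplified to a counter, and allQueuesEmpty() is an assumed helper, not real code; the per-queue 1s timers are omitted):

```php
<?php
// Sketch only: heartbeat + idle timers on a ReactPHP event loop, using the
// state keys from this thread. Requires react/event-loop and a Drupal bootstrap.
require 'vendor/autoload.php';

use React\EventLoop\Factory;

$loop = Factory::create();
$idle_cycles = 0;

// [heartbeat] every 3s: store the current time so watchers can compute delta.
$loop->addPeriodicTimer(3, function () {
    \Drupal::state()->set('hydroponics.heartbeat', \Drupal::time()->getCurrentTime());
});

// [idle] every 60s: after 3 consecutive empty cycles, flip the stored pid
// to its negative and stop the loop (which cancels all timers).
$loop->addPeriodicTimer(60, function () use ($loop, &$idle_cycles) {
    $idle_cycles = allQueuesEmpty() ? $idle_cycles + 1 : 0; // allQueuesEmpty() is assumed
    if ($idle_cycles >= 3) {
        $pid = (int) \Drupal::state()->get('hydroponics.queurunner_last_pid', 0);
        \Drupal::state()->set('hydroponics.queurunner_last_pid', -$pid);
        $loop->stop();
    }
});

$loop->run();
```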
Conditions: