slub / ocrd_manager

frontend for ocrd_controller and adapter towards ocrd_kitodo
MIT License

SSH to Manager does not get the exit status so Kitodo.Production script does not run asynchronously #49

Closed by markusweigelt 1 year ago

markusweigelt commented 1 year ago

Currently, the exit status is not returned, so the script in Kitodo.Production no longer runs asynchronously. I have checked several possible causes. The changed alias of the for_production script does not seem to play a role here.

When I remove the first pipe with the tee command, it works fine. So I think detaching no longer works correctly. https://github.com/slub/ocrd_manager/blob/main/process_images.sh#L101

Maybe we have to create a separate inner subshell wrapping the tee command that streams to ocrd.log, so that the outer one has only the last command, which detaches from the subshell at the top.

markusweigelt commented 1 year ago

@bertsky What do you think?

bertsky commented 1 year ago

> When I remove the first pipe with the tee command, it works fine. https://github.com/slub/ocrd_manager/blob/main/process_images.sh#L101

Very strange!

> So I think detaching no longer works correctly.

Yes, it would seem so. Very strange!

> Maybe we have to create a separate wrapping inner subshell with the tee command to stream to ocrd.log and the outer one has only the last command which detaches from the subshell at the top.

No. A piped command sequence always counts as a single job, especially with regard to the background operator (&). It must be something else.
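For illustration, here is a minimal sketch of that point. It is not taken from process_images.sh; the subshell body, the temporary log file and the variable names are stand-ins. The whole pipeline, including tee, is put into the background by a single trailing &, so the calling shell continues immediately:

```shell
#!/usr/bin/env bash
# Sketch only: the subshell stands in for long-running OCR work (hypothetical),
# and the temporary file stands in for a log such as ocrd.log.
log=$(mktemp)

start=$(date +%s)
# One trailing & backgrounds the entire pipeline (worker and tee together).
( sleep 2; echo "work done" ) 2>&1 | tee "$log" >/dev/null &
bg_pid=$!                       # PID of the last pipeline member (tee)
elapsed=$(( $(date +%s) - start ))

echo "shell continued after ${elapsed}s, background pid ${bg_pid}"

wait                            # block until every part of the pipeline has exited
result=$(cat "$log")
echo "log contains: ${result}"
rm -f "$log"
```

If the shell did not treat the pipeline as one unit, the first echo would only appear after the sleep had finished.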

bertsky commented 1 year ago

Cannot reproduce:

ocrd-manager for_production.sh: ocr_exit in async mode - immediate termination of the script
...
KitodoActiveMQClient:61 - Sending of message to close taskId 26 successful

markusweigelt commented 1 year ago

Yes, this is also in my log but it still does not work.

You can reproduce this by adding a logger call to the Kitodo.Production shell script after the SSH call (https://github.com/slub/ocrd_kitodo/blob/ab2ba02cdb39397de9e1227f229dd2d24ffadebc/_resources/kitodo/data/scripts/script_ocr_process_dir.sh#L28). It will only be logged after the OCR processing has finished.

The behavior can also be observed in Kitodo.Production itself: the dialog does not close after the Kitodo script has been sent.
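A rough sketch of such a check follows. The helper names log_msg and run_logged are hypothetical and not part of script_ocr_process_dir.sh, and the real script may log differently (for example via logger). The wrapper prints a timestamped line before and after the wrapped command, so it becomes visible whether the call returns at once or only after processing:

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of the actual Kitodo script.

# Print a timestamped message (a real script might call `logger` instead).
log_msg() {
    printf '%s %s\n' "$(date '+%H:%M:%S')" "$*"
}

# Run the given command, logging before and after, and pass on its exit status.
run_logged() {
    log_msg "running command '$*'"
    "$@"
    status=$?
    log_msg "finished command with $status"
    return "$status"
}

# Stand-in for the SSH call; in the real setup this might look like
#   run_logged ssh -p 22 ocrd@ocrd-manager for_production.sh ...
run_logged sh -c 'sleep 1; exit 0'
```

If the two timestamps are about as far apart as the OCR run takes, the remote command did not detach.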

bertsky commented 1 year ago

> Yes, this is also in my log but it still does not work.

I also got the immediate response on the Production website (kitodoScript finished alert in green).

The colour status of the process only changes after the OCR job has completed, though.

> You can reproduce this by writing a logger output to the Kitodo.Production shell script after the SSH call. It will be logged after the OCR processing is finished.

I just did that, just to be sure:

06.03.2023 19:55:56
Mar  6 18:55:55 725f38a8616c script_ocr_process_dir.sh: ssh destination 'ocrd@ocrd-manager' port '22' running command 'for_production.sh --proc-id 3 --task-id 26 --lang deu --script Fraktur /data/3'
06.03.2023 19:55:56
Mar  6 18:55:55 725f38a8616c script_ocr_process_dir.sh: ssh destination 'ocrd@ocrd-manager' port '22' finished command with 1

So it must be something on your system...

markusweigelt commented 1 year ago

> So it must be something on your system...

I don't think our Docker setups or systems are so different that they would lead to such differences. We also use Docker to be as independent as possible from the host or a native installation.

I think it's the profiles. I always test with both Docker Compose profiles. You probably only test with the Kitodo.Production profile and an external controller, right?

If we don't have the same test basis here, anything is possible. I will set up a CI test that can hopefully replicate the problem. Otherwise, I have no idea what the problem on my system could be.

bertsky commented 1 year ago

FTR: the reason was that ocrd/core had been updated on Dockerhub, without a new release and without any public notification or warning about the Python 3.6→7 and Ubuntu 18→20 update. In that new base stage for the Manager, bash (v 4→5) behaved slightly differently w.r.t. background jobs. Our different test outcomes resulted from whether or not the latest core images had already been pulled.
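One way to make this kind of drift visible earlier would be to record the runtime versions when the container starts. The following is only a sketch under that assumption; the function name describe_env is hypothetical and nothing like it is claimed to exist in ocrd_manager:

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic: print the versions that differed between the image builds.
describe_env() {
    # BASH_VERSION is only set when running under bash.
    printf 'bash=%s\n' "${BASH_VERSION:-unknown}"
    if [ -r /etc/os-release ]; then
        # Source in a subshell so the os-release variables do not leak out.
        ( . /etc/os-release; printf 'os=%s\n' "${PRETTY_NAME:-unknown}" )
    fi
    if command -v python3 >/dev/null 2>&1; then
        printf 'python=%s\n' "$(python3 --version 2>&1)"
    fi
}

describe_env
```

Comparing this output between two test environments would show right away whether both run on the same base image.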