Closed by jdhayhurst 4 months ago
Narrowed this down to the multiprocessing pool. It was configured with 2× the number of CPUs, which is reasonable given that this is a predominantly IO-bound process. The issue may lie in the queues the pool uses: if the data are too big, pickling fails (https://bugs.python.org/issue8237). Either way, reducing the number of workers to the number of CPUs fixed it. Knowing that this approach has overheads, I also tested a multithreaded approach, but saw worse performance. This will be resolved with the merging of https://github.com/opentargets/platform-input-support/tree/3195_automate_running
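For reference, a minimal sketch of the fix described above: sizing the pool to the CPU count rather than 2×. This is not the project's actual code; `process_archive` is a hypothetical stand-in for the per-file OpenFDA work.

```python
import os
from multiprocessing import Pool

def process_archive(path):
    # Hypothetical stand-in for the IO-bound per-file work
    # (download, unzip, filter); returns a small, easily
    # picklable result rather than a large payload.
    return f"processed {path}"

if __name__ == "__main__":
    # Workers capped at the CPU count; the 2x-CPU configuration
    # was what triggered the hang described in this issue.
    with Pool(processes=os.cpu_count()) as pool:
        results = pool.map(process_archive, ["a.zip", "b.zip"])
    print(results)
```

Keeping the per-task return values small also sidesteps the large-object pickling problem noted in bpo-8237, since everything sent back through the pool's result queue must be pickled.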
Describe the bug
During the OpenFDA step, the process logs that one of the zip files (not always the same one) is being removed, but then hangs there indefinitely.
Observed behaviour
Example logs:
It will remain like this indefinitely and never continue.
Expected behaviour
The file should be removed and processing should continue; if there is an error, an exception should be raised.
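One way to get the expected behaviour (an exception instead of an indefinite hang) is to put a timeout on the pool result. A minimal sketch, not the project's code; `remove_archive` and the 30-second limit are illustrative assumptions.

```python
from multiprocessing import Pool, TimeoutError

def remove_archive(path):
    # Hypothetical stand-in for the file-removal step.
    return f"removed {path}"

if __name__ == "__main__":
    with Pool(processes=1) as pool:
        async_result = pool.apply_async(remove_archive, ("example.zip",))
        try:
            # get(timeout=...) raises multiprocessing.TimeoutError
            # rather than blocking forever if the worker hangs.
            print(async_result.get(timeout=30))
        except TimeoutError:
            raise RuntimeError("file removal hung; failing instead of waiting forever")
```

With `pool.map` the whole batch blocks with no timeout, so `apply_async` (or `imap` with a timed `next()`) is the usual way to surface a stuck worker as an error.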
To Reproduce
Steps to reproduce the behaviour:
Set image and release versions
IMAGE_TAG="release_23-12" RELEASE_VERSION="devpis"
docker run -v /home/$USER/opentargets/output:/srv/output -v /home/$USER/opentargets/log:/usr/src/app/log -v /home/$USER/opentargets/credentials/open-targets-gac.json:/srv/credentials/open-targets-gac.json quay.io/opentargets/platform-input-support:$IMAGE_TAG -o /srv/output --log-level=DEBUG -gkey /srv/credentials/open-targets-gac.json -gb open-targets-pre-data-releases/$RELEASE_VERSION/input -steps openfda