ebmdatalab / clinicaltrials-act-converter


Fix timeout warning in gae-ise workflow #16

Open madwort opened 2 years ago

madwort commented 2 years ago

As far as I can tell, this timeout appears in Cloud Logging for the Google Compute Engine instance when nginx times out, while the gunicorn worker continues and runs to completion. Ideally, the logs would not contain errors for expected conditions.

startup-script: Running webhook https://staging-fdaaa.ebmdatalab.net/management/process_data/?secret=...&input_csv=https://storage.googleapis.com/ebmdatalab/clinicaltrials/clinical_trials.csv

and

2022/08/10 16:09:53 [error] 28696#28696: *73788 upstream timed out (110: Connection timed out) while reading response header from upstream, client: ..., server: staging-fdaaa.ebmdatalab.net, request: "GET /management/process_data/?secret=...&input_csv=https://storage.googleapis.com/ebmdatalab/clinicaltrials/clinical_trials.csv HTTP/1.1", upstream: "http://unix:/tmp/gunicorn-fdaaa_staging.sock/management/process_data/?secret=...&input_csv=https://storage.googleapis.com/ebmdatalab/clinicaltrials/clinical_trials.csv", host: "staging-fdaaa.ebmdatalab.net"

As seen here:

https://github.com/ebmdatalab/clinicaltrials-act-converter/issues/15#issuecomment-1210789985

madwort commented 2 years ago

NB: the gunicorn timeout is currently 100 minutes, and the job typically takes 45 minutes to an hour or more to complete. I don't think it's reasonable to make an HTTPS request wait that long for a response; we should use a different design. This will probably just go away if this ever gets rebuilt on a newer platform.
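For illustration only, one possible shape for a "different design" is to have the webhook start the work in the background and return a job id straight away, so nginx never waits on the long-running request. This is a minimal sketch using only the Python standard library, not the project's actual code: `process_data`, `start_job`, `job_status` and the in-memory `_jobs` registry are all hypothetical names. It also assumes a single worker process; with several gunicorn workers, or across restarts, the status would need to live in a shared store or a proper task queue.

```python
import threading
import uuid
from concurrent.futures import ThreadPoolExecutor

# All names here are hypothetical; this is a sketch, not the project's code.
_executor = ThreadPoolExecutor(max_workers=1)  # run one conversion at a time
_jobs = {}                                     # in-memory only (assumption)
_jobs_lock = threading.Lock()


def process_data(input_csv):
    """Placeholder for the long-running conversion step (assumption)."""
    return f"processed {input_csv}"


def start_job(input_csv, work=process_data):
    """Queue the work and return a job id without waiting for it to finish."""
    job_id = uuid.uuid4().hex
    with _jobs_lock:
        _jobs[job_id] = {"status": "running", "result": None, "error": None}

    def run():
        try:
            result = work(input_csv)
            update = {"status": "done", "result": result, "error": None}
        except Exception as exc:  # record the failure instead of losing it
            update = {"status": "failed", "result": None, "error": str(exc)}
        with _jobs_lock:
            _jobs[job_id] = update

    _executor.submit(run)
    return job_id


def job_status(job_id):
    """Return a copy of the job record, or None for an unknown id."""
    with _jobs_lock:
        job = _jobs.get(job_id)
        return dict(job) if job else None
```

With something like this, the webhook handler would call `start_job(...)` and respond immediately (for example with HTTP 202 and the job id), and the startup script could poll a status endpoint backed by `job_status(...)` rather than holding one request open for an hour.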