iris-hep / idap-200gbps-atlas

benchmarking throughput with PHYSLITE

Submitting 200 queries does not work #144

Open gordonwatts opened 2 months ago

gordonwatts commented 2 months ago

A full submission of all datasets (>200) causes ServiceX to hang. I have never been able to submit all of the queries at once.

I think what is going on is this: handling one of the cached rucio queries for a large dataset (say, the 64K-file one) ties up the servicex app for an extended period of time, and that causes the submission request to time out.

One solution is to give the front-end submit operation a very long timeout. There is a DoS danger with the current behavior, though: even when the request times out on the client, it has already been queued at the servicex app, which will eventually get around to submitting it. But because of the timeout the client never got a request id back, so it has to resubmit, and the same query can end up queued more than once.

Another option is server-side query caching, which would make this interaction more robust.
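To make the caching idea concrete, here is a minimal sketch of how a server could deduplicate resubmissions. It is not ServiceX code: `SubmissionCache`, `query_key`, and the query fields (`did`, `selection`) are hypothetical names made up for illustration, and a real implementation would need persistent storage and expiry. The point is only that if the server keys queued requests by a hash of the query, a client that timed out can resubmit and get back the original request id instead of queuing the work again.

```python
import hashlib
import json
import uuid


def query_key(query: dict) -> str:
    """Stable hash of a query: identical queries map to the same key."""
    canonical = json.dumps(query, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class SubmissionCache:
    """Hypothetical server-side cache mapping query hash -> request id."""

    def __init__(self):
        self._request_ids = {}
        self.queued = []  # request ids actually queued for processing

    def submit(self, query: dict) -> str:
        key = query_key(query)
        if key in self._request_ids:
            # Resubmission (e.g. after a client timeout): reuse the
            # request that is already queued instead of queuing again.
            return self._request_ids[key]
        request_id = str(uuid.uuid4())
        self._request_ids[key] = request_id
        self.queued.append(request_id)
        return request_id


cache = SubmissionCache()
# Hypothetical query; the field names are assumptions for this sketch.
query = {"did": "rucio://example.dataset", "selection": "jets.pt > 30"}
first = cache.submit(query)

# Simulate a client that timed out and retries the same query,
# with the fields in a different order.
retry = dict(reversed(list(query.items())))
second = cache.submit(retry)

print(first == second, len(cache.queued))
```

With this in place the retry is cheap on the server side, so a shorter client timeout combined with resubmission becomes safe rather than a way to pile up duplicate requests.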