This is often not a big issue, but when you are sequentially submitting more than 10 complete epidisco workflows to a single server, sending them all does take some time, especially if/when you try to do that in parallel (to different servers or, worse, to the same one).
I originally thought that this was more about the server taking time to do the equivalence checks on its side before officially putting the `OK` stamp on it; but while I was playing with ketrew JSONs the other day, I realized that a single patient's workflow serialized to JSON is ~30MB! I then tried to send this via `curl`, and it is not that the transfer completes and then the server keeps us waiting; it is the transfer of the file itself that makes the wait long.
The obvious solution, of course, is to support gzipped content delivery between client and server, which would be a natural extension of the HTTPS-based API you have designed. And there is also this:
```
$ du -sh all-in-epidisco-workflow.json*
29M  all-in-epidisco-workflow.json
720K all-in-epidisco-workflow.json.gz
```
Poking around a bit, I was glad to see that `Cohttp_lwt` at least supports the relevant headers/response formats, so I think all we need is to wire the gzip/no-gzip logic into the pre-/post-serialization parts and we will have blazingly fast submission experiences from then on (unless of course we DDoS ketrew with all those decompression tasks, which, by the way, could be handled by another helper virtual machine in the container; but that is for another day :))
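To make the idea concrete, here is a rough client-side sketch, assuming the `ezgzip` library for compression; the `submit_gzipped` name, the endpoint, and the exact header handling are illustrative, not Ketrew's actual API:

```ocaml
(* Client-side sketch: gzip the serialized workflow before sending it, and
   advertise that we can read gzipped responses. Assumes the `ezgzip`
   library; names and endpoint are illustrative, not Ketrew's actual API. *)
open Lwt.Infix

let submit_gzipped ~server_uri ~workflow_json =
  (* ~30MB of workflow JSON typically compresses to well under 1MB. *)
  let compressed = Ezgzip.compress workflow_json in
  let headers =
    Cohttp.Header.of_list
      [ "content-type", "application/json";
        "content-encoding", "gzip";   (* the request body is gzipped *)
        "accept-encoding", "gzip" ]   (* we can also read gzipped replies *)
  in
  Cohttp_lwt_unix.Client.post ~headers
    ~body:(Cohttp_lwt.Body.of_string compressed) server_uri
  >>= fun (response, body) ->
  Cohttp_lwt.Body.to_string body >|= fun body_str ->
  (* If the server answered with a gzipped body, decompress it first. *)
  match
    Cohttp.Header.get (Cohttp.Response.headers response) "content-encoding"
  with
  | Some "gzip" -> Ezgzip.decompress body_str
  | _ -> Ok body_str
```

The server side would presumably just do the mirror image: look at `Content-Encoding` on the request, decompress before deserializing, and gzip the response when the client sent `Accept-Encoding: gzip`.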
(Maybe you have already tried this and moved away from it; in that case, feel free to ignore this, but I would be curious about what went wrong there.)
indeed, the equivalence checks are not impactful for the actual submission time
(ketrew stores the submission all at once, answers the HTTP request, and then the engine does the equivalence + adding → that's the delay between the notification "workflow received" and the presence in the node-table in the WUI).
the equivalence computation is not exactly the bottleneck either (it's the DB interaction to get the "equivalence candidates" + adding the workflow)
(even with ketrew compiled to bytecode, the DB interactions are slower than the pure-OCaml equivalence computation)
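in other words, the flow is roughly "persist, reply, then let the engine catch up"; something like this minimal Lwt sketch (function names hypothetical, not Ketrew's actual internals):

```ocaml
(* Rough sketch of the ordering described above: store the submission,
   answer the HTTP request right away, and let the engine do the
   equivalence check + adding afterwards. Names are hypothetical. *)
let handle_submission ~store ~enqueue_for_engine ~workflow_json =
  let open Lwt.Infix in
  store workflow_json >>= fun id ->
  (* The engine picks this up later; that's the gap between the
     "workflow received" notification and the node showing up in the WUI. *)
  Lwt.async (fun () -> enqueue_for_engine id);
  Lwt.return (`Submission_accepted id)
```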
so yes, gzipping is worth a try
I've also noticed that (especially with 30 to 300 MB submissions) there is a huge difference between bytecode (when you run `ocaml submit.ml` it's bytecode-compiled) and native executables.
I want to try OpenSSL vs OCamlTLS in bytecode to see if the perf problem is at that level.
to reduce the stress of the check-equivalence + add-to-engine step, I'll try 2 things (see the sketch after this list):
- do the "adding" only one workflow at a time (processing the whole queue at once can pause the engine for quite a while, which makes the user think something is broken);
- implement workflow "namespaces" (i.e. check equivalence only between nodes that belong to a given user-defined subset of the currently active universe; e.g. in Epidisco we can use the experiment-name or "biokepi-setup" as independent namespaces).
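a rough sketch of how the two could combine (all names illustrative, not the actual engine code): drain the queue one workflow at a time, yield between items, and only fetch equivalence candidates from the workflow's own namespace:

```ocaml
(* Illustrative only: pop one workflow at a time, restrict equivalence
   candidates to the workflow's namespace, and yield between items so a
   big queue doesn't freeze the engine. *)
open Lwt.Infix

type workflow = { id : string; namespace : string }

let rec drain_queue ~pop ~fetch_candidates ~equivalent ~add queue =
  match pop queue with
  | None -> Lwt.return_unit
  | Some wf ->
    (* Only nodes in the same namespace (e.g. the Epidisco experiment
       name) are candidates, which keeps the DB query small. *)
    fetch_candidates ~namespace:wf.namespace >>= fun candidates ->
    (if List.exists (equivalent wf) candidates
     then Lwt.return_unit (* already known to the engine; skip *)
     else add wf)
    >>= fun () ->
    Lwt.pause () (* let the engine breathe between workflows *)
    >>= fun () ->
    drain_queue ~pop ~fetch_candidates ~equivalent ~add queue
```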