Closed: alvarorsant closed this issue 1 year ago
Hey @alvarorsant ,
That's a pretty cool idea.
Right now, there are a number of queues which would be valuable to expose through metrics. From the top of my head, I could think of the following points (the order represents when they are hit in the processing flow):
Ideally, it would be useful to instrument all 4 points, so that we know how many requests are currently being held on each one. Additionally, we could also have a 5th overarching count of many requests are currently being processed within MLServer (i.e. all requests currently between step 1.
and 4.
).
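As a rough sketch of what that instrumentation could look like, assuming `prometheus_client` is used for the metrics backend (the gauge and label names below are purely illustrative, not MLServer's actual internals):

```python
from prometheus_client import Gauge

# Hypothetical gauge tracking the size of each queueing point,
# distinguished by a "queue" label (e.g. "batch", "pool").
QUEUE_SIZE = Gauge(
    "mlserver_queue_size",
    "Number of requests currently held at each queueing point",
    ["queue"],
)

# The overarching 5th count: requests currently in flight end-to-end.
IN_FLIGHT = Gauge(
    "mlserver_requests_in_flight",
    "Requests currently being processed within MLServer",
)


def on_enqueue(queue_name: str) -> None:
    """Called when a request enters one of the queueing points."""
    QUEUE_SIZE.labels(queue=queue_name).inc()


def on_dequeue(queue_name: str) -> None:
    """Called when a request leaves one of the queueing points."""
    QUEUE_SIZE.labels(queue=queue_name).dec()
```

The hook names (`on_enqueue` / `on_dequeue`) are placeholders; the real wiring would live wherever each queue is pushed to and popped from.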
It would be great to hear your thoughts on that approach @alvarorsant .
We have already found points 2 and 3, but could you specify the exact location of points 1 and 4 in the code? We are indeed preparing a PR for you :) and we would need accurate indications.
Thanks in advance.
Hey @alvarorsant ,
That's great to hear! Thanks a lot for taking the lead in this one! :rocket:
For points 1 and 4, this would be within Python's own runtime. When called, asyncio coroutines in Python get converted into "tasks" and scheduled for execution in a list, which the event loop iterates through. The longer this task list becomes, the longer it will take to execute each scheduled task.
I haven't looked much into this one TBH, but there seem to be a few suggestions here on how to approach it:
https://splunktool.com/how-can-i-measure-the-length-of-an-asyncio-event-loop
i.e. mainly doing something like:

```python
len(asyncio.all_tasks(asyncio.get_running_loop()))
```
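For context, here is a minimal, self-contained illustration of that snippet (the helper and function names are just for this example):

```python
import asyncio


async def count_scheduled_tasks() -> int:
    """Return the number of tasks currently scheduled on the running loop."""
    # Note: all_tasks() also counts the task running this coroutine.
    return len(asyncio.all_tasks(asyncio.get_running_loop()))


async def main() -> int:
    # Schedule a few dummy background tasks so the count is non-trivial.
    background = [asyncio.create_task(asyncio.sleep(0.01)) for _ in range(3)]
    # No await happens between create_task() and the count, so none of the
    # background tasks can have completed yet.
    n = await count_scheduled_tasks()
    await asyncio.gather(*background)
    return n


print(asyncio.run(main()))  # prints 4: 3 background tasks + main() itself
```

In MLServer's case, a metric based on this would presumably be sampled periodically (or on request boundaries) and exported as a gauge, rather than computed ad hoc like above.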
@adriangonz We opened this PR: https://github.com/SeldonIO/MLServer/pull/769, in which we added the metrics described in points 2 and 3 (Batch and Pool).
Hi @adriangonz, I am continuing with @miguelopind's task. I opened this PR: https://github.com/SeldonIO/MLServer/pull/826, implementing points 2, 3 and 4. For implementing point 1, when you said 'AsyncIO "task list" within the main process', what exactly do you mean? Where is that metric supposed to be located? Is it possible to monitor several processes from the main one, using labels?
Thanks in advance.
BTW, let us know if the PR is OK for you or if it needs any modification.
Hey @alvarorsant,
Regarding your question, did you have a look at my previous comment (https://github.com/SeldonIO/MLServer/issues/728#issuecomment-1266739392)? Your main questions seem covered there, but please do let me know if there are any extra points you'd like to clarify.
Thanks for that PR BTW :+1: What should happen with the previous one though (https://github.com/SeldonIO/MLServer/pull/769)? If that one is not relevant anymore, could you sync with @miguelopind and remove it?
Hello, @miguelopind left the company several days ago and handed the task over to me, but I don't have enough rights to close https://github.com/SeldonIO/MLServer/pull/769. I don't know if you have the rights to do it.
Thanks a lot.
Hello @alvarorsant, I can do it if you want, just tell me.
Yes, @miguelopind, can you close it? I have already done a new PR gathering all the changes.
Thanks!
@alvarorsant done
Hi Adrian, I've opened a new PR https://github.com/SeldonIO/MLServer/pull/860 incorporating the changes you suggested (focused only on the batch queue and the request queue, with several tests).
Hey @alvarorsant ,
Thanks for making those changes.
Out of curiosity, is there any reason why you didn't update the existing PR instead? Either way, if #860 is now the up-to-date one, could you remove #826?
Fixed by #860
Hi, we would like to expose the number of elements within the pool's request queue as a metric, for performance reasons. This data would be useful for tuning the system, increasing or decreasing parallel workers and pods in OpenShift. What do you think?