mlbench / mlbench-old

!!!!!DEPRECATED!!!! distributed machine learning benchmark - a public benchmark of distributed ML solvers and frameworks
Apache License 2.0

Improve asynchronous metrics posting module #40

Open liehe opened 6 years ago

liehe commented 6 years ago

For the moment, mlbench/refimpls/utils/log.py uses subprocesses to post information asynchronously. However, this needs to be improved by using native Kubernetes modules.

A possible solution is connect_post_namespaced_service_proxy in CoreV1Api. But it is not optimal, as it does not accept a body argument.
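To make the limitation concrete, here is a minimal sketch of posting a metric with a JSON body through the API server's service proxy endpoint without the client method mentioned above. It uses only the Python standard library and a thread pool instead of subprocesses. The URL layout `/api/v1/namespaces/{namespace}/services/{service}/proxy/{path}` is the Kubernetes service proxy path; everything else (the function names, the `dashboard:80` service, the `api/metrics/` path, the bearer token handling) is a hypothetical assumption for illustration, not the code in log.py.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def service_proxy_url(api_server, namespace, service, path):
    """Build the API-server URL that proxies a request to a Service."""
    return "{}/api/v1/namespaces/{}/services/{}/proxy/{}".format(
        api_server.rstrip("/"), namespace, service, path.lstrip("/"))


def build_metric_request(url, metric, token=None):
    """Create a POST request carrying the metric as a JSON body."""
    body = json.dumps(metric).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: the API server accepts a service-account bearer token.
        headers["Authorization"] = "Bearer " + token
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# A small pool so that posting never blocks the training loop.
_executor = ThreadPoolExecutor(max_workers=2)


def post_metric_async(request, send=urllib.request.urlopen):
    """Submit the request in a background thread and return a Future.

    `send` is injectable so the function can be exercised without a cluster.
    """
    return _executor.submit(send, request)
```

In a real cluster the default `send` would also need a TLS context that trusts the cluster CA, which this sketch leaves out. Whether a thread pool is an acceptable replacement for the subprocess approach is a design question for the maintainers; it is shown here only because it keeps the example self-contained.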

Panaetius commented 6 years ago

In the backlog for now. If the Python client ever supports proper POST requests, this makes sense again.

The biggest use case is if we have multiple API/dashboard nodes with load balancing (if the dashboard ever becomes the limiting factor, e.g. with 1000 nodes); then we need to use the Kubernetes API with its load balancer.

For now it's of little importance, though.