Open richscott opened 4 months ago
Hi, I am interested in this issue.
We want to answer the question: "How much resource is being used by a user or group over a specific period of time?" yunikorn-core already exposes the following usage endpoints:
/ws/v1/partition/{partitionName}/usage/users
/ws/v1/partition/{partitionName}/usage/user/{userName}
/ws/v1/partition/{partitionName}/usage/groups
/ws/v1/partition/{partitionName}/usage/group/{groupName}
These APIs return the resource usage of queues in a hierarchical response. A similar response will not be useful for our purpose because it does not account for historical resource usage. The endpoints also report usage per queue, but multiple users can deploy into the same queue, and the queue creator might not be the user who deploys the application.
Sample response of the yunikorn-core resource usage endpoints:
[
  {
    "userName": "user1",
    "groups": {
      "app2": "tester"
    },
    "queues": {
      "queuePath": "root",
      "resourceUsage": {...},
      "children": [
        {
          "queuePath": "root.default",
          "resourceUsage": {...},
          "children": [
            {
              "queuePath": "root.default.test",
              "resourceUsage": {
                "memory": 6000000000,
                "vcore": 6000
              },
              "children": [...]
            }
          ]
        }
      ]
    }
  }
]
Scenario 1: Let's say user-a and user-b have both submitted jobs to the same queue. If we query the resource usage of that queue, we only get the queue's total resource usage, not the share used by each user.
Referring to the current DB, the resource usage of each application is stored in the application table. The users and groups are also tracked, so we might be able to get the resource usage of a user or group by querying the application table.
Some useful columns in the application table are:
user
queue_name
used_resource
max_used_resource
pending_resource
state_log
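To make Scenario 1 concrete, here is a rough sketch of how per-user usage could be derived from application rows. The row layout (plain dicts keyed by the column names above) and the function name `usage_by_user` are assumptions for illustration only, not the actual YHS schema or code. Python is used purely for brevity.

```python
from collections import defaultdict

# Hypothetical rows, shaped after the application table columns listed above.
# Both applications run in the same queue but belong to different users.
rows = [
    {"user": "user-a", "queue_name": "root.default.test",
     "used_resource": {"memory": 4000000000, "vcore": 4000}},
    {"user": "user-b", "queue_name": "root.default.test",
     "used_resource": {"memory": 2000000000, "vcore": 2000}},
]


def usage_by_user(rows):
    """Sum used_resource per user across all of that user's applications."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in rows:
        for name, amount in row["used_resource"].items():
            totals[row["user"]][name] += amount
    # Convert the nested defaultdicts into plain dicts for the caller.
    return {user: dict(resources) for user, resources in totals.items()}


per_user = usage_by_user(rows)
# The queue-level view only exposes the sum over all users.
queue_total_memory = sum(r["used_resource"]["memory"] for r in rows)
```

The queue total alone cannot tell user-a and user-b apart, while grouping the application rows by `user` keeps the split visible.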
So, we can simplify the response to the following format:
[
  {
    "userName": "user1",
    "groups": {
      "app2": "tester"
    },
    "applications": [
      {
        "queuePath": "root.default.test",
        "app_id": "app1",
        "maxUsedResource": {
          "memory": 6000000000,
          "vcore": 6000
        }
      }
    ]
  }
]
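As a minimal sketch of how the format above could be assembled, the snippet below groups application rows by user. The input row keys (`user`, `group`, `app_id`, `queue_name`, `max_used_resource`) and the function name `build_user_usage` are hypothetical. In particular, how the group of an application is stored is an assumption, and the real handler would read from the YHS database layer instead of an in-memory list.

```python
import json


def build_user_usage(rows):
    """Group application rows by user into the simplified response format."""
    by_user = {}
    for row in rows:
        entry = by_user.setdefault(
            row["user"],
            {"userName": row["user"], "groups": {}, "applications": []},
        )
        # Assumption: "groups" maps an application ID to its group.
        entry["groups"][row["app_id"]] = row["group"]
        entry["applications"].append({
            "queuePath": row["queue_name"],
            "app_id": row["app_id"],
            "maxUsedResource": row["max_used_resource"],
        })
    return list(by_user.values())


# Hypothetical sample row, for illustration only.
sample_rows = [
    {"user": "user1", "group": "tester", "app_id": "app1",
     "queue_name": "root.default.test",
     "max_used_resource": {"memory": 6000000000, "vcore": 6000}},
]

response = build_user_usage(sample_rows)
print(json.dumps(response, indent=2))
```

Listing applications flat under each user, with the queue path as a field, avoids the queue hierarchy entirely, so users that share a queue no longer collapse into one number.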
Add a handler (and prerequisite database layer structures/code, if they don't already exist) to replicate the yunikorn-core endpoints
/ws/v1/partition/:partition/usage/users
/ws/v1/partition/:partition/usage/user/:user
/ws/v1/partition/:partition/usage/groups
/ws/v1/partition/:partition/usage/group/:group
to allow YHS users to get resource usage metrics for users and groups, at aggregate and individual levels.