Closed 107dipan closed 2 years ago
I'm not able to reproduce this.
With vespaengine/vespa:7.542.42 :
curl localhost:8080/metrics/v2/values/ -s |jq
{
"nodes": [
{
"hostname": "vespa-container",
"role": "hosts/vespa-container",
"services": [
{
"name": "vespa.container",
"timestamp": 1645035420,
"status": {
"code": "up",
"description": "Data collected successfully"
},
"metrics": [
{
"values": {
"memory_virt": 4368171008,
"memory_rss": 2006052864,
"cpu": 14.810705866038,
"cpu_util": 1.8513382332547
},
"dimensions": {
"serviceId": "container"
}
},
...
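To check a response like the one above without scanning it by eye, here is a minimal Python sketch. It assumes the payload has the nodes / services / metrics / values nesting shown in the output; the helper name services_missing_metric is hypothetical and not part of Vespa.

```python
import json


def services_missing_metric(payload, metric="cpu_util"):
    """Return (hostname, service name) pairs that do not report `metric`.

    Hypothetical helper: assumes the nodes -> services -> metrics -> values
    nesting seen in the /metrics/v2/values output above.
    """
    missing = []
    for node in payload.get("nodes", []):
        for service in node.get("services", []):
            reported = set()
            # A service can carry several metric entries (one per dimension set).
            for entry in service.get("metrics", []):
                reported.update(entry.get("values", {}))
            if metric not in reported:
                missing.append((node.get("hostname"), service.get("name")))
    return missing


# Sample trimmed from the response shown above.
sample = json.loads("""
{
  "nodes": [
    {
      "hostname": "vespa-container",
      "role": "hosts/vespa-container",
      "services": [
        {
          "name": "vespa.container",
          "metrics": [
            {
              "values": {
                "memory_virt": 4368171008,
                "memory_rss": 2006052864,
                "cpu": 14.810705866038,
                "cpu_util": 1.8513382332547
              },
              "dimensions": {"serviceId": "container"}
            }
          ]
        }
      ]
    }
  ]
}
""")

print(services_missing_metric(sample))  # prints []
```

Against a live node, the payload could be loaded from the same URL used in the curl command. Any (hostname, service) pair in the result points at a process that is not reporting cpu_util yet.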
Let me try with 7.542.42. I just want to confirm: do I only need to deploy the application.zip with this Vespa version, or do I also need to restart the services after deploying?
There is no difference between 7.542 and 7.543 on this. When upgrading software, you need to restart the process for the upgrade to take effect; deploying configuration and schema via deploy does not do that for you.
Can I run the vespa-stop-services && vespa-start-services commands to restart the services? Do I need to run them on all of my Vespa pods or just on the config node?
You need to install the software on each Vespa pod (or node, whatever you want to call it) and restart all of them. Upgrading only the configuration nodes does not install the software on the other pods.
Can I run the vespa-stop-services && vespa-start-services commands to restart the services?
These commands only stop and start the processes running locally on the node, not cluster-wide. In a production environment you typically want the restart to be orchestrated.
Does Vespa give us any tools or ways to do this type of orchestration?
The current best practice for deployment and upgrade orchestration is https://cloud.vespa.ai/. See also https://docs.vespa.ai/en/operations/live-upgrade.html
Thanks a lot!
I'm resolving this. My best guess is that you have not upgraded the process, or have not restarted it after the upgrade. The cpu_util metric was added maybe a month or two ago.
Yes, we need to restart all the processes. Thanks a lot for your help!
Describe the bug: Unable to see cpu util in /metrics/v2/values. I am getting cpu in metrics.values, but I am not getting the cpu.util value.
Reproduce: I am using the container:8080/metrics/v2/values API and checking the nodes array of the returned JSON.
Expected behavior: The metrics API should return the cpu util value.
Screenshots: In the JSON object metrics.values I am only getting the following values: memory_virt, memory_rss, cpu.
Environment (please complete the following information):
Vespa version: 7.543