Open blackwood821 opened 3 years ago
Here is a recent error from the metadata
service log in the global zone:
[2021-07-09T20:38:05.353Z] INFO: metadata/752543 on cn01: handled GET setup:messages (brand=joyent, zonename=7d38ff30-de00-46af-9a8b-16c41c6220e7, response=SUCCESS)
query: {
"cmd": "GET",
"arg": "setup:messages",
"req_v": 2,
"elapsed": 0.33317
}
[2021-07-09T20:38:30.234Z] DEBUG: metadata/752543 on cn01: createServersOnExistingZones for 38 zones
[2021-07-09T20:38:30.234Z] INFO: metadata/752543 on cn01: created zone metadata sockets on 0 / 38 zones took 0.000385944s (385.94us)
[2021-07-09T20:39:05.481Z] ERROR: metadata/752543 on cn01: could not set metadata (key "setup:messages") (brand=joyent, zonename=7d38ff30-de00-46af-9a8b-16c41c6220e7)
MultiError: first of 1 error: first of 1 error: vminfod watchForEvent "VM.js writeAllMetadata (7d38ff30-de00-46af-9a8b-16c41c6220e7: customer_metadata,internal_metadata,tags)" timeout exceeded
at /usr/vm/node_modules/vasync/lib/vasync.js:99:14
at /usr/vm/node_modules/VM.js:16887:29
at /usr/vm/node_modules/VM.js:17072:13
at /usr/node/0.10/node_modules/async.js:240:13
at /usr/node/0.10/node_modules/async.js:144:21
at /usr/node/0.10/node_modules/async.js:237:17
at /usr/node/0.10/node_modules/async.js:600:34
at /usr/vm/node_modules/VM.js:17014:21
at /usr/vm/node_modules/VM.js:3690:9
at next (/usr/vm/node_modules/vasync/lib/vasync.js:843:6)
Caused by: MultiError: first of 1 error: vminfod watchForEvent "VM.js writeAllMetadata (7d38ff30-de00-46af-9a8b-16c41c6220e7: customer_metadata,internal_metadata,tags)" timeout exceeded
at /usr/vm/node_modules/vasync/lib/vasync.js:99:14
at done (/usr/vm/node_modules/vminfod/client.js:697:9)
at vminfodEventStreamWatchForEventTimeout (/usr/vm/node_modules/vminfod/client.js:635:13)
at Timer.listOnTimeout [as ontimeout] (timers.js:110:15)
Caused by: Error: vminfod watchForEvent "VM.js writeAllMetadata (7d38ff30-de00-46af-9a8b-16c41c6220e7: customer_metadata,internal_metadata,tags)" timeout exceeded
at vminfodEventStreamWatchForEventTimeout (/usr/vm/node_modules/vminfod/client.js:633:23)
at Timer.listOnTimeout [as ontimeout] (timers.js:110:15)
[2021-07-09T20:39:05.481Z] ERROR: metadata/752543 on cn01: first of 1 error: first of 1 error: vminfod watchForEvent "VM.js writeAllMetadata (7d38ff30-de00-46af-9a8b-16c41c6220e7: customer_metadata,internal_metadata,tags)" timeout exceeded (brand=joyent, zonename=7d38ff30-de00-46af-9a8b-16c41c6220e7)
[2021-07-09T20:39:05.481Z] ERROR: metadata/752543 on cn01: Socket for 7d38ff30-de00-46af-9a8b-16c41c6220e7 closed before we could write anything. (brand=joyent, zonename=7d38ff30-de00-46af-9a8b-16c41c6220e7)
[2021-07-09T20:39:05.481Z] INFO: metadata/752543 on cn01: handled PUT setup:messages (brand=joyent, zonename=7d38ff30-de00-46af-9a8b-16c41c6220e7, response=FAILURE)
MultiError: first of 1 error: first of 1 error: vminfod watchForEvent "VM.js writeAllMetadata (7d38ff30-de00-46af-9a8b-16c41c6220e7: customer_metadata,internal_metadata,tags)" timeout exceeded
at /usr/vm/node_modules/vasync/lib/vasync.js:99:14
at /usr/vm/node_modules/VM.js:16887:29
at /usr/vm/node_modules/VM.js:17072:13
at /usr/node/0.10/node_modules/async.js:240:13
at /usr/node/0.10/node_modules/async.js:144:21
at /usr/node/0.10/node_modules/async.js:237:17
at /usr/node/0.10/node_modules/async.js:600:34
at /usr/vm/node_modules/VM.js:17014:21
at /usr/vm/node_modules/VM.js:3690:9
at next (/usr/vm/node_modules/vasync/lib/vasync.js:843:6)
Caused by: MultiError: first of 1 error: vminfod watchForEvent "VM.js writeAllMetadata (7d38ff30-de00-46af-9a8b-16c41c6220e7: customer_metadata,internal_metadata,tags)" timeout exceeded
at /usr/vm/node_modules/vasync/lib/vasync.js:99:14
at done (/usr/vm/node_modules/vminfod/client.js:697:9)
at vminfodEventStreamWatchForEventTimeout (/usr/vm/node_modules/vminfod/client.js:635:13)
at Timer.listOnTimeout [as ontimeout] (timers.js:110:15)
Caused by: Error: vminfod watchForEvent "VM.js writeAllMetadata (7d38ff30-de00-46af-9a8b-16c41c6220e7: customer_metadata,internal_metadata,tags)" timeout exceeded
at vminfodEventStreamWatchForEventTimeout (/usr/vm/node_modules/vminfod/client.js:633:23)
at Timer.listOnTimeout [as ontimeout] (timers.js:110:15)
--
query: {
"cmd": "PUT",
"arg": "setup:messages",
"req_v": 2,
"elapsed": 60120.467837
}
I don't see any errors in the vminfod
service but that seems to be the one that needs to be restarted each time.
We have an upgrade routine where after we reprovision a VM to a new image our
user-script
runs an upgrade process on the zone. Every once in a while our script fails because it runs anmdata-put
command that times out. Here is an example outside of our script that recreates the issue:Restarting the
vminfod
service viasvcadm restart vminfod
on the CN global always seems to fix the issue.There is more info available at https://gist.github.com/blackwood821/d343ee38d76a44f5ebb1429828787199.