foundation-model-stack / multi-nic-cni

https://foundation-model-stack.github.io/multi-nic-cni/
Apache License 2.0

docs: connection-check expected output note #25

Closed MEllis-github closed 1 year ago

MEllis-github commented 2 years ago

Minor suggestion regarding the expected output shown at https://github.com/foundation-model-stack/multi-nic-cni/tree/main/connection-check and at https://github.com/foundation-model-stack/multi-nic-cni#check-connections: state the assumption that no other jobs using the same network are running when the check is performed. Otherwise, until those jobs are brought down, the output looks more like the following:

bash-3.2$  kubectl logs job/multi-nic-concheck
...
2022/10/05 21:55:08 45/45 servers successfully created
2022/10/05 21:55:09 vlanl3-a100-large-drlfv-worker-3-with-secondary-7lg74-serve: Pending
2022/10/05 21:55:20 vlanl3-a100-large-drlfv-worker-3-with-secondary-txxjg-serve: Pending
2022/10/05 21:55:43 Some job is still running: vlanl3-a100-large-drlfv-worker-3-with-secondary-l8cfw-clien
2022/10/05 21:55:48 Some job is still running: vlanl3-a100-large-drlfv-worker-3-with-secondary-l8cfw-clien
2022/10/05 21:55:53 Some job is still running: vlanl3-a100-large-drlfv-worker-3-with-secondary-l8cfw-clien
...
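
For anyone who wants to tell this situation apart from a real connectivity failure, here is a rough sketch that scans the log text for the two kinds of lines shown above. It assumes the wording of those lines ("Some job is still running: ..." and "...: Pending") stays as in this excerpt; the function name `find_blockers` and the `sample` string are made up for illustration and are not part of multi-nic-cni.

```python
import re

# Patterns taken from the log excerpt in this issue; the exact wording is an
# assumption and may differ in other versions of the connection check.
STILL_RUNNING = re.compile(r"Some job is still running: (\S+)")
PENDING = re.compile(r"(\S+): Pending$")


def find_blockers(log_text):
    """Return (running, pending) sets of names reported in the log text."""
    running, pending = set(), set()
    for line in log_text.splitlines():
        line = line.strip()
        match = STILL_RUNNING.search(line)
        if match:
            running.add(match.group(1))
            continue
        match = PENDING.search(line)
        if match:
            pending.add(match.group(1))
    return running, pending


# Hypothetical input: a few lines copied from the excerpt above.
sample = """\
2022/10/05 21:55:08 45/45 servers successfully created
2022/10/05 21:55:09 vlanl3-a100-large-drlfv-worker-3-with-secondary-7lg74-serve: Pending
2022/10/05 21:55:20 vlanl3-a100-large-drlfv-worker-3-with-secondary-txxjg-serve: Pending
2022/10/05 21:55:43 Some job is still running: vlanl3-a100-large-drlfv-worker-3-with-secondary-l8cfw-clien
2022/10/05 21:55:48 Some job is still running: vlanl3-a100-large-drlfv-worker-3-with-secondary-l8cfw-clien
"""

running, pending = find_blockers(sample)
print("still running:", sorted(running))
print("pending:", sorted(pending))
```

In practice the input would be the saved output of `kubectl logs job/multi-nic-concheck`. Non-empty results suggest the check is waiting on other workloads rather than failing outright.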

sunya-ch commented 1 year ago

Done by https://github.com/foundation-model-stack/multi-nic-cni/pull/134