Azure / azurehpc

This repository provides easy automation scripts for building an HPC environment in Azure. It also includes examples for building an end-to-end environment and running some of the key HPC benchmarks and applications.
MIT License

Added netstat check #691

vanzod closed this 2 years ago

vanzod commented 2 years ago

At times check_ib_bw_gdr fails with the following error:

Couldn't listen to port 18515
Unable to open file descriptor for socket connection
Unable to init the socket connection
Couldn't connect to slurmcluster-hpc-pg0-277:18515
Unable to open file descriptor for socket connection
Unable to init the socket connection

This adds a netstat command to provide additional insight into the port status when the failure occurs.
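The diff itself is not shown here, so the following is only a minimal sketch of what such a port check could look like, not the code from this PR. The function names `filter_port` and `report_port_status` are hypothetical; the default port 18515 is taken from the error message above, and the column positions assume the usual Linux `netstat -tan` layout.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not the code added by this PR.

# Read `netstat -tan`-style output on stdin and print the lines whose
# local or foreign address ends in ":<port>".
filter_port() {
    local port="$1"
    awk -v p=":${port}" '
        # Assumed layout: column 4 = Local Address, column 5 = Foreign Address.
        NF >= 6 && (substr($4, length($4) - length(p) + 1) == p ||
                    substr($5, length($5) - length(p) + 1) == p)
    '
}

# Print the state of TCP sockets on the given port (default: 18515).
report_port_status() {
    local port="${1:-18515}"
    if ! command -v netstat >/dev/null 2>&1; then
        echo "netstat not found; cannot inspect port ${port}"
        return 0
    fi
    local matches
    matches="$(netstat -tan 2>/dev/null | filter_port "${port}")"
    if [ -n "${matches}" ]; then
        echo "Sockets using port ${port}:"
        echo "${matches}"
    else
        echo "No TCP sockets found on port ${port}"
    fi
}
```

Calling `report_port_status 18515` right after a failed run would show whether the port is still held, for example by a socket in `LISTEN` or `TIME_WAIT` state left over from a previous test.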