ansys / pyfluent

Pythonic interface to Ansys Fluent
https://fluent.docs.pyansys.com
MIT License

launch_fluent scheduler_headnode option not used #3053

Closed by cj-hodgson 2 weeks ago

cj-hodgson commented 3 months ago

🐞 Description of the bug

The scheduler_headnode key in scheduler_options for launch_fluent does not appear to be used on a SLURM scheduler. Instead, Fluent is started with the -cnf argument pointing at a slurm.{nnnn}.hosts file. If the value of scheduler_headnode is incorrect, the job still runs without error.

📝 Steps to reproduce

Call launch_fluent with scheduler_options={"scheduler": "slurm", "scheduler_headnode": ""} in a SLURM environment.
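
The reproduction step can be sketched as follows. This is a minimal sketch, assuming ansys-fluent-core is installed and the script runs inside a SLURM environment. The helper names build_scheduler_options and launch_with_headnode are hypothetical; only the scheduler_options keys and the launch_fluent call come from the report.

```python
def build_scheduler_options(headnode: str = "") -> dict:
    """Return the scheduler_options dict used in the report (hypothetical helper)."""
    return {"scheduler": "slurm", "scheduler_headnode": headnode}


def launch_with_headnode(headnode: str = ""):
    """Launch Fluent through PyFluent with the given headnode value.

    Requires a Fluent installation and a SLURM environment, so it is not
    called here.
    """
    # Imported lazily so the options helper works without PyFluent installed.
    import ansys.fluent.core as pyfluent

    return pyfluent.launch_fluent(
        scheduler_options=build_scheduler_options(headnode)
    )
```

Calling launch_with_headnode("") corresponds to the empty-headnode case described above, and launch_with_headnode("nonsense") to the invalid-headnode case discussed later in the thread.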

💻 Which operating system are you using?

Linux

📀 Which ANSYS version are you using?

24.2

🐍 Which Python version are you using?

3.10

📦 Installed packages

2to3==1.0
about-time==4.2.1
absl-py==2.1.0
alive-progress==3.1.5
ansys-api-fluent==0.3.26
ansys-api-platform-instancemanagement==1.1.0
ansys-api-tools-filetransfer==0.1.0
ansys-fluent-core==0.22.dev0
ansys-hpcservices-file-management-v1==23.1.2407.1
ansys-hpcservices-file-transfer-v1==23.1.2407.1
ansys-hpcservices-global-permission-v1==23.1.2407.1
ansys-hpcservices-hardware-configuration-clusters-v1==23.1.2407.1
ansys-hpcservices-hardware-configuration-endpoints-v1==23.1.2407.1
ansys-hpcservices-hardware-configuration-queues-v1==23.1.2407.1
ansys-hpcservices-hardware-configuration-storages-v1==23.1.2407.1
ansys-hpcservices-job-management-jobs-v1==23.1.2407.1
ansys-hpcservices-job-management-logs-v1==23.1.2407.1
ansys-hpcservices-job-management-templates-v1==23.1.2407.1
ansys-hpcservices-service-management-v1==23.1.2407.1
ansys-hpcservices-user-management-v1==23.1.2407.1
ansys-platform-instancemanagement==1.1.2
ansys-pythonnet==3.1.0rc0
ansys-tools-filetransfer==0.1.0
ansys-units==0.3.2
asttokens==2.4.1
astunparse==1.6.3
attrs==23.2.0
bcrypt==4.1.3
beartype==0.18.5
blinker==1.8.2
certifi==2023.11.17
cffi==1.16.0
charset-normalizer==3.3.2
click==8.1.7
clr-loader==0.2.6
contourpy==1.2.1
cryptography==42.0.8
cycler==0.12.1
dash==2.17.1
dash-bootstrap-components==1.6.0
dash-core-components==2.0.0
dash-html-components==2.0.0
dash-table==5.0.0
decorator==5.1.1
distro==1.9.0
docker==6.1.3
et-xmlfile==1.1.0
exceptiongroup==1.2.1
execnet==2.1.1
executing==2.0.1
Flask==3.0.3
flatbuffers==24.3.25
fonttools==4.53.1
future==0.18.0
gast==0.6.0
google-pasta==0.2.0
googleapis-common-protos==1.52.0
grapheme==0.6.0
grpcio==1.64.1
grpcio-health-checking==1.48.1
grpcio-status==1.26.0
h5py==3.11.0
idna==3.7
imageio==2.34.2
importlib_metadata==8.0.0
iniconfig==2.0.0
ipython==8.26.0
itsdangerous==2.2.0
jedi==0.19.1
Jinja2==3.1.4
joblib==1.4.2
kaleido==0.2.1
keras==3.4.1
kiwisolver==1.4.5
libclang==18.1.1
llvmlite==0.39.1
lxml==4.9.4
Markdown==3.6
markdown-it-py==3.0.0
MarkupSafe==2.1.5
matplotlib==3.9.1
matplotlib-inline==0.1.7
mdurl==0.1.2
ml-dtypes==0.3.2
namex==0.0.8
nest-asyncio==1.6.0
nltk==3.8.1
numba==0.56.4
numpy==1.23.5
nvidia-nccl-cu12==2.22.3
openpyxl==3.1.5
opt-einsum==3.3.0
optree==0.12.1
packaging==24.1
pandas==2.0.3
paramiko==3.4.0
paramiko-expect==0.3.5
parso==0.8.4
pexpect==4.9.0
pillow==10.4.0
platformdirs==3.11.0
plotly==5.22.0
pluggy==1.5.0
prompt_toolkit==3.0.47
protobuf==4.25.3
psutil==5.9.2
ptyprocess==0.7.0
pure-eval==0.2.2
py==1.11.0
pybind11==2.10.0
pycparser==2.22
Pygments==2.18.0
PyNaCl==1.5.0
pyparsing==3.1.2
pytest==7.2.1
pytest-forked==1.6.0
pytest-xdist==1.31.0
python-certifi-win32==1.6.1
python-dateutil==2.9.0.post0
python-pptx==0.6.23
pytz==2024.1
PyYAML==6.0.2rc1
regex==2024.5.15
requests==2.31.0
retrying==1.3.4
rich==13.7.1
scikit-learn==1.5.1
scipy==1.9.3
SCons==4.3.0
seaborn==0.13.2
setuptools-scm==8.1.0
shapely==2.0.2
sip==6.5.1
six==1.16.0
stack-data==0.6.3
tenacity==8.5.0
tensorboard==2.16.2
tensorboard-data-server==0.7.2
tensorflow==2.16.2
tensorflow-io-gcs-filesystem==0.37.1
termcolor==2.4.0
threadpoolctl==3.5.0
toml==0.10.2
tomli==2.0.1
tqdm==4.66.4
traitlets==5.14.3
tzdata==2024.1
urllib3==1.26.10
wcwidth==0.2.13
websocket-client==1.8.0
Werkzeug==3.0.3
wrapt==1.16.0
wxPython==4.1.1
xgboost==2.1.0
XlsxWriter==3.2.0
zipp==3.19.2

mkundu1 commented 3 months ago

@cj-hodgson Are you seeing any error like ssh: connect to host <headnode> port 22: failure when a non-empty headnode is provided? I think empty strings are ignored by Fluent.

cj-hodgson commented 3 months ago

@mkundu1 This is the ssh error I get for a non-empty headnode: ssh: Could not resolve hostname nonsense: Temporary failure in name resolution. I've found that if the headnode value is incorrect but the value contains a valid headnode hostname, then that is used. For example, if 'mymachine' is a SLURM headnode, the job runs when scheduler_headnode is set to 'mymachine01'.
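
One way to catch an unresolvable headnode before launching is to check name resolution on the submitting machine. The following is a minimal sketch using only the Python standard library. The helper hostname_resolves is hypothetical, and it only tests whether the local resolver accepts the name; it says nothing about how Fluent or ssh select the headnode.

```python
import socket


def hostname_resolves(hostname: str) -> bool:
    """Return True if the local resolver can resolve `hostname` (hypothetical helper)."""
    if not hostname:
        # An empty headnode cannot be resolved; treat it as unresolved.
        return False
    try:
        socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # Same class of failure as "Could not resolve hostname" from ssh.
        return False
    return True
```

Running hostname_resolves on both the intended headnode and the mistyped value would show whether the mistyped name happens to resolve on that machine.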

mkundu1 commented 2 weeks ago

@cj-hodgson Empty strings in the scheduler_headnode field are ignored by Fluent. On the Fluent side, running fluent 3ddp -scheduler=slurm -scheduler_headnode= -gu starts the SLURM instance on the current node without any error.
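
For comparison with standalone Fluent, the command quoted above can be assembled from a scheduler_options-style dict. This is only an illustrative sketch: the function fluent_command is hypothetical and mirrors the quoted command line; it is not PyFluent's internal launch code.

```python
import shlex


def fluent_command(scheduler_options: dict, mode: str = "3ddp") -> str:
    """Build a standalone Fluent command line from scheduler options (hypothetical)."""
    args = ["fluent", mode]
    for key in ("scheduler", "scheduler_headnode"):
        if key in scheduler_options:
            # An empty value yields a bare "-key=" flag, as in the quoted command.
            args.append(f"-{key}={scheduler_options[key]}")
    args.append("-gu")
    return shlex.join(args)


print(fluent_command({"scheduler": "slurm", "scheduler_headnode": ""}))
# prints: fluent 3ddp -scheduler=slurm -scheduler_headnode= -gu
```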

> I've found that if the headnode value is incorrect but the value contains a valid headnode hostname, then that is used.

I cannot reproduce the issue in either PyFluent or standalone Fluent; I get the ssh error in both cases. Please share, in chat, the details of the machine where you observed the issue.