Azure / hpcpack

The repo to track public issues for Microsoft HPC Pack product.
MIT License

MPIEXEC stops Error (122) The data area passed to a system call is too small. #23

Open weidi opened 2 years ago

weidi commented 2 years ago

Problem Description

No matter what binary I try to start using msmpiexec, it fails with this error. I see this with Ansys, with 6Sigma, and also when trying to start mpipingpong.

Steps to Reproduce

1. Install HPC Pack 2019 SP1 on Server 2019 (head node)
2. Install HPC Pack 2019 SP1 on Server 2022 (compute node)
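The failure can be reproduced without a full solver run by submitting mpipingpong through the HPC Pack scheduler. A configured cluster is required, so this sketch only assembles and prints the command it would submit; the node count and packet-size range are arbitrary illustrations, not values from the original report:

```shell
# Hypothetical minimal reproduction: run mpipingpong across two nodes
# through the HPC Pack scheduler. This sketch only prints the command,
# since actually submitting it requires a live cluster.
NUMNODES=2
SMOKE_TEST="job submit /numnodes:${NUMNODES} mpiexec -c 1 mpipingpong -p 1:100 -op -s nul"
echo "${SMOKE_TEST}"
```

On an affected Server 2022 compute node, this is the kind of job that fails with Error (122) before the MPI processes start.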

Expected Results

Actual Results

[2022-07-08T06:28:46.560Z]: Writing initial data...
[2022-07-08T06:28:46.985Z]: Constructing CFD geometry ...
[2022-07-08T06:29:09.285Z]: Verifying Model
[2022-07-08T06:29:10.873Z]: Running CFD solution...
[2022-07-08T06:29:13.412Z]: Creating distributed data for parallel solver...
[2022-07-08T06:29:25.540Z]: Running parallel solver...
[2022-07-08T06:29:25.798Z]: Error: ERROR: Error reported: failed to launch \CFDSolver.exe (truncated paths)
[2022-07-08T06:29:25.806Z]: Error: Error (122) The data area passed to a system call is too small.
[2022-07-08T06:29:28.643Z]: Error: Parallel MPI Specific error exit code : -4
[2022-07-08T06:29:28.656Z]: Solving process failure.
[2022-07-08T06:29:28.664Z]: Removing distributed data.
[2022-07-08T06:29:29.362Z]: Updating model from CFD solution...
[2022-07-08T06:29:31.705Z]: CFDSolution complete
[2022-07-08T06:29:32.311Z]: Solving completed with exit code 104.
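For context, Error (122) in the log is the standard Win32 status ERROR_INSUFFICIENT_BUFFER from winerror.h, whose system message text is exactly the one shown ("The data area passed to a system call is too small."); the launcher is surfacing that OS error, not an MPI-specific one:

```shell
# Win32 error 122 = ERROR_INSUFFICIENT_BUFFER (winerror.h):
#   "The data area passed to a system call is too small."
# The solver log above is passing this OS error code through verbatim.
ERROR_INSUFFICIENT_BUFFER=122
echo "Error (122) in the log is ERROR_INSUFFICIENT_BUFFER (${ERROR_INSUFFICIENT_BUFFER})"
```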

Additional Logs


Additional Comments

weidi commented 2 years ago
ERROR: Error reported: failed to launch 'mpipingpong -r -p 4096:1600' on HMU45
Error (122) The data area passed to a system call is too small. 
ERROR: Error reported: failed to launch 'mpipingpong -r -p 4096:1600' on HMU47
Error (122) The data area passed to a system call is too small. 
weidi commented 2 years ago

As an update, I reinstalled the W2022 machines with W2019 and they are working flawlessly. It seems to be something new with 2022.

YutongSun commented 2 years ago

Hi weidi,

I tried mpipingpong on Windows Server 2022 with HPC Pack 2019 Update 1 and it succeeded. Could it have been an OS configuration problem?

'job submit /numnodes:2 mpiexec -c 1 mpipingpong -p 1:100 -op -s nul'

Edition: Windows Server 2022 Datacenter
Version: 21H2
OS build: 20348.473

weidi commented 2 years ago

I really have no clue what it was in the end. As I had to get it going, I went with 2019. I had the same issue with other MPI apps as well, so I'd say it's down to the MPI or driver implementation on 2022 (Standard edition, patched to the latest level).