SimFlowCFD / RapidCFD-dev

RapidCFD is an OpenFOAM fork running fully on the CUDA platform. Brought to you by
https://sim-flow.com

Encountered a new error that I have no idea how to fix #62

Open Xw2X opened 5 years ago

Xw2X commented 5 years ago

Hello everyone,

While running a simulation, I came across an error that I cannot make sense of at all. Can anyone point me in the right direction?

The simulation is a simple square prism with water inside. The tank is set to oscillate left and right to simulate sloshing; the idea is to study how the fluid behaves while sloshing. However, I ran into the following error before the simulation could even start.

```
/*---------------------------------------------------------------------------*\
| RapidCFD by simFlow (sim-flow.com)                                          |
\*---------------------------------------------------------------------------*/
Build  : dev-f3775ac96129
Exec   : interDyMFoam
Date   : May 21 2019
Time   : 09:50:48
Host   : "1070-test"
PID    : 14552
Case   : /root/SSH/sloshing3/sloshing3/sloshing3
nProcs : 1
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster
allowSystemOperations : Allowing user-supplied system call operations
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //

Create time

Create mesh for time = 0

Selecting dynamicFvMesh dynamicMotionSolverFvMesh
Selecting motion solver: displacementLaplacian

--> FOAM FATAL IO ERROR:
unexpected class name vectorField expected vectorgpuField
    while reading object points

file: /root/SSH/sloshing3/sloshing3/sloshing3/constant/polyMesh/points at line 17.

--> FOAM Warning :
    From function regIOobject::readStream(const word&)
    in file db/regIOobject/regIOobjectRead.C at line 136.

FOAM exiting
```

Xw2X commented 5 years ago

I am also having problems running simulations in parallel (multiple GPUs): the simulation seems to end early, while the same simulation runs fine in serial.

Here is the log:

```
--> FOAM Warning :
    From function Time::operator++()
    in file db/Time/Time.C at line 1055
    Increased the timePrecision from 7 to 8 to distinguish between timeNames at time 0.2415173
MULES: Solving for alpha.Water
MULES: Solving for alpha.Water
Phase-1 volume fraction = 0.408125  Min(alpha1) = -1.459455e-06  Max(alpha1) = 1.000014
--> FOAM Warning :
    From function Time::operator++()
    in file db/Time/Time.C at line 1055
    Increased the timePrecision from 8 to 10 to distinguish between timeNames at time 0.2415173
MULES: Solving for alpha.Water
MULES: Solving for alpha.Water
Phase-1 volume fraction = 0.408125  Min(alpha1) = -1.193857e-06  Max(alpha1) = 1.000027
--> FOAM Warning :
    From function Time::operator++()
    in file db/Time/Time.C at line 1055
    Increased the timePrecision from 10 to 11 to distinguish between timeNames at time 0.2415173
[0] #0 Foam::error::printStack(Foam::Ostream&) at ??:?
[1] #0 Foam::error::printStack(Foam::Ostream&) at ??:?
[0] #1 Foam::sigFpe::sigHandler(int) at ??:?
[1] #1 Foam::sigFpe::sigHandler(int) at ??:?
[0] #2 at ??:?
[1] #2 in "/lib64/libc.so.6"
[0] #3 Foam::PBiCG::solve(Foam::gpuField<double>&, Foam::gpuField<double> const&, unsigned char) const at ??:?
[1] #3 Foam::PBiCG::solve(Foam::gpuField<double>&, Foam::gpuField<double> const&, unsigned char) const at ??:?
[0] #4 at ??:?
[1] #4 at ??:?
[0] #5 at ??:?
[1] #5 at ??:?
[0] #6 at ??:?
[1] #6 at ??:?
[0] #7 at ??:?
[1] #7 at ??:?
[0] #8 __libc_start_main in "/lib64/libc.so.6"
[1] #8 __libc_start_main in "/lib64/libc.so.6"
[0] #9 at ??:?
[1] #9 at ??:?
[1070-test:02055] *** Process received signal ***
[1070-test:02055] Signal: Floating point exception (8)
[1070-test:02055] Signal code:  (-6)
[1070-test:02055] Failing at address: 0x807
[1070-test:02055] [ 0] /lib64/libc.so.6(+0x36280)[0x7f44a339f280]
[1070-test:02055] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x7f44a339f207]
[1070-test:02055] [ 2] /lib64/libc.so.6(+0x36280)[0x7f44a339f280]
[1070-test:02055] [ 3] /root/RapidCFD/RapidCFD-dev/platforms/linux64NvccDPOpt/lib/libOpenFOAM.so(_ZNK4Foam5PBiCG5solveERNS_8gpuFieldIdEERKS2_h+0x65c)[0x7f44a49b71cc]
[1070-test:02055] [ 4] interFoam[0x508a9e]
[1070-test:02055] [ 5] interFoam[0x533e07]
[1070-test:02055] [ 6] interFoam[0x534088]
[1070-test:02055] [ 7] interFoam[0x4777bf]
[1070-test:02055] [ 8] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f44a338b3d5]
[1070-test:02055] [ 9] interFoam[0x47ac6b]
[1070-test:02055] *** End of error message ***
[1070-test:02056] *** Process received signal ***
[1070-test:02056] Signal: Floating point exception (8)
[1070-test:02056] Signal code:  (-6)
[1070-test:02056] Failing at address: 0x808
[1070-test:02056] [ 0] /lib64/libc.so.6(+0x36280)[0x7f5834b7c280]
[1070-test:02056] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x7f5834b7c207]
[1070-test:02056] [ 2] /lib64/libc.so.6(+0x36280)[0x7f5834b7c280]
[1070-test:02056] [ 3] /root/RapidCFD/RapidCFD-dev/platforms/linux64NvccDPOpt/lib/libOpenFOAM.so(_ZNK4Foam5PBiCG5solveERNS_8gpuFieldIdEERKS2_h+0x65c)[0x7f58361941cc]
[1070-test:02056] [ 4] interFoam[0x508a9e]
[1070-test:02056] [ 5] interFoam[0x533e07]
[1070-test:02056] [ 6] interFoam[0x534088]
[1070-test:02056] [ 7] interFoam[0x4777bf]
[1070-test:02056] [ 8] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f5834b683d5]
[1070-test:02056] [ 9] interFoam[0x47ac6b]
[1070-test:02056] *** End of error message ***

mpiexec noticed that process rank 1 with PID 2056 on node 1070-test exited on signal 8 (Floating point exception).
```

TonkomoLLC commented 5 years ago

To your first question: I am not sure why your sloshing tank fails. I adapted an OpenFOAM 2.3.x tutorial here. This works on my RapidCFD setup. Hopefully this also works on your system, and then you can troubleshoot to figure out why polyMesh/points triggers the error.

As I understand the error, RapidCFD was expecting data of type vectorgpuField but instead found vectorField. However, I have never had problems with RapidCFD accepting the points file in the polyMesh directory, so hopefully your polyMesh itself is not damaged.
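If it helps, the header of a healthy constant/polyMesh/points file from a stock OpenFOAM 2.3.x case looks like this (the class entry is the field the error message is complaining about):

```
FoamFile
{
    version     2.0;
    format      ascii;
    class       vectorField;
    location    "constant/polyMesh";
    object      points;
}
```

You can inspect yours with `head -n 20 constant/polyMesh/points`. I have not had to touch this entry myself, so treat any manual edit of the class name as an experiment rather than a known fix.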

Hope this helps. Good luck with your troubleshooting.

Best regards, Eric

TonkomoLLC commented 5 years ago

On the matter of running in parallel, I have seen my own RapidCFD cases too that run fine in series, but then fail in parallel. I do not know the underlying reason, but pragmatically I have fixed this in every occasion by experimenting with the domain decomposition (decomposeParDict), changing the number of processors in the x, y and z dimensions until I can run successfully.

Sometimes it can take me a while to find the right decomposeParDict settings.
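For example, for two GPUs I might start from a system/decomposeParDict like this and then vary the n split (the (2 1 1) split below is just a first guess, not a known-good setting):

```
numberOfSubdomains 2;

method          simple;

simpleCoeffs
{
    n               (2 1 1);    // subdomains in x, y, z
    delta           0.001;
}
```

Changing n to (1 2 1) or (1 1 2) changes the direction in which the domain is cut, which in my experience is the knob that matters.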

Sorry I don't have a better solution. If you solve this problem please let me know.

Best regards,

Eric

Xw2X commented 5 years ago

Thank you so much for your input and the example problems. I will fiddle around with the numbers.

Xw2X commented 5 years ago

Hi Eric, how can I run your coil-pisoFoam case for speed testing?

TonkomoLLC commented 5 years ago

Hi, I did not run the speed tests on coil-pisoFoam. cfd-online member chengtun did this here.

Not knowing his specific procedure, here is what I think he did:

  1. He ran the case that he "donated" to the RapidCFD test repository with one GPU.
  2. He ran the same case, presumably on OpenFOAM 2.3.x, with one CPU.

For both RapidCFD and OpenFOAM 2.3.x, the solver is pisoFoam.

That is, after `source /opt/RapidCFD-dev/etc/bashrc` (assuming this is where your RapidCFD is installed), go to the case directory and run pisoFoam.

To run on a CPU, copy the coil-pisoFoam case, `source /opt/OpenFOAM-2.3.x/etc/bashrc`, and run pisoFoam from the new case directory; a rough timing sketch follows below.
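Putting those steps together, a minimal benchmarking sketch (assuming the install paths above and that the coil-pisoFoam case sits in the current directory; run each half in a fresh shell, since sourcing two OpenFOAM bashrc files in one session can clash):

```bash
# Shell 1: GPU run with RapidCFD (the startup banner confirms the RapidCFD build)
source /opt/RapidCFD-dev/etc/bashrc
cd coil-pisoFoam
time pisoFoam > log.pisoFoam.gpu 2>&1

# Shell 2: CPU run with stock OpenFOAM 2.3.x on a copy of the same case
cp -r coil-pisoFoam coil-pisoFoam-cpu
source /opt/OpenFOAM-2.3.x/etc/bashrc
cd coil-pisoFoam-cpu
time pisoFoam > log.pisoFoam.cpu 2>&1
```

The elapsed times from the two runs are the numbers chengtun compared in his plot.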

Chengtun saved the elapsed time for the GPU and CPU cases and plotted the results at the URL referenced above.

I hope this gets you going with your benchmark. Sharing your results (and pertinent information: CPU type, GPU type, and any other important facts) is helpful for the community.

Best regards,

Eric

Xw2X commented 5 years ago

Thank you for all your help, Eric.

Can you help me identify which file specifically tells RapidCFD to solve using the GPU instead of the CPU?

I tried several cases from the OpenFOAM tutorials, but they all seem to run on the CPU. The only case where I was able to use the GPU was the 3D sloshing tank I got from you.

TonkomoLLC commented 5 years ago

Hello,

After you source /opt/RapidCFD-dev/etc/bashrc (adjust according to your install location), the RapidCFD solvers will be used (and thus your calculations will use the GPU).

You can confirm that RapidCFD is being used by typing which icoFoam. If the path includes your RapidCFD install directory, then you're using RapidCFD.

If you're using RapidCFD solvers with your case, then there's nothing else you need to do to ensure the GPU is utilized.
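One quick sanity check (assuming the NVIDIA driver utilities are installed) is to watch the GPU while the solver runs:

```bash
# Confirm the solver binary comes from the RapidCFD tree
which icoFoam

# Refresh GPU utilization and memory once per second while the case runs
nvidia-smi -l 1
```

If the solver is a RapidCFD build, you should see its memory usage appear on the GPU shortly after start-up.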

Hope this advice moves you forward.

Best regards,

Eric

Xw2X commented 5 years ago

Hi Eric,

I did not compile RapidCFD on the machine I am currently working on. However, I noticed that icoFoam is located in a standalone RapidCFD solver folder rather than inside the RapidCFD-dev folder. That might be the reason why some of the simulations are not running on the GPU. Any idea what I can do to utilize the GPU when using the RapidCFD solvers?

TonkomoLLC commented 5 years ago

Hi,

I am not sure how to help.

On my system, after I type:

```
source /opt/RapidCFD-dev/etc/bashrc
```

I can check:

```
which icoFoam
```

and the system reports:

```
/opt/RapidCFD-dev/platforms/linux64NvccDPOpt/bin/icoFoam
```

Then, if everything is compiled correctly for your machine and you run icoFoam, you'll see the "RapidCFD by simFlow (sim-flow.com)" banner.

If you see that RapidCFD banner when executing icoFoam, then you're using a solver that is compiled for use with your GPU. No other special steps are needed to utilize the GPU when running a RapidCFD solver.

Best regards,

Eric