justAPhDStudent commented 1 year ago

Hello,

I was wondering how to do this rendering mode you mentioned in the code description: "if no monitor is available (like on a remote Linux server), there is an ASCII rendering mode to interactively visualize the simulation in the terminal (even in WSL and/or through SSH)"

I am on Microsoft Azure (a remote Linux machine). Do you have any documentation for this scenario?

Kind regards, Tom

ProjectPhysX commented 1 year ago

Hi @justAPhDStudent,

to enable the ASCII rendering mode, uncomment INTERACTIVE_GRAPHICS_ASCII in defines.hpp. On simulation startup it takes a few seconds to initialize the color table. Keyboard controls work just like with the regular interactive graphics option, but mouse controls don't work.

Kind regards, Moritz

justAPhDStudent commented 1 year ago

Hello,

Since I work with Azure, I think I will post all related runtime errors here. When running the "aerodynamics of a cow" setup with the necessary defines activated, I get a segmentation fault.

|-----------------------------------------------------------------------------|
| Info: Unit Conversion: 1 cell = 8.708 mm, 1 s = 1148 time steps |
| Info: Re = 162162 |
|----------------.------------------------------------------------------------|
| Device ID 0 | Tesla T4 |
|----------------'------------------------------------------------------------|
|----------------.------------------------------------------------------------|
| Device ID | 0 |
| Device Name | Tesla T4 |
| Device Vendor | NVIDIA Corporation |
| Device Driver | 530.30.02 |
| OpenCL Version | OpenCL C 1.2 |
| Compute Units | 40 at 1590 MHz (2560 cores, 8.141 TFLOPs/s) |
| Memory, Cache | 15984 MB, 1280 KB global / 48 KB local |
| Buffer Limits | 3996 MB global, 64 KB constant |
|----------------'------------------------------------------------------------|
| Info: OpenCL C code successfully compiled. |
| Info: Allocating memory. This may take a few seconds. |
| Info: Loading "/shared/home/azureuser/lbm/FluidX3D/bin/../stl/Cow_t.stl" |
| with 99994 triangles. |
|-----------------.-----------------------------------------------------------|
| Grid Resolution | 212 x 424 x 212 = 19056256 |
| Grid Domains | 1 x 1 x 1 = 1 |
| LBM Type | D3Q19 SRT (FP32/FP16S) |
| Memory Usage | CPU 308 MB, GPU 1x 1012 MB |
| Max Alloc Size | 690 MB |
| Time Steps | 0 |
| Kin. Viscosity | 0.00016995 |
| Relaxation Time | 0.50050984 |
| Reynolds Number | Re < 720188 |
|---------.-------'-----.-----------.-------------------.---------------------|
| MLUPs | Bandwidth | Steps/s | Current Step | Time Remaining |
./make.sh: line 12: 8639 Segmentation fault (core dumped) ./bin/FluidX3D "$@"

Do you have an idea why this happens?

Possible reason: Azure VMs don't come with all the packages normally found on Ubuntu. I am on Ubuntu 18.04, and I suspect a missing package is triggering this issue.

Can you advise which prerequisite packages a Linux machine needs before running setups in ENABLE_GRAPHICS mode?

Kind regards, Tom

ProjectPhysX commented 1 year ago

Hi @justAPhDStudent,

this is odd and should not happen. The application compiles and starts yet segfaults during initialization when the first OpenCL kernels are queued. To see if this is a driver issue or an application issue, can you test the basic benchmark case? This is FluidX3D with all extensions disabled. If this segfaults too, something is wrong with OpenCL drivers. On a similar note, Google Colab for some reason wiped OpenCL from their default driver installation, and you have to purge and reinstall Nvidia drivers to get it running.

Kind regards, Moritz

justAPhDStudent commented 1 year ago

Hello Moritz,

Thanks for your feedback. The benchmark test ran successfully, without any segmentation fault. The segfault seems to occur only for setups that load an STL file.

Kind regards, Tom

ProjectPhysX commented 1 year ago

Can you share the contents of your modified defines.hpp and, if modified, the main_setup() function? I'll try to reproduce the bug.

justAPhDStudent commented 1 year ago

Hello,

For reference, the benchmark case output >>

| Device ID | 0 |
| Device Name | Tesla T4 |
| Device Vendor | NVIDIA Corporation |
| Device Driver | 530.30.02 |
| OpenCL Version | OpenCL C 1.2 |
| Compute Units | 40 at 1590 MHz (2560 cores, 8.141 TFLOPs/s) |
| Memory, Cache | 15984 MB, 1280 KB global / 48 KB local |
| Buffer Limits | 3996 MB global, 64 KB constant |
|----------------'------------------------------------------------------------|
| Info: OpenCL C code successfully compiled. |
| Info: Allocating memory. This may take a few seconds. |
|-----------------.-----------------------------------------------------------|
| Grid Resolution | 256 x 256 x 256 = 16777216 |
| Grid Domains | 1 x 1 x 1 = 1 |
| LBM Type | D3Q19 SRT (FP32/FP32) |
| Memory Usage | CPU 272 MB, GPU 1x 1488 MB |
| Max Alloc Size | 1216 MB |
| Time Steps | 10 |
| Kin. Viscosity | 1.00000000 |
| Relaxation Time | 3.50000000 |
| Reynolds Number | Re < 148 |
|---------.-------'-----.-----------.-------------------.---------------------|
| MLUPs | Bandwidth | Steps/s | Current Step | Time Remaining |
| 1806 | 276 GB/s | 108 | 9998 80% | 0s |
|---------'-------------'-----------'-------------------'---------------------|
| Info: Peak MLUPs/s = 1807

defines.hpp diff >>

azureuser@ip-0A010004:~/lbm/FluidX3D$ git diff src/defines.hpp
diff --git a/src/defines.hpp b/src/defines.hpp
index 962d782..b8c266c 100644
--- a/src/defines.hpp
+++ b/src/defines.hpp
@@ -10,23 +10,23 @@
 #define SRT // choose single-relaxation-time LBM collision operator; (default)
 //#define TRT // choose two-relaxation-time LBM collision operator

-//#define FP16S // compress LBM DDFs to range-shifted IEEE-754 FP16; number conversion is done in hardware; all arithmetic is still done in FP32
+#define FP16S // compress LBM DDFs to range-shifted IEEE-754 FP16; number conversion is done in hardware; all arithmetic is still done in FP32
 //#define FP16C // compress LBM DDFs to more accurate custom FP16C format; number conversion is emulated in software; all arithmetic is still done in FP32

-#define BENCHMARK // disable all extensions and setups and run benchmark setup instead
+//#define BENCHMARK // disable all extensions and setups and run benchmark setup instead

 //#define VOLUME_FORCE // enables global force per volume in one direction (equivalent to a pressure gradient); specified in the LBM class constructor; the force can be changed on-the-fly between time steps at no performance cost
 //#define FORCE_FIELD // enables computing the forces on solid boundaries with lbm.calculate_force_on_boundaries(); and enables setting the force for each lattice point independently (enable VOLUME_FORCE too); allocates an extra 12 Bytes/cell
-//#define EQUILIBRIUM_BOUNDARIES // enables fixing the velocity/density by marking cells with TYPE_E; can be used for inflow/outflow; does not reflect shock waves
+#define EQUILIBRIUM_BOUNDARIES // enables fixing the velocity/density by marking cells with TYPE_E; can be used for inflow/outflow; does not reflect shock waves
 //#define MOVING_BOUNDARIES // enables moving solids: set solid cells to TYPE_S and set their velocity u unequal to zero
 //#define SURFACE // enables free surface LBM: mark fluid cells with TYPE_F; at initialization the TYPE_I interface and TYPE_G gas domains will automatically be completed; allocates an extra 12 Bytes/cell
 //#define TEMPERATURE // enables temperature extension; set fixed-temperature cells with TYPE_T (similar to EQUILIBRIUM_BOUNDARIES); allocates an extra 32 (FP32) or 18 (FP16) Bytes/cell
-//#define SUBGRID // enables Smagorinsky-Lilly subgrid turbulence LES model to keep simulations with very large Reynolds number stable
+#define SUBGRID // enables Smagorinsky-Lilly subgrid turbulence LES model to keep simulations with very large Reynolds number stable
 //#define PARTICLES // enables particles with immersed-boundary method (for 2-way coupling also activate VOLUME_FORCE and FORCE_FIELD; only supported in single-GPU)

 //#define INTERACTIVE_GRAPHICS // enable interactive graphics; start/pause the simulation by pressing P; either Windows or Linux X11 desktop must be available; on Linux: change to "compile on Linux with X11" command in make.sh
 //#define INTERACTIVE_GRAPHICS_ASCII // enable interactive graphics in ASCII mode the console; start/pause the simulation by pressing P
-//#define GRAPHICS // run FluidX3D in the console, but still enable graphics functionality for writing rendered frames to the hard drive
+#define GRAPHICS // run FluidX3D in the console, but still enable graphics functionality for writing rendered frames to the hard drive

 #define GRAPHICS_FRAME_WIDTH 1920 // set frame width if only GRAPHICS is enabled
 #define GRAPHICS_FRAME_HEIGHT 1080 // set frame height if only GRAPHICS is enabled
@@ -98,4 +98,4 @@
 #if defined(INTERACTIVE_GRAPHICS) || defined(INTERACTIVE_GRAPHICS_ASCII)
 #define GRAPHICS
 #define UPDATE_FIELDS // to prevent flickering artifacts in interactive graphics
-#endif // INTERACTIVE_GRAPHICS || INTERACTIVE_GRAPHICS_ASCII
\ No newline at end of file
+#endif // INTERACTIVE_GRAPHICS || INTERACTIVE_GRAPHICS_ASCII

setup.cpp diff >>

azureuser@ip-0A010004:~/lbm/FluidX3D$ git diff src/setup.cpp
diff --git a/src/setup.cpp b/src/setup.cpp
index 5edc9de..11e7662 100644
--- a/src/setup.cpp
+++ b/src/setup.cpp
@@ -4,7 +4,7 @@

 #ifdef BENCHMARK
 #include "info.hpp"
-void main_setup() { // benchmark; required extensions in defines.hpp: BENCHMARK, optionally FP16S or FP16C
+/*void main_setup() { // benchmark; required extensions in defines.hpp: BENCHMARK, optionally FP16S or FP16C
 // ################################################################## define simulation box size, viscosity and volume force ###################################################################
 uint mlups = 0u; {

@@ -570,7 +570,7 @@ void main_setup() { // benchmark; required extensions in defines.hpp: BENCHMARK,

-/*void main_setup() { // aerodynamics of a cow; required extensions in defines.hpp: FP16S, EQUILIBRIUM_BOUNDARIES, SUBGRID, INTERACTIVE_GRAPHICS or GRAPHICS
+void main_setup() { // aerodynamics of a cow; required extensions in defines.hpp: FP16S, EQUILIBRIUM_BOUNDARIES, SUBGRID, INTERACTIVE_GRAPHICS or GRAPHICS
 // ################################################################## define simulation box size, viscosity and volume force ###################################################################
 const uint3 lbm_N = resolution(float3(1.0f, 2.0f, 1.0f), 1000u); // input: simulation box aspect ratio and VRAM occupation in MB, output: grid resolution
 const float si_u = 1.0f;
@@ -1291,4 +1291,4 @@ void main_setup() { // benchmark; required extensions in defines.hpp: BENCHMARK,
 lbm.graphics.visualization_modes = VIS_FLAG_LATTICE|VIS_STREAMLINES;
 lbm.run();
 //lbm.run(1000u); lbm.u.read_from_device(); println(lbm.u.x[lbm.index(Nx/2u, Ny/2u, Nz/2u)]); wait(); // test for binary identity
-} //
\ No newline at end of file
+} //

I hope that helps.

Kind regards, Tom

justAPhDStudent commented 1 year ago

Hello,

I was on g++-8 (as per your repo documentation). I tried g++-11 and the issue is gone. I didn't try g++-9 or g++-10 and went straight to g++-11.

The code now runs>>

| Info: OpenCL C code successfully compiled. |
| Info: Allocating memory. This may take a few seconds. |
| Info: Loading "/shared/home/azureuser/lbm/FluidX3D/bin/../stl/Cow_t.stl" |
| with 99994 triangles. |
|-----------------.-----------------------------------------------------------|
| Grid Resolution | 212 x 424 x 212 = 19056256 |
| Grid Domains | 1 x 1 x 1 = 1 |
| LBM Type | D3Q19 SRT (FP32/FP16S) |
| Memory Usage | CPU 308 MB, GPU 1x 1012 MB |
| Max Alloc Size | 690 MB |
| Time Steps | 0 |
| Kin. Viscosity | 0.00016995 |
| Relaxation Time | 0.50050984 |
| Reynolds Number | Re < 720188 |
|---------.-------'-----.-----------.-------------------.---------------------|
| MLUPs | Bandwidth | Steps/s | Current Step | Time Remaining |
| 3756 | 289 GB/s | 197 | 11481 0% | 0s |
|---------'-------------'-----------'-------------------'---------------------|
| Info: Finishing encoder threads: 4 |
| Info: Finishing encoder threads: 3 |
| Info: Finishing encoder threads: 2 |
| Info: Finishing encoder threads: 1 |
| Info: Finishing encoder threads: 0

If you want, I can test g++-9 and g++-10 and report the results here. For the moment, this issue is resolved for me.

Thanks for your responses.

Have a great day!

Kind regards, Tom

P.S. (after testing on g++-9): For anyone interested in the g++ version, the setup from the reported issue also runs when compiled with g++-9. Just an FYI!

ProjectPhysX commented 1 year ago

Nice! Strange that it compiled at all with g++-8. It should complain about <filesystem> missing.

ProjectPhysX / FluidX3D

Application segfaults with g++-8 #115
