ProjectPhysX / FluidX3D

The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs via OpenCL. Free for non-commercial use.
https://youtube.com/@ProjectPhysX

GPU does not have enough memory. How to change gpu memory to use? #46

Closed kayrailia closed 1 year ago

kayrailia commented 1 year ago

Hi everyone, I'm a high school student and I want to improve myself in CFD. I recently saw FluidX3D and it looks really cool. But when I compile it with WINDOWS_GRAPHICS, I get a memory usage error, and I have no idea how to change the amount of memory the simulation uses. I'm really a beginner, so please take it easy.

The CFD window is just black, and the console just gives this error:

Error: Device "NVIDIA GeForce GTX 1050 Ti" does not have enough memory.
Allocating another 8680 MB would use a total of 11573 MB / 4095 MB.
Press Enter to exit.

Changes I made to the source:

defines.hpp:
commented out #define BENCHMARK, uncommented #define WINDOWS_GRAPHICS

setup.cpp:
uncommented the Boeing 757 setup

This is what my setup.cpp file looks like:

void main_setup() { // Boeing 757
    // ######################################################### define simulation box size, viscosity and volume force ############################################################################
    const uint L = 912u;
    const float Re = 100000.0f;
    const float u = 0.125f;
    LBM lbm(L, 2u*L, L/2u, units.nu_from_Re(Re, (float)L, u));
    // #############################################################################################################################################################################################
    const float size = 1.1f*(float)L;
    const float3 center = float3(lbm.center().x, 32.0f+0.5f*size, lbm.center().z);
    const float3x3 rotation = float3x3(float3(1, 0, 0), radians(75.0f));
    lbm.voxelize_stl(get_exe_path()+"../stl/757.stl", center, rotation, size); // https://www.thingiverse.com/thing:5091064/files
    const uint N=lbm.get_N(), Nx=lbm.get_Nx(), Ny=lbm.get_Ny(), Nz=lbm.get_Nz(); for(uint n=0u, x=0u, y=0u, z=0u; n<N; n++, lbm.coordinates(n, x, y, z)) {
        // ########################################################################### define geometry #############################################################################################
        if(lbm.flags[n]!=TYPE_S) lbm.u.y[n] = u;
        if(x==0u||x==Nx-1u||y==0u||y==Ny-1u||z==0u||z==Nz-1u) lbm.flags[n] = TYPE_E; // all non periodic
    }   // #########################################################################################################################################################################################
    key_4 = true;
    Clock clock;
    lbm.run(0u);
    while(lbm.get_t()<100000u) {
        lbm.graphics.set_camera_free(float3(1.0f*(float)Nx, -0.4f*(float)Ny, 2.0f*(float)Nz), -33.0f, 42.0f, 68.0f);
        lbm.graphics.write_frame_png(get_exe_path()+"export/t/");
        lbm.graphics.set_camera_free(float3(0.5f*(float)Nx, -0.35f*(float)Ny, -0.7f*(float)Nz), -35.0f, -35.0f, 100.0f);
        lbm.graphics.write_frame_png(get_exe_path()+"export/b/");
        lbm.graphics.set_camera_free(float3(0.0f*(float)Nx, 0.51f*(float)Ny, 0.75f*(float)Nz), 90.0f, 28.0f, 80.0f);
        lbm.graphics.write_frame_png(get_exe_path()+"export/f/");
        lbm.graphics.set_camera_free(float3(0.6f*(float)Nx, -0.15f*(float)Ny, 0.06f*(float)Nz), 0.0f, 0.0f, 100.0f);
        lbm.graphics.write_frame_png(get_exe_path()+"export/s/");
        lbm.run(28u);
    }
    write_file(get_exe_path()+"time.txt", print_time(clock.stop()));
    lbm.run();
}

ProjectPhysX commented 1 year ago

Hi @kayrailia, you need to reduce the grid resolution. In the main_setup() function, the lbm object is created with the constructor LBM lbm(resolution_x, resolution_y, resolution_z, kinematic_shear_viscosity);, which takes the resolution as input. In this setup, the resolution is set via the const uint L = 912u; parameter. Reduce that to, say, const uint L = 256u;.

Note that in this particular setup, you have to enable the #define SUBGRID extension or it will not run stably. Also note that enabling #define FP16S almost halves the memory footprint, so you can set a ~20% larger lateral grid resolution.

Have fun with the software!
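As a rough check of the numbers in this reply, the sketch below estimates the memory footprint of this setup's L x 2L x L/2 box from the bytes-per-node figures quoted in the defines.hpp comments posted later in this thread for D3Q19 (93 for FP32, 55 for FP16). The function names are made up for illustration and are not part of FluidX3D. The estimate ignores graphics buffers and whatever the operating system already holds on the GPU, so treat the result as an upper bound on L rather than a guarantee.

```cpp
#include <cstdint>

// 1 MB is taken as 1024*1024 bytes here (an assumption about how the error message counts).
constexpr std::uint64_t kBytesPerMB = 1024ull * 1024ull;

// Number of lattice nodes for the box used in this setup: L x 2L x L/2.
std::uint64_t node_count(std::uint64_t L) {
    return L * (2u * L) * (L / 2u);
}

// Estimated footprint in whole MB (rounded down) for a given bytes-per-node figure,
// e.g. 93 (FP32) or 55 (FP16) for D3Q19 according to the defines.hpp comments.
std::uint64_t estimate_mb(std::uint64_t L, std::uint64_t bytes_per_node) {
    return node_count(L) * bytes_per_node / kBytesPerMB;
}

// Largest even L whose estimated footprint still fits into budget_mb.
// Hypothetical helper; it only scans even values up to 4096.
std::uint64_t largest_fitting_L(std::uint64_t budget_mb, std::uint64_t bytes_per_node) {
    std::uint64_t best = 0;
    for (std::uint64_t L = 2; L <= 4096; L += 2) {
        if (node_count(L) * bytes_per_node > budget_mb * kBytesPerMB) break;
        best = L;
    }
    return best;
}
```

Under these assumptions, estimate_mb(256, 55) gives 880 MB and estimate_mb(256, 93) gives 1488 MB, both well below the 4095 MB of the GTX 1050 Ti, while L = 912 would need roughly 67000 MB in FP32. largest_fitting_L(4095, 55) returns 426 and largest_fitting_L(4095, 93) returns 358, a ratio of about 1.19, which is consistent with the ~20% figure above. As a side observation, estimate_mb(912, 12) returns 8680, matching the "another 8680 MB" in the error message if one assumes that allocation was a velocity field of three 32-bit floats per node; that interpretation is a guess, not something stated in this thread.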

kayrailia commented 1 year ago

That solved my problem! But now I have another problem. I know it is a really dumb question, but since I don't really know C++, I have to ask.

I get these 2 errors at the same time, even though I have a C:/fx/stl/StarshipV1.stl file (FluidX3D.exe is at C:\fx). I tried 4 different STL files; 3 of them were the ones you use in your videos and in setup.cpp.

Error: File "C:/fx/../stl/StarshipV1.stl" does not exist!
Error: File "C:/fx/../stl/StarshipV1.stl" is corrupt or unsupported! Only binary .stl files are supported.

But the CFD window still opens, and the only thing I see is the external volume, I guess (a green rectangle), and some parameters like fps.

setup.cpp file:

void main_setup() { // Star Wars X-wing
    // ######################################################### define simulation box size, viscosity and volume force ############################################################################
    const uint L = 256u;
    const float Re = 100000.0f;
    const float u = 0.125f;
    LBM lbm(L, L*2u, L/2u, units.nu_from_Re(Re, (float)L, u));
    // #############################################################################################################################################################################################
    const float size = 1.0f*(float)L;
    const float3 center = float3(lbm.center().x, 32.0f+0.5f*size, lbm.center().z);
    const float3x3 rotation = float3x3(float3(0, 0, 1), radians(180.0f));
    voxelize_stl_hull(lbm, get_exe_path()+"../stl/StarshipV1.stl", center, rotation, size); // https://www.thingiverse.com/thing:353276/files
    const uint N=lbm.get_N(), Nx=lbm.get_Nx(), Ny=lbm.get_Ny(), Nz=lbm.get_Nz(); for(uint n=0u, x=0u, y=0u, z=0u; n<N; n++, lbm.coordinates(n, x, y, z)) {
        // ########################################################################### define geometry #############################################################################################
        if(lbm.flags[n]!=TYPE_S) lbm.u.y[n] = u;
        if(x==0u||x==Nx-1u||y==0u||y==Ny-1u||z==0u||z==Nz-1u) lbm.flags[n] = TYPE_E; // all non periodic
    }   // #########################################################################################################################################################################################
    key_4 = true;
    Clock clock;
    lbm.run(0u);
    while(lbm.get_t()<50000u) {
        lbm.graphics.set_camera_free(float3(1.0f*(float)Nx, -0.4f*(float)Ny, 2.0f*(float)Nz), -33.0f, 42.0f, 68.0f);
        lbm.graphics.write_frame_png(get_exe_path()+"export/t/");
        lbm.graphics.set_camera_free(float3(0.5f*(float)Nx, -0.35f*(float)Ny, -0.7f*(float)Nz), -33.0f, -40.0f, 100.0f);
        lbm.graphics.write_frame_png(get_exe_path()+"export/b/");
        lbm.graphics.set_camera_free(float3(0.0f*(float)Nx, 0.51f*(float)Ny, 0.75f*(float)Nz), 90.0f, 28.0f, 80.0f);
        lbm.graphics.write_frame_png(get_exe_path()+"export/f/");
        lbm.graphics.set_camera_free(float3(0.7f*(float)Nx, -0.15f*(float)Ny, 0.06f*(float)Nz), 0.0f, 0.0f, 100.0f);
        lbm.graphics.write_frame_png(get_exe_path()+"export/s/");
        lbm.run(28u);
    }
    write_file(get_exe_path()+"time.txt", print_time(clock.stop()));
    lbm.run();
} /**/

defines.hpp

#pragma once

//#define D2Q9 // choose D2Q9 velocity set for 2D; allocates 53 (FP32) or 35 (FP16) Bytes/node
//#define D3Q15 // choose D3Q15 velocity set for 3D; allocates 77 (FP32) or 47 (FP16) Bytes/node
#define D3Q19 // choose D3Q19 velocity set for 3D; allocates 93 (FP32) or 55 (FP16) Bytes/node; (default)
//#define D3Q27 // choose D3Q27 velocity set for 3D; allocates 125 (FP32) or 71 (FP16) Bytes/node

#define SRT // choose single-relaxation-time LBM collision operator; (default)
//#define TRT // choose two-relaxation-time LBM collision operator

#define FP16S // compress LBM DDFs to range-shifted IEEE-754 FP16; number conversion is done in hardware; all arithmetic is still done in FP32
//#define FP16C // compress LBM DDFs to more accurate custom FP16C format; number conversion is emulated in software; all arithmetic is still done in FP32

//#define BENCHMARK // disable all extensions and setups and run benchmark setup instead

//#define VOLUME_FORCE // enables global force per volume in one direction, specified in the LBM class constructor; the force can be changed on-the-fly between time steps at no performance cost
//#define FORCE_FIELD // enables computing the forces on solid boundaries with lbm.calculate_force_on_boundaries(); and enables setting the force for each lattice point independently (enable VOLUME_FORCE too); allocates an extra 12 Bytes/node
//#define MOVING_BOUNDARIES // enables moving solids: set solid nodes to TYPE_S and set their velocity u unequal to zero
#define EQUILIBRIUM_BOUNDARIES // enables fixing the velocity/density by marking nodes with TYPE_E; can be used for inflow/outflow; does not reflect shock waves
//#define SURFACE // enables free surface LBM: mark fluid nodes with TYPE_F; at initialization the TYPE_I interface and TYPE_G gas domains will automatically be completed; allocates an extra 12 Bytes/node
//#define TEMPERATURE // enables temperature extension; set fixed-temperature nodes with TYPE_T (similar to EQUILIBRIUM_BOUNDARIES); allocates an extra 32 (FP32) or 18 (FP16) Bytes/node
//#define SUBGRID // enables Smagorinsky-Lilly subgrid turbulence model to keep simulations with very large Reynolds number stable

#define WINDOWS_GRAPHICS // enable interactive graphics in Windows; start/pause the simulation by pressing P
//#define CONSOLE_GRAPHICS // enable interactive graphics in the console; start/pause the simulation by pressing P
//#define GRAPHICS // run FluidX3D in the console, but still enable graphics functionality for writing rendered frames to the hard drive

#define GRAPHICS_FRAME_WIDTH 3840 // set frame width if only GRAPHICS is enabled
#define GRAPHICS_FRAME_HEIGHT 2160 // set frame height if only GRAPHICS is enabled
#define GRAPHICS_BACKGROUND_COLOR 0x000000 // set background color; black background (default) = 0x000000, white background = 0xFFFFFF
#define GRAPHICS_U_MAX 0.15f // maximum velocity for velocity coloring in units of LBM lattice speed of sound (c=1/sqrt(3)) (default: 0.15f)
#define GRAPHICS_Q_CRITERION 0.0001f // Q-criterion value for Q-criterion isosurface visualization (default: 0.0001f)
#define GRAPHICS_BOUNDARY_FORCE_SCALE 100.0f // scaling factor for visualization of forces on solid boundaries if VOLUME_FORCE is enabled and lbm.calculate_force_on_boundaries(); is called (default: 100.0f)
#define GRAPHICS_STREAMLINE_SPARSE 4 // set how many streamlines there are every x lattice points
#define GRAPHICS_STREAMLINE_LENGTH 128 // set maximum length of streamlines

// #############################################################################################################

#define TYPE_S 0b00000001 // (stationary or moving) solid boundary
#define TYPE_E 0b00000010 // equilibrium boundary (inflow/outflow)
#define TYPE_T 0b00000100 // temperature boundary
#define TYPE_F 0b00001000 // fluid
#define TYPE_I 0b00010000 // interface
#define TYPE_G 0b00100000 // gas
#define TYPE_X 0b01000000 // reserved type X
#define TYPE_Y 0b10000000 // reserved type Y

#if defined(FP16S) || defined(FP16C)
#define fpxx ushort
#else // FP32
#define fpxx float
#endif // FP32
#define SUBGRID
#ifdef BENCHMARK
#undef UPDATE_FIELDS
#undef VOLUME_FORCE
#undef FORCE_FIELD
#undef MOVING_BOUNDARIES
#undef EQUILIBRIUM_BOUNDARIES
#undef SURFACE
#undef TEMPERATURE

#undef WINDOWS_GRAPHICS
#undef CONSOLE_GRAPHICS
#undef GRAPHICS
#endif // BENCHMARK

#ifdef SURFACE // (rho, u) need to be updated exactly every LBM step
#define UPDATE_FIELDS // update (rho, u, T) in every LBM step
#endif // SURFACE

#ifdef TEMPERATURE
#define VOLUME_FORCE
#endif // TEMPERATURE

#ifdef WINDOWS_GRAPHICS
#define GRAPHICS
#endif // WINDOWS_GRAPHICS
#ifdef CONSOLE_GRAPHICS
#define GRAPHICS
#endif // CONSOLE_GRAPHICS

ProjectPhysX commented 1 year ago

The STL file containing the model geometry is not included in the repository, and is missing here. Download the file from https://www.thingiverse.com/thing:353276/files and place it in a newly created stl folder, as stl/StarshipV1.stl.
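The second error message quoted earlier says that only binary .stl files are supported. For anyone who wants to rule that cause out, here is a minimal sketch of a plausibility check based on the binary STL layout (80-byte header, a 32-bit little-endian triangle count, then 50 bytes per triangle). The function name is hypothetical and this is not FluidX3D's own check; it only compares the size implied by the header with the actual size.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true if the buffer has the size a binary STL file with the
// triangle count stored at byte offset 80 would have. ASCII STL files
// (which start with "solid") will normally fail this test.
bool looks_like_binary_stl(const std::vector<unsigned char>& bytes) {
    constexpr std::uint64_t header_size = 80;   // free-form header
    constexpr std::uint64_t count_size = 4;     // uint32 triangle count, little-endian
    constexpr std::uint64_t triangle_size = 50; // 12 floats + 2-byte attribute
    if (bytes.size() < header_size + count_size) return false;
    std::uint32_t triangles = 0;
    for (std::size_t i = 0; i < count_size; i++) {
        // assemble the little-endian count byte by byte
        triangles |= static_cast<std::uint32_t>(bytes[header_size + i]) << (8u * i);
    }
    const std::uint64_t expected = header_size + count_size + triangle_size * triangles;
    return static_cast<std::uint64_t>(bytes.size()) == expected;
}
```

To use it, read the whole file into a std::vector<unsigned char> first (for example with std::ifstream in binary mode) and pass that buffer in.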

kayrailia commented 1 year ago

Thanks, but as I said, I already have the STL file downloaded and in the /stl/ folder. The results are the same.

This is what it looks like:

[image]

[image]

ProjectPhysX commented 1 year ago

The stl folder has to be in the FluidX3D folder, next to the src folder!
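To see why the file at C:/fx/stl/StarshipV1.stl was not found, it helps to resolve the path from the error message. The sketch below uses C++17 std::filesystem to collapse the ".." segment purely lexically, without touching the disk. The helper names are made up for illustration, and the second example path (an executable inside a bin folder next to src) is an assumption about the usual folder layout, not something confirmed in this thread.

```cpp
#include <filesystem>
#include <string>

// Collapse "." and ".." segments lexically and return the path with forward slashes.
std::string normalize(const std::string& path) {
    return std::filesystem::path(path).lexically_normal().generic_string();
}

// Mimic how the setup builds the STL path: executable directory + "../stl/" + file name.
// exe_dir is expected to end with a slash, as in the error message ("C:/fx/").
std::string stl_path(const std::string& exe_dir, const std::string& file) {
    return normalize(exe_dir + "../stl/" + file);
}
```

With the executable in C:/fx/, stl_path("C:/fx/", "StarshipV1.stl") yields "C:/stl/StarshipV1.stl", so the program looks one level above C:/fx and never sees C:/fx/stl. With the executable in a subfolder such as the hypothetical C:/FluidX3D/bin/, the same call resolves to C:/FluidX3D/stl/StarshipV1.stl, which is the "next to the src folder" location described above.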