RobertBeckebans / RBDOOM-3-BFG

Doom 3 BFG Edition source port with updated DX12 / Vulkan renderer and modern game engine features
https://www.moddb.com/mods/rbdoom-3-bfg
GNU General Public License v3.0

Baking light grid process receiving SIGKILL #558

Open BielBdeLuna opened 3 years ago

BielBdeLuna commented 3 years ago

While baking the mars_city map I get a SIGKILL without any further trace.

It happens in area 2 of 56 (area 1 in the log below), which has a huge number of grid points: 15895 (too close to the limit, maybe?)

]bakeLightGrids 
writing to: /home/biel/.local/share/rbdoom3bfg/base/consoleHistory.txt
Using limit = 16384
Using bounces = 1
Preferred lightGridSize (64 64 128)

area 0 of 56 (21 x 18 x 5) = 1890 grid points 
area 0 grid size (64 64 128)
area 0 grid bounds (21 18 5)
area 0: 0 of 1890 grid points in empty space (0.00%)
Shooting 1890 grid probes in area 0...
0%  10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
captured light grid radiance for area 0 in  17.1 seconds

Processing probes on all available cores... Please wait.
writing to: /mnt/nvme1n1p1/games/doom/doom3/doom3bfg/base/env/maps/game/mars_city1/area0_lightgrid_amb.exr
computed light grid irradiance for area 0 in  10.4 seconds

area 1 of 56 (17 x 17 x 55) = 15895 grid points 
area 1 grid size (160 144 208)
area 1 grid bounds (17 17 55)
area 1: 1011 of 15895 grid points in empty space (6.36%)
Shooting 14884 grid probes in area 1...
0%  10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
[Thread 0x7fffa57fa640 (LWP 12586) exited]
[Thread 0x7fff87334640 (LWP 12585) exited]
[Thread 0x7fffa7dfe640 (LWP 12584) exited]
[Thread 0x7fffbc9b8640 (LWP 12582) exited]
[Thread 0x7fffe3f38640 (LWP 12577) exited]
[Thread 0x7fffe4739640 (LWP 12576) exited]
[Thread 0x7fffe4f3a640 (LWP 12575) exited]
[Thread 0x7fffe573b640 (LWP 12574) exited]
[Thread 0x7fffe5f3c640 (LWP 12573) exited]
[Thread 0x7fffe673d640 (LWP 12572) exited]
[Thread 0x7fffe6f3e640 (LWP 12571) exited]
[Thread 0x7fffe773f640 (LWP 12570) exited]
[Thread 0x7fffe7f40640 (LWP 12569) exited]
[Thread 0x7fffe8741640 (LWP 12568) exited]
[Thread 0x7fffe8f42640 (LWP 12567) exited]
[Thread 0x7fffe9743640 (LWP 12566) exited]
[Thread 0x7fffe9f44640 (LWP 12565) exited]
[Thread 0x7fffea745640 (LWP 12564) exited]
[Thread 0x7fffeaf46640 (LWP 12563) exited]
[Thread 0x7fffeb747640 (LWP 12562) exited]
[Thread 0x7fffebf48640 (LWP 12561) exited]
[Thread 0x7fffec749640 (LWP 12560) exited]
[Thread 0x7fffecf4a640 (LWP 12559) exited]
[Thread 0x7fffed74b640 (LWP 12558) exited]
[Thread 0x7fffedf4c640 (LWP 12557) exited]
[Thread 0x7fffee74d640 (LWP 12556) exited]
[Thread 0x7fffeef4e640 (LWP 12555) exited]
[Thread 0x7fffef74f640 (LWP 12554) exited]
[Thread 0x7fffeff50640 (LWP 12553) exited]
[Thread 0x7ffff0751640 (LWP 12552) exited]
[Thread 0x7ffff0f52640 (LWP 12551) exited]
[Thread 0x7ffff1753640 (LWP 12550) exited]
[Thread 0x7ffff1f54640 (LWP 12549) exited]
[Thread 0x7ffff2755640 (LWP 12548) exited]
[Thread 0x7ffff2f56640 (LWP 12547) exited]
[Thread 0x7ffff3757640 (LWP 12546) exited]
[Thread 0x7ffff6847800 (LWP 12542) exited]
--Type <RET> for more, q to quit, c to continue without paging--ret

Program terminated with signal SIGKILL, Killed.

DanielGibson commented 3 years ago

Maybe you ran out of main memory, so the OOM killer kicked in? Run rbdoom3bfg in windowed mode and keep an eye on memory consumption while running the command; top or htop should help.

dmesg or the syslog will probably also have an entry if the OOM killer was active.
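
A rough sketch of how the kernel log could be searched for OOM-killer activity. The function names are made up for illustration, and the patterns are a heuristic: the exact wording of these messages differs between kernel versions, so a missing match does not prove the OOM killer was inactive.

```python
import re
import subprocess

# Phrases the Linux kernel commonly logs when the OOM killer acts.
# Assumption: wording varies by kernel version, so treat this as a heuristic.
OOM_PATTERN = re.compile(r"out of memory|oom-kill|invoked oom-killer", re.IGNORECASE)


def find_oom_lines(log_text):
    """Return the lines of log_text that look like OOM-killer messages."""
    return [line for line in log_text.splitlines() if OOM_PATTERN.search(line)]


def read_kernel_log():
    """Return the output of dmesg (may need elevated privileges on some systems)."""
    return subprocess.run(["dmesg"], capture_output=True, text=True).stdout
```

Calling `find_oom_lines(read_kernel_log())` right after the crash should list any matching entries, ideally one naming the killed process.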

RobertBeckebans commented 3 years ago

Actually there is no real fixed size limit like in Quake 3, even though there is MAX_AREA_LIGHTGRID_POINTS. It's only a suggestion or default limit, which can be overridden with bakeLightGrids limit1000 or bakeLightGrids limit20000. However, you probably ran out of memory by allocating 15895 cubemaps, each 256 x 256 x 6 x 3 x 2 bytes (RGB16F), which adds up to 35763.75 MB. I still need to break up the algorithm so that it only captures 2000 probes in a single step.
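
The memory figure can be checked with a few lines of arithmetic. This sketch assumes each probe is a 256 x 256 cubemap with 6 faces, 3 channels and 2 bytes per channel (RGB16F), which reproduces the 35763.75 MB quoted above. The function names are illustrative and not taken from the engine.

```python
import math


def cubemap_bytes(face_size=256, faces=6, channels=3, bytes_per_channel=2):
    """Size in bytes of one uncompressed cubemap probe (assumed RGB16F layout)."""
    return face_size * face_size * faces * channels * bytes_per_channel


def total_megabytes(num_probes):
    """Memory needed to hold num_probes cubemaps at once, in MB (1024 * 1024 bytes)."""
    return num_probes * cubemap_bytes() / (1024 * 1024)


def num_steps(num_probes, probes_per_step=2000):
    """How many capture steps are needed if each step handles probes_per_step probes."""
    return math.ceil(num_probes / probes_per_step)


print(cubemap_bytes())         # 2359296 bytes, i.e. 2.25 MB per probe
print(total_megabytes(15895))  # 35763.75
print(total_megabytes(2000))   # 4500.0
print(num_steps(15895))        # 8
```

Under these assumptions, capturing 2000 probes per step would hold about 4.5 GB of cubemaps at a time, and area 1 of mars_city would take 8 steps.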

BielBdeLuna commented 3 years ago

Yeah, my computer has 16 GB, and since the bake uses all the cores the computer locks up while it's working. By that time the RAM usage of RBDoom3BFG is 13 GB, so I bet it fails once it gets close to the maximum RAM.

Maybe when a light grid area is too large, it could be split into two parts: bake two separate atlases and merge them into a single larger atlas only once both are baked and the memory has been flushed. That way you could have n atlases per light grid area, and so support larger areas with lots of grid points.
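
The splitting idea can be sketched as follows. This is only an illustration: `plan_chunks`, `bake_in_chunks` and the `bake_chunk` callback are hypothetical names, and the real baker would produce atlas images rather than Python lists.

```python
def plan_chunks(num_points, max_points_per_chunk):
    """Split range(num_points) into contiguous (start, end) chunks of bounded size."""
    if max_points_per_chunk <= 0:
        raise ValueError("max_points_per_chunk must be positive")
    return [
        (start, min(start + max_points_per_chunk, num_points))
        for start in range(0, num_points, max_points_per_chunk)
    ]


def bake_in_chunks(num_points, max_points_per_chunk, bake_chunk):
    """Bake one chunk at a time and merge the small per-point results.

    bake_chunk(start, end) is a hypothetical callback that captures the probes
    for grid points [start, end) and returns one compact result per point.
    Its large temporary cubemaps can be released before the next chunk starts.
    """
    atlas = []
    for start, end in plan_chunks(num_points, max_points_per_chunk):
        atlas.extend(bake_chunk(start, end))
    return atlas


print(plan_chunks(15895, 6000))  # [(0, 6000), (6000, 12000), (12000, 15895)]
```

With a chunk size of 6000, the 15895-point area would be baked in three parts, so peak memory depends on the chunk size rather than on the size of the area.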

Edit: a limit of 6000 is my maximum for mars_city; with it RBDoom3BFG never goes over 12 GB of RAM. Since Hell1 seems to be the biggest light grid you distributed, I'm going to find the best limit value for Hell1 and 16 GB of RAM.

Edit 2: for Hell1 I have to go as low as a limit of 5000. But Hell1 having the largest file doesn't necessarily mean anything: it might add up to the biggest file because it has so many small rooms full of grid points, whereas the RAM problem is per room, so the bigger the room, the more grid points. Maybe Enpro has bigger rooms? Or maybe the Hell_hole cyberdemon arena is the biggest room? Or maybe the LE hell maps?

Edit 3: I think the biggest map in terms of having lots of big rooms is the le_hell map, and I could bake its light grid with a limit of 5000 using 16 GB of memory.

BielBdeLuna commented 3 years ago

The SIGKILL happens during the envprobe image capture. Why not save every row of the light grid (as far as I can see, the probes are captured in rows or stages rather than columns) as a separate EXR image and, once everything is captured, splice all the row images into a single EXR for the whole light grid, then do the convolution as normal? That way we should be able to capture denser light grids and be less prone to running out of memory, because the data is kept on disk rather than all in RAM at the same time.
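
A minimal sketch of the row-by-row spill idea. It uses raw binary files instead of EXR images, and `capture_probe` is a hypothetical callback standing in for the engine's probe capture, so this only illustrates the memory pattern and not the actual file format.

```python
import os
import shutil
import tempfile


def capture_rows_to_disk(num_rows, probes_per_row, capture_probe, out_path):
    """Capture a grid row by row, spilling each row to its own temporary file.

    capture_probe(row, col) is a hypothetical callback returning the raw bytes
    of one probe. Only one probe is held in memory at a time; the row files
    are spliced into out_path once every row has been captured.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        row_paths = []
        for row in range(num_rows):
            row_path = os.path.join(tmp_dir, f"row_{row:05d}.bin")
            with open(row_path, "wb") as row_file:
                for col in range(probes_per_row):
                    row_file.write(capture_probe(row, col))
            row_paths.append(row_path)

        # Splice the row files, in order, into one file for the whole grid.
        with open(out_path, "wb") as out_file:
            for row_path in row_paths:
                with open(row_path, "rb") as row_file:
                    shutil.copyfileobj(row_file, out_file)
```

The later convolution step could then read the spliced file back in pieces, so the capture stage no longer has to hold every probe in RAM at once.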

mrcmunir commented 2 years ago

I can confirm this. I think it's related to some of the new structure stuff. I can no longer run it on a board with 2 GB: Doom 3's memory use causes a SIGKILL on the loading screen, but the same board with 4 GB loads fine. DOOM1 and DOOM2 are not affected by this problem.