Closed TomTomTommi closed 1 month ago
coarse is a CPU-only job which generates layouts, and only this job seems to be running. The system seems to be crashing on fineterrain and hence never gets to any of the jobs that actually use the GPU.
Can you provide the contents of crash_summaries.txt and the .out and .err files for any crashed jobs? You can always check these files to see what is actually going on during the jobs. Currently it isn't making any progress.
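As a rough sketch of how to gather those logs in one go, the helper below walks an output folder and returns the last few lines of every .out and .err file. The function name tail_logs and the choice of 20 lines are illustrative assumptions, not part of infinigen, and the folder you pass in depends on your own output path.

```python
from pathlib import Path


def tail_logs(folder, suffixes=(".out", ".err"), n_lines=20):
    """Return {file path: last n_lines lines} for every log file under folder."""
    tails = {}
    for path in sorted(Path(folder).rglob("*")):
        # Only regular files with a log-like suffix are collected.
        if path.is_file() and path.suffix in suffixes:
            lines = path.read_text(errors="replace").splitlines()
            tails[str(path)] = lines[-n_lines:]
    return tails


# Example use (the folder name is an assumption; use your own output folder):
# for name, lines in tail_logs("outputs/my_videos").items():
#     print("====", name)
#     print("\n".join(lines))
```

The end of a .err or .out file is usually where the final exception appears, which is why only the tail is kept.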
My guess would be that something went wrong when installing cuda terrain.
The error is 'GLIBCXX_3.4.29 not found'. The crash_summaries.txt is attached below.
crash_summaries.txt
The content of one .out file is:
/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
setattr(self, word, getattr(machar, word).flat[0])
/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
return self._float_to_str(self.smallest_subnormal)
/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
setattr(self, word, getattr(machar, word).flat[0])
/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
return self._float_to_str(self.smallest_subnormal)
[10:18:20.075] [root] [WARNING] | SMB_AUTH envvar is not set, smb_client upload will not work. Ignore this message if not using upload
[10:18:20.083] [infinigen.core.init] [INFO] | Converted seed='4d9f3558' to scene_seed=1302279512, parsed as hexadecimal
[10:18:20.098] [infinigen.core.execute_tasks] [INFO] | infinigen version 1.2.5
[10:18:20.098] [infinigen.core.execute_tasks] [INFO] | CUDA_VISIBLE_DEVICES=
[10:18:20.098] [infinigen.times] [INFO] | [MAIN TOTAL]
[10:18:20.098] [infinigen.core.execute_tasks] [INFO] | Processing frames 1 through 192 inclusive
[10:18:20.104] [infinigen.times] [INFO] | [terrain]
[10:18:20.104] [infinigen.times] [INFO] | [Create terrain]
[10:18:20.104] [infinigen.terrain.core] [INFO] | Terrain using only on the fly on_the_fly_asset_folder=PosixPath('/home/jj323/PycharmProjects/infinigen/outputs/my_videos/4d9f3558/coarse/assets')
[10:26:33.654] [infinigen.times] [INFO] | [Create terrain] failed with <class 'OSError'>
[10:26:33.654] [infinigen.times] [INFO] | [terrain] failed with <class 'OSError'>
[10:26:33.654] [infinigen.times] [INFO] | [MAIN TOTAL] failed with <class 'OSError'>
Traceback (most recent call last):
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/jj323/PycharmProjects/infinigen/infinigen_examples/generate_nature.py", line 438, in <module>
main(args)
File "/home/jj323/PycharmProjects/infinigen/infinigen_examples/generate_nature.py", line 409, in main
execute_tasks.main(
File "/home/jj323/PycharmProjects/infinigen/infinigen/core/execute_tasks.py", line 418, in main
execute_tasks(
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/config.py", line 1605, in gin_wrapper
utils.augment_exception_message_and_reraise(e, err_str)
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/utils.py", line 41, in augment_exception_message_and_reraise
raise proxy.with_traceback(exception.__traceback__) from None
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/config.py", line 1582, in gin_wrapper
return fn(*new_args, **new_kwargs)
File "/home/jj323/PycharmProjects/infinigen/infinigen/core/execute_tasks.py", line 328, in execute_tasks
compose_scene_func(output_folder, scene_seed)
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/config.py", line 1605, in gin_wrapper
utils.augment_exception_message_and_reraise(e, err_str)
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/utils.py", line 41, in augment_exception_message_and_reraise
raise proxy.with_traceback(exception.__traceback__) from None
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/config.py", line 1582, in gin_wrapper
return fn(*new_args, **new_kwargs)
File "/home/jj323/PycharmProjects/infinigen/infinigen_examples/generate_nature.py", line 79, in compose_scene
terrain, terrain_mesh = p.run_stage('terrain', add_coarse_terrain, use_chance=False, default=(None, None))
File "/home/jj323/PycharmProjects/infinigen/infinigen/core/util/pipeline.py", line 76, in run_stage
ret = fn(*args, **kwargs)
File "/home/jj323/PycharmProjects/infinigen/infinigen_examples/generate_nature.py", line 75, in add_coarse_terrain
terrain = Terrain(scene_seed, surface.registry, task='coarse', on_the_fly_asset_folder=output_folder/"assets")
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/config.py", line 1605, in gin_wrapper
utils.augment_exception_message_and_reraise(e, err_str)
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/utils.py", line 41, in augment_exception_message_and_reraise
raise proxy.with_traceback(exception.__traceback__) from None
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/config.py", line 1582, in gin_wrapper
return fn(*new_args, **new_kwargs)
File "/home/jj323/PycharmProjects/infinigen/infinigen/terrain/core.py", line 126, in __init__
self.elements, scene_infos = scene(seed, Path(on_the_fly_asset_folder), asset_path, device)
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/config.py", line 1605, in gin_wrapper
utils.augment_exception_message_and_reraise(e, err_str)
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/utils.py", line 41, in augment_exception_message_and_reraise
raise proxy.with_traceback(exception.__traceback__) from None
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/config.py", line 1582, in gin_wrapper
return fn(*new_args, **new_kwargs)
File "/home/jj323/PycharmProjects/infinigen/infinigen/terrain/scene.py", line 56, in scene
elements[ElementNames.LandTiles] = LandTiles(device, caves, on_the_fly_asset_folder, reused_asset_folder)
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/config.py", line 1605, in gin_wrapper
utils.augment_exception_message_and_reraise(e, err_str)
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/utils.py", line 41, in augment_exception_message_and_reraise
raise proxy.with_traceback(exception.__traceback__) from None
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/config.py", line 1582, in gin_wrapper
return fn(*new_args, **new_kwargs)
File "/home/jj323/PycharmProjects/infinigen/infinigen/terrain/elements/landtiles.py", line 97, in __init__
n_instances, tile_size, N, float_data = self.load_assets()
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/config.py", line 1605, in gin_wrapper
utils.augment_exception_message_and_reraise(e, err_str)
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/utils.py", line 41, in augment_exception_message_and_reraise
raise proxy.with_traceback(exception.__traceback__) from None
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/config.py", line 1582, in gin_wrapper
return fn(*new_args, **new_kwargs)
File "/home/jj323/PycharmProjects/infinigen/infinigen/terrain/elements/landtiles.py", line 130, in load_assets
landtile_asset(self.on_the_fly_asset_folder / tile / f"{i}", tile, device=self.device)
File "/home/jj323/PycharmProjects/infinigen/infinigen/terrain/assets/landtiles/core.py", line 138, in landtile_asset
multi_mountains_asset(folder, tile_size, resolution, device)
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/config.py", line 1605, in gin_wrapper
utils.augment_exception_message_and_reraise(e, err_str)
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/utils.py", line 41, in augment_exception_message_and_reraise
raise proxy.with_traceback(exception.__traceback__) from None
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/config.py", line 1582, in gin_wrapper
return fn(*new_args, **new_kwargs)
File "/home/jj323/PycharmProjects/infinigen/infinigen/terrain/assets/landtiles/custom.py", line 153, in multi_mountains_asset
if erosion: run_erosion(folder)
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/config.py", line 1605, in gin_wrapper
utils.augment_exception_message_and_reraise(e, err_str)
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/utils.py", line 41, in augment_exception_message_and_reraise
raise proxy.with_traceback(exception.__traceback__) from None
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/site-packages/gin/config.py", line 1582, in gin_wrapper
return fn(*new_args, **new_kwargs)
File "/home/jj323/PycharmProjects/infinigen/infinigen/terrain/land_process/erosion.py", line 31, in run_erosion
dll = load_cdll("terrain/lib/cpu/soil_machine/SoilMachine.so")
File "/home/jj323/PycharmProjects/infinigen/infinigen/terrain/utils/ctype_util.py", line 29, in load_cdll
return CDLL(root/path, mode=RTLD_LOCAL)
File "/home/jj323/anaconda3/envs/infinigen/lib/python3.10/ctypes/__init__.py", line 374, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /home/jj323/PycharmProjects/infinigen/infinigen/terrain/lib/cpu/soil_machine/SoilMachine.so)
In call to configurable 'run_erosion' (<function run_erosion at 0x7f3362bc48b0>)
In call to configurable 'multi_mountains_asset' (<function multi_mountains_asset at 0x7f335f96acb0>)
In call to configurable 'load_assets' (<function LandTiles.load_assets at 0x7f335f952ef0>)
In call to configurable 'LandTiles' (<class 'infinigen.terrain.elements.landtiles.LandTiles'>)
In call to configurable 'scene' (<function scene at 0x7f336a7c8820>)
In call to configurable 'Terrain' (<class 'infinigen.terrain.core.Terrain'>)
In call to configurable 'compose_scene' (<function compose_scene at 0x7f343110ab00>)
In call to configurable 'execute_tasks' (<function execute_tasks at 0x7f335f499bd0>)
The output of strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep GLIBCXX is:
GLIBCXX_3.4
GLIBCXX_3.4.1
GLIBCXX_3.4.2
GLIBCXX_3.4.3
GLIBCXX_3.4.4
GLIBCXX_3.4.5
GLIBCXX_3.4.6
GLIBCXX_3.4.7
GLIBCXX_3.4.8
GLIBCXX_3.4.9
GLIBCXX_3.4.10
GLIBCXX_3.4.11
GLIBCXX_3.4.12
GLIBCXX_3.4.13
GLIBCXX_3.4.14
GLIBCXX_3.4.15
GLIBCXX_3.4.16
GLIBCXX_3.4.17
GLIBCXX_3.4.18
GLIBCXX_3.4.19
GLIBCXX_3.4.20
GLIBCXX_3.4.21
GLIBCXX_3.4.22
GLIBCXX_3.4.23
GLIBCXX_3.4.24
GLIBCXX_3.4.25
GLIBCXX_3.4.26
GLIBCXX_3.4.27
GLIBCXX_3.4.28
GLIBCXX_DEBUG_MESSAGE_LENGTH
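The listing stops at GLIBCXX_3.4.28, while SoilMachine.so asks for GLIBCXX_3.4.29, so the system libstdc++ appears to be older than what the library was built against. A minimal sketch of that comparison is below: it parses the text produced by the strings command and checks whether a required version tag is present. The function names glibcxx_versions and provides are made up for illustration.

```python
import re


def glibcxx_versions(strings_output):
    """Extract GLIBCXX_x.y.z tags as sorted tuples of ints, e.g. (3, 4, 28)."""
    # Tags such as GLIBCXX_DEBUG_MESSAGE_LENGTH have no digits and are skipped.
    found = re.findall(r"GLIBCXX_(\d+(?:\.\d+)*)", strings_output)
    return sorted({tuple(int(part) for part in v.split(".")) for v in found})


def provides(strings_output, required):
    """True if the listing contains the required GLIBCXX version tag."""
    needed = tuple(int(part) for part in required.split("."))
    return needed in glibcxx_versions(strings_output)


# Abbreviated sample of the listing above.
sample = "GLIBCXX_3.4\nGLIBCXX_3.4.27\nGLIBCXX_3.4.28\nGLIBCXX_DEBUG_MESSAGE_LENGTH"
print(glibcxx_versions(sample)[-1])  # newest tag in the sample
print(provides(sample, "3.4.29"))    # whether the required tag is listed
```

Running the same check against the libstdc++.so.6 inside the conda environment, if one exists there, would show whether a newer copy is available than the system one.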
Following the command
currently the output is as follows:
It is still running. I wonder whether this is correct, since it has been running for 15 hours. Why is the number of crashes so high? Also, according to other issues, the GPU consumption should be about 20GB, but my GPU usage is quite low, while the CPU usage is at 100%.
I am sure I included cuda_terrain in the command. What is the problem?
Platform