hovsg / HOV-SG

[RSS2024] Official implementation of "Hierarchical Open-Vocabulary 3D Scene Graphs for Language-Grounded Robot Navigation"
https://hovsg.github.io
MIT License

create_graph.py crashes after some time #8

Open ADetilie opened 1 month ago

ADetilie commented 1 month ago
@abwerby thank you for the quick fix! I no longer encounter those problems.

However, when I try to generate a scene graph, the script repeatedly crashes:

(hovsg) ➜  HOV-SG git:(main) ✗ python application/create_graph.py main.dataset=hm3dsem main.dataset_path=data/hm3dsem_walks/val/00824-Dd4bFSTQ8gi/ main.save_path=data/scene_graphs/00824-Dd4bFSTQ8gi
/home/ncdev/miniforge3/envs/hovsg/lib/python3.9/site-packages/scipy/__init__.py:155: UserWarning: A NumPy version >=1.18.5 and <1.26.0 is required for this version of SciPy (detected version 1.26.4
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
[2024-07-23 12:37:18,487][root][INFO] - Loaded ViT-H-14 model config.
[2024-07-23 12:37:26,890][root][INFO] - Loading pretrained ViT-H-14 weights (checkpoints/laion2b_s32b_b79k.bin).
Creating RGB-D point cloud: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 226/226 [00:26<00:00,  8.58it/s]
Extracting features: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 226/226 [20:37<00:00,  5.47s/it]
Merging 3d masks sequentially
  6%|██████████                                                                                                                                                        | 14/225 [25:23<19:20:49, 330.09s/it][1]    100314 killed     python application/create_graph.py main.dataset=hm3dsem  

Traces show that some processes were killed on my machine because of an out-of-memory condition, even though the machine has 62 GB of RAM.

What are the memory requirements for this script to run properly?

Originally posted by @ADetilie in https://github.com/hovsg/HOV-SG/issues/6#issuecomment-2245406682

ADetilie commented 1 month ago
[Image: Screenshot 2024-07-23 at 19 33 42]

There are RAM usage spikes while the script runs. It just consumed 120 GB of RAM and then failed with an out-of-memory error. Are there any configuration options in the project that reduce memory usage? Or could this be a memory leak?
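For anyone who wants to see where such spikes happen, here is a minimal sketch (not part of HOV-SG) that logs the peak resident memory of the current Python process. It uses only the standard library's `resource` module, which is Unix-only. The helper names `peak_rss_mib` and `log_peak` are made up for illustration, and where to call them inside the pipeline is left open.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def log_peak(step: str) -> float:
    """Print and return the peak RSS observed so far, tagged with a step name."""
    value = peak_rss_mib()
    print(f"[mem] {step}: peak RSS {value:.1f} MiB")
    return value


if __name__ == "__main__":
    log_peak("start")
    buffer = bytearray(50 * 1024 * 1024)  # allocate about 50 MiB
    log_peak("after allocation")
```

Because the value is a running maximum, calling `log_peak` after each pipeline stage shows which stage first pushed memory to a new high, but it does not show memory being released.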

abwerby commented 1 month ago

Check issue #7.

ADetilie commented 1 month ago

UPD 24.07.2024: I fixed the problem described below (the matplotlib exception) by reinstalling matplotlib:

pip uninstall matplotlib
conda install matplotlib==3.7.2
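After mixing pip and conda like this, it can help to check which matplotlib distributions the interpreter actually sees. The following is a minimal sketch using the standard library's `importlib.metadata`. The helper name `installed_versions` is hypothetical, and the idea that a duplicate or mismatched install caused the matplotlib exception is an assumption, not something confirmed in this thread.

```python
from importlib import metadata


def installed_versions(package: str) -> list:
    """Return the versions of every installed distribution named `package`."""
    wanted = package.lower().replace("_", "-")
    found = []
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name") or ""
            version = dist.version
        except Exception:
            continue  # skip distributions whose metadata cannot be read
        if version and name.lower().replace("_", "-") == wanted:
            found.append(version)
    return sorted(found)


if __name__ == "__main__":
    versions = installed_versions("matplotlib")
    if len(versions) > 1:
        print("Multiple matplotlib installs found:", versions)
    else:
        print("matplotlib versions found:", versions)
```

More than one entry in the result would suggest that two installs are shadowing each other in the same environment.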

Still looking for updates on the out-of-memory issue.

Thanks. I didn't hit the out-of-memory crash this time, but I ran into the following exception:

-- removing small and empty masks --
number of masked pcds:  1202
number of mask_feats:  1202
masked pcds saved to disk in data/scene_graphs_new/00824-Dd4bFSTQ8gi/hm3dsem/00824-Dd4bFSTQ8gi
full pcd saved to disk in data/scene_graphs_new/00824-Dd4bFSTQ8gi/hm3dsem/00824-Dd4bFSTQ8gi
full pcd feats saved to disk in data/scene_graphs_new/00824-Dd4bFSTQ8gi/hm3dsem/00824-Dd4bFSTQ8gi
hm3dsem
segmenting floors...
downpcd (294355, 3)
bins 281.20046911000327
distance 20.0
1047.017793594306
min_peak_height 1490.0
clustred_peaks [0.00603302 2.43776661]
floors [[0.006033019964371361, 2.4377666140473178]]
number of floors:  1
segmenting rooms...
grid_size:  0.05
occupancy_map shape:  (340, 340)
range of dist:  0.0 48.0
Error executing job with overrides: ['main.dataset=hm3dsem', 'main.dataset_path=data/hm3dsem_walks_new/val/00824-Dd4bFSTQ8gi/', 'main.save_path=data/scene_graphs_new/00824-Dd4bFSTQ8gi']
Traceback (most recent call last):
  File "/home/ncdev/repos/HOV-SG/application/create_graph.py", line 34, in main
    hovsg.build_graph(save_path=save_dir)
  File "/home/ncdev/repos/HOV-SG/hovsg/graph/graph.py", line 814, in build_graph
    self.segment_rooms(floor, save_path)
  File "/home/ncdev/repos/HOV-SG/hovsg/graph/graph.py", line 489, in segment_rooms
    room_vertices = distance_transform(full_map, resolution, tmp_floor_path)
  File "/home/ncdev/repos/HOV-SG/hovsg/utils/graph_utils.py", line 236, in distance_transform
    plt.figure()
  File "/home/ncdev/miniforge3/envs/hovsg/lib/python3.9/site-packages/matplotlib/pyplot.py", line 1027, in figure
    manager = new_figure_manager(
  File "/home/ncdev/miniforge3/envs/hovsg/lib/python3.9/site-packages/matplotlib/pyplot.py", line 550, in new_figure_manager
    return _get_backend_mod().new_figure_manager(*args, **kwargs)
  File "/home/ncdev/miniforge3/envs/hovsg/lib/python3.9/site-packages/matplotlib/backend_bases.py", line 3507, in new_figure_manager
    return cls.new_figure_manager_given_figure(num, fig)
  File "/home/ncdev/miniforge3/envs/hovsg/lib/python3.9/site-packages/matplotlib/backend_bases.py", line 3512, in new_figure_manager_given_figure
    return cls.FigureCanvas.new_manager(figure, num)
  File "/home/ncdev/miniforge3/envs/hovsg/lib/python3.9/site-packages/matplotlib/backend_bases.py", line 1797, in new_manager
    return cls.manager_class.create_with_canvas(cls, figure, num)
  File "/home/ncdev/miniforge3/envs/hovsg/lib/python3.9/site-packages/matplotlib/backends/_backend_tk.py", line 504, in create_with_canvas
    manager = cls(canvas, num, window)
  File "/home/ncdev/miniforge3/envs/hovsg/lib/python3.9/site-packages/matplotlib/backends/_backend_tk.py", line 457, in __init__
    super().__init__(canvas, num)
  File "/home/ncdev/miniforge3/envs/hovsg/lib/python3.9/site-packages/matplotlib/backend_bases.py", line 2655, in __init__
    self.toolbar = self._toolbar2_class(self.canvas)
  File "/home/ncdev/miniforge3/envs/hovsg/lib/python3.9/site-packages/matplotlib/backends/_backend_tk.py", line 649, in __init__
    NavigationToolbar2.__init__(self, canvas)
  File "/home/ncdev/miniforge3/envs/hovsg/lib/python3.9/site-packages/matplotlib/backend_bases.py", line 2850, in __init__
    self._nav_stack = cbook._Stack()
  File "/home/ncdev/miniforge3/envs/hovsg/lib/python3.9/site-packages/matplotlib/_api/__init__.py", line 217, in __getattr__
    raise AttributeError(
AttributeError: module 'matplotlib.cbook' has no attribute '_Stack'

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

Eku127 commented 1 month ago

Hi! I encountered the same issue with the process being killed during the merging of 3D masks. Will reinstalling Matplotlib solve this problem? Additionally, could you let me know which tool you used for tracing the executable? Was it perf?

Merging 3d masks sequentially
  3%|████▌                                                                                                                                                                               | 5/199 [02:44<2:51:52, 53.16s/it]
[1]    3181608 killed     python application/semantic_segmentation.py main.dataset=replica  

ADetilie commented 1 month ago

@Eku127, what you are seeing is most likely an out-of-memory problem. As a workaround, you can change "merge_type" to "hierarchical" in the config. See this comment from the contributor for more details: #8 (comment). This workaround worked for me, but note that I have 128 GB of RAM on my machine and turned off the oomd service.

Reinstalling matplotlib won't help with this particular problem. It fixes a different bug in the system. So, there's no need to reinstall matplotlib unless you encounter the "AttributeError: module 'matplotlib.cbook' has no attribute '_Stack'" exception.
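To illustrate the difference between a sequential and a hierarchical merge order, here is a minimal sketch. It is not HOV-SG's implementation: `merge_sequential`, `merge_hierarchical`, and the toy `merge` callables are hypothetical names, and the thread does not say how or why the "hierarchical" option changes memory use.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def merge_sequential(items: List[T], merge: Callable[[T, T], T]) -> T:
    """Fold items left to right into one growing accumulator."""
    if not items:
        raise ValueError("nothing to merge")
    acc = items[0]
    for item in items[1:]:
        acc = merge(acc, item)
    return acc


def merge_hierarchical(items: List[T], merge: Callable[[T, T], T]) -> T:
    """Merge neighbouring pairs level by level until one item remains."""
    if not items:
        raise ValueError("nothing to merge")
    level = list(items)
    while len(level) > 1:
        next_level = [
            merge(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)
        ]
        if len(level) % 2:  # an odd item is carried up unchanged
            next_level.append(level[-1])
        level = next_level
    return level[0]


if __name__ == "__main__":
    def show(a: str, b: str) -> str:
        return f"({a}+{b})"

    print(merge_sequential(["a", "b", "c", "d"], show))    # (((a+b)+c)+d)
    print(merge_hierarchical(["a", "b", "c", "d"], show))  # ((a+b)+(c+d))
```

Both orders produce the same result when the merge operation is associative, such as a set union. They differ only in which intermediate results exist at each step.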

Eku127 commented 1 month ago

Got it. All I need is more RAM :rofl:

However, I think there's still a lot of work needed to optimize RAM usage.