princeton-vl / DROID-SLAM

BSD 3-Clause "New" or "Revised" License

Unexpected output when running demo #76

Closed jugwangjin closed 1 year ago

jugwangjin commented 1 year ago

Hi, thank you for this great work.

I ran into a problem while running the demo code.

Running `python demo.py --imagedir=data/abandonedfactory/ --calib=calib/tartan.txt --stride=4` outputs:

```
################################
Traceback (most recent call last):
  File "/home/gwangjin/DROID-SLAM/demo.py", line 134, in <module>
    traj_est = droid.terminate(image_stream(args.imagedir, args.calib, args.stride))
  File "/home/gwangjin/DROID-SLAM/droid_slam/droid.py", line 81, in terminate
    self.backend(7)
  File "/opt/conda/envs/droidenv/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/gwangjin/DROID-SLAM/droid_slam/droid_backend.py", line 34, in __call__
    graph.add_proximity_factors(rad=self.backend_radius,
  File "/home/gwangjin/DROID-SLAM/droid_slam/factor_graph.py", line 374, in add_proximity_factors
    ii, jj = torch.as_tensor(es, device=self.device).unbind(dim=-1)
ValueError: not enough values to unpack (expected 2, got 0)
```
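
For context, the exception itself is just what happens when the proximity edge list comes back empty; a minimal sketch that reproduces it (with a hypothetical empty `es`, mirroring the failing line in factor_graph.py):

```python
import torch

# when no proximity edges are found, `es` is empty, unbind() returns an
# empty tuple, and unpacking it into (ii, jj) raises the ValueError above
es = []
ii, jj = torch.as_tensor(es).unbind(dim=-1)
# ValueError: not enough values to unpack (expected 2, got 0)
```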

I figured out why this happens.

The feature encoder output `gmap` from `gmap = self.__feature_encoder(inputs)` in motion_filter.py is constant.

After a lot of debugging, I found that `layer3` of `net.fnet` (the `layer3` in `BasicEncoder`) outputs a zero tensor.
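
A quick way to confirm a symptom like this is to scan the encoder for all-zero parameters right after loading the weights; a minimal sketch (the `net.fnet` name is taken from this thread, the helper itself is hypothetical):

```python
import torch.nn as nn

def find_zero_params(module: nn.Module) -> None:
    """Print any parameter whose elements are all exactly zero."""
    for name, p in module.named_parameters():
        if p.detach().abs().sum().item() == 0:
            print(f"all-zero parameter: {name} {tuple(p.shape)}")

# e.g. find_zero_params(net.fnet)
```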

Currently, I'm working in a Docker environment based on `pytorch/pytorch:1.9.1-cuda11.1-cudnn8-devel`, which is an Ubuntu 18.04 LTS based image, and I built the project using environment.yaml.

Could this be a version-related problem? If so, could you specify the versions of the libraries in the environment, other than torch?

+++

After more debugging, I found that the weights of some layers in `net.fnet` are all zeros. I definitely downloaded droid.pth, and `load_state_dict` did not raise any errors.
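
A hedged sketch of how one might verify the load, assuming the checkpoint is droid.pth and that its keys may carry a `module.` prefix from DataParallel training (which the loader strips):

```python
import torch

def check_checkpoint_loaded(model, ckpt_path="droid.pth"):
    """Compare checkpoint tensors against the live parameters after loading."""
    state = torch.load(ckpt_path, map_location="cpu")
    params = dict(model.named_parameters())
    for key, saved in state.items():
        live = params.get(key.replace("module.", ""))
        if live is not None and not torch.equal(saved, live.detach().cpu()):
            print("mismatch after load_state_dict:", key)
```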

jugwangjin commented 1 year ago

In the `MotionFilter` class, `__init__` is called normally, but when the `track` function is called, all the internal variables (or maybe the tensors on CUDA?) go to zero. Even `self.MEAN` and `self.STDV` are all zeros when `track` is called.
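
A minimal probe for this, assuming it is placed at the top of `MotionFilter.track` (the attribute names are the ones from this thread):

```python
# hypothetical debug probe inside MotionFilter.track; a non-zero sum
# means the normalization buffers survived into this process
print("MEAN sum:", self.MEAN.abs().sum().item(),
      "STDV sum:", self.STDV.abs().sum().item())
```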

jugwangjin commented 1 year ago

I found out that when the code below is executed,

```python
# visualizer
if not self.disable_vis:
    from visualization import droid_visualization
    self.visualizer = Process(target=droid_visualization, args=(self.video,))
    self.visualizer.start()
```

the variables in `self.filterx` become zero.

Is `torch.multiprocessing.Process` related to a memory access problem?
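
If fork-unsafety of CUDA state is indeed the issue, one possible workaround (an assumption on my part, not a confirmed fix, since the pipeline may rely on sharing CUDA tensors between processes) is to force the spawn start method before any `Process` is created:

```python
import torch.multiprocessing as mp

# CUDA contexts are not fork-safe; with "spawn" the child process starts
# with a fresh interpreter instead of inheriting the parent's CUDA state
if __name__ == "__main__":
    mp.set_start_method("spawn", force=True)
```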

songlin commented 1 year ago

@jugwangjin Hi, can you share how you solved this problem? Thank you!

jugwangjin commented 1 year ago

> @jugwangjin Hi, can you share how you solved this problem? Thank you!

My only solution was disabling the visualization. I could not find the reason for this; I think it is something related to my system, not the code in this repository.
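
Concretely, assuming the `--disable_vis` flag that demo.py exposes, the run that works for me is: `python demo.py --imagedir=data/abandonedfactory/ --calib=calib/tartan.txt --stride=4 --disable_vis`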

adizhol-str commented 4 months ago

> I found out that when the code below is executed,
>
> ```python
> # visualizer
> if not self.disable_vis:
>     from visualization import droid_visualization
>     self.visualizer = Process(target=droid_visualization, args=(self.video,))
>     self.visualizer.start()
> ```
>
> the variables in `self.filterx` become zero.
>
> Is `torch.multiprocessing.Process` related to a memory access problem?

+1