valeoai / WaffleIron

ZeroDivisionError: division by zero. #2

Closed Chenfreeyang closed 9 months ago

Chenfreeyang commented 1 year ago

The problem is in waffleiron/utils/scheduler.py, at this line: iter = (iter / max_iter) * np.pi

Hello, I got this error and think there is something wrong with my dataset folder tree. The same folder structure works for training other networks such as SalsaNext. My environment is CUDA 11.3 + cuDNN 8.2.1 + PyTorch 1.11.0.

gpuy commented 1 year ago

Dear @Chenfreeyang,

On which dataset do you have this issue?

For example, for SemanticKITTI, I set export PATH_KITTI="/all_datasets/semantic_kitti/" and (on my side) the folder is structured as follows:

/all_datasets/
 | ...
 |- semantic_kitti/
    | ...
    | - dataset/
        | - sequences/
            | - 00/
                | - velodyne/
                    | - 000000.bin
                    | ...
            | - 01/
                | - velodyne/
                    | - 000000.bin
                    | ...
Chenfreeyang commented 1 year ago

Thanks for your reply, @gpuy. I use SemanticKITTI and the structure is the same as yours.

I set my dataset path as follows:

export PATH_KITTI="/media/chen/chen/dataset/KITTI-Semantic/data_odometry_velodyne/"
/media/chen/chen/dataset/KITTI-Semantic/data_odometry_velodyne
└── dataset
    └── sequences
        ├── 00
        │   ├── labels
        │   └── velodyne
        ├── 01
        │   ├── labels
        │   └── velodyne
        ├── 02
        │   ├── labels
        │   └── velodyne

But I still get this error:

  File "/home/chen/segmentation/network/waffleiron/utils/scheduler.py", line 34, in __call__
    iter = (iter / max_iter) * np.pi
ZeroDivisionError: division by zero

So I tried setting max_iter = 1, like this:

    def __call__(self, iter):
        if iter < self.warmup_end:
            factor = iter / self.warmup_end
        else:
            iter = iter - self.warmup_end
            max_iter = self.max_iter - self.warmup_end
            # ------------------test--------------------
            # if max_iter==0:
            #     max_iter=1
            iter = (iter / max_iter) * np.pi
            factor = self.factor_min + 0.5 * (1 - self.factor_min) * (np.cos(iter) + 1)

But with this change I still can't evaluate or train:

Trainer on gpu: None. World size:1.
Checkpoint loaded on cuda:0 (cuda:0): ./pretrained_models/WaffleIron-48-256__40cm-BEV-cutmix-kitti/

Validation: 45/50 epochs
       0%|                                                  | 0/0 [00:00<?, ?it/
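
The 0/0 progress bar suggests the loader has no batches, in which case patching max_iter only hides the symptom. A minimal sketch of a warmup-cosine schedule that fails early with an explicit message instead of a ZeroDivisionError; the class name `GuardedWarmupCosine` is hypothetical and only mirrors the attributes visible in the snippet above, it is not the repository's scheduler:

```python
import math


class GuardedWarmupCosine:
    """Hypothetical warmup + cosine schedule with an explicit sanity check."""

    def __init__(self, warmup_end, max_iter, factor_min):
        # An empty dataloader would typically give max_iter == warmup_end == 0
        # (assumption: both are derived from the number of batches).
        if max_iter <= warmup_end:
            raise ValueError(
                f"max_iter ({max_iter}) must be larger than warmup_end "
                f"({warmup_end}); is the file list empty?"
            )
        self.warmup_end = warmup_end
        self.max_iter = max_iter
        self.factor_min = factor_min

    def __call__(self, it):
        if it < self.warmup_end:
            # Linear warmup from 0 to 1 (warmup_end > 0 on this branch).
            return it / self.warmup_end
        # Cosine decay from 1 down to factor_min.
        progress = (it - self.warmup_end) / (self.max_iter - self.warmup_end)
        return self.factor_min + 0.5 * (1 - self.factor_min) * (math.cos(progress * math.pi) + 1)
```

With such a check, an empty dataset would surface when the scheduler is constructed rather than at the first call.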

By the way, the paper says that in this work the points are projected on one of the planes (x,y), (x,z) and (y,z). The (x,y) plane is the BEV, but what advantages do the other projections bring? I also can't find any projection operations for the other planes in the code.
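
To make the question concrete: projecting on one of the three planes amounts to dropping one coordinate and binning the other two into a 2D grid. A generic sketch of that idea, not WaffleIron's implementation; the function name `project_to_grid`, the cell size and the lower bound are assumptions for illustration:

```python
import math

# Indices of the two coordinates kept for each plane.
PLANES = {"xy": (0, 1), "xz": (0, 2), "yz": (1, 2)}


def project_to_grid(points, plane, cell_size, min_bound):
    """Map (x, y, z) points to integer 2D cell indices on the chosen plane."""
    a, b = PLANES[plane]
    return [
        (
            math.floor((p[a] - min_bound[a]) / cell_size),
            math.floor((p[b] - min_bound[b]) / cell_size),
        )
        for p in points
    ]


points = [(0.5, 1.5, 2.5)]
bev_cells = project_to_grid(points, "xy", 1.0, (0.0, 0.0, 0.0))   # drops z
side_cells = project_to_grid(points, "yz", 1.0, (0.0, 0.0, 0.0))  # drops x
```

In this sketch, the "xy" case corresponds to the bird's-eye view, and the other two planes differ only in which coordinate is dropped.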

gpuy commented 1 year ago

So the folder structure seems fine. It is strange but the most likely reason is still that the list of files is empty in the dataloader.

In order to understand what is happening, can you add a breakpoint on that line or directly print:

In particular:

Did you make any other changes in the code or config file?

The code has been tested successfully outside our lab, so we should be able to make it work for you as well.
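
One way to check whether the file list is empty is to count the scans under the dataset root directly. A minimal sketch, assuming the layout shown earlier in the thread (dataset/sequences/<seq>/velodyne/*.bin under PATH_KITTI); the helper `count_scans` is hypothetical and not part of WaffleIron, whose dataloader may build its list differently:

```python
import os
from glob import glob


def count_scans(root):
    """Return {sequence: number of .bin scans} found under root (hypothetical helper)."""
    pattern = os.path.join(root, "dataset", "sequences", "*", "velodyne", "*.bin")
    counts = {}
    for path in glob(pattern):
        # .../sequences/<seq>/velodyne/<scan>.bin -> <seq>
        seq = os.path.basename(os.path.dirname(os.path.dirname(path)))
        counts[seq] = counts.get(seq, 0) + 1
    return counts


if __name__ == "__main__":
    root = os.environ.get("PATH_KITTI", "")
    counts = count_scans(root)
    print(f"PATH_KITTI={root!r}: {sum(counts.values())} scans in {len(counts)} sequences")
    for seq in sorted(counts):
        print(f"  {seq}: {counts[seq]}")
```

If this reports zero scans, the path or folder layout is the more likely culprit than the scheduler itself.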