MCG-NJU / LinK

[CVPR 2023] LinK: Linear Kernel for LiDAR-based 3D Perception
MIT License

Problems in the Evaluation and Submission of segmentation #5

Open Shixiaomeng7 opened 8 months ago

Shixiaomeng7 commented 8 months ago

Thanks for this excellent piece of work! When I run the evaluation I get the following error, even though I have already run the chmod +x evaluate.sh command: "Warning: could not find environment variable "-x", mpirun was unable to find the specified executable file, and therefore did not launch the job. Warning: could not find environment variable "-x", mpirun was unable to find the specified executable file, and therefore did not launch the job. This error was first reported for process rank 0; it may have occurred for other processes as well". I would also like to know how to create a symbolic link to the SemanticKITTI dataset if it is stored in a different location. Is the link necessary?

inspirelt commented 8 months ago

Hi, thanks for your interest. Please check that mpirun is installed correctly according to the instructions. There are different implementations of mpirun (Open MPI, Intel MPI, MPICH, etc.), and they work in slightly different ways. The symbolic link can be created with ln -s stored/path/of/semantickitti data/semantickitti. Feel free to contact me if you have more questions.
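The two checks in this reply can be sketched in Python. This is a minimal illustration and not part of the LinK repository: the helper names describe_mpirun and link_dataset are hypothetical, the sketch assumes the installed launcher accepts --version (Open MPI and MPICH do), and it resolves the source to an absolute path before linking, which differs slightly from a plain ln -s with a relative source.

```python
import os
import shutil
import subprocess
from typing import Optional


def describe_mpirun() -> Optional[str]:
    """Return the first line of `mpirun --version`, or None if mpirun is not on PATH."""
    exe = shutil.which("mpirun")
    if exe is None:
        return None
    try:
        # Assumption: the launcher understands --version.
        result = subprocess.run(
            [exe, "--version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return exe  # fall back to reporting only the executable path
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else exe


def link_dataset(source: str, link_path: str) -> str:
    """Create link_path as a symbolic link pointing at the dataset directory source."""
    source = os.path.abspath(source)
    if not os.path.isdir(source):
        raise FileNotFoundError(f"dataset directory not found: {source}")
    parent = os.path.dirname(os.path.abspath(link_path))
    os.makedirs(parent, exist_ok=True)
    if os.path.islink(link_path):
        os.remove(link_path)  # replace a stale link from an earlier attempt
    os.symlink(source, link_path, target_is_directory=True)
    return os.path.realpath(link_path)


if __name__ == "__main__":
    print("mpirun:", describe_mpirun())
    # Placeholder paths copied from the reply above; substitute your own.
    # link_dataset("stored/path/of/semantickitti", "data/semantickitti")
```

The first line printed by describe_mpirun normally names the MPI implementation, which helps when comparing the launcher's flags against the ones used in evaluate.sh.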

Shixiaomeng7 commented 8 months ago

Thank you for such a quick reply. I can now run ./evaluate.sh, but it hangs at the point shown below: it neither reports an error nor exits, it just stops. Have you ever encountered this?

/root/miniconda3/envs/LinK_seg/bin/python evaluate.py --load_path ../checkpoints/max-iou-val.pt
[2024-01-21 08:44:43.634] Experiment started: "runs/run-db770b11".
workers_per_gpu: 2
distributed: True
amp_enabled: False
data:
  num_classes: 20
  ignore_label: 0
  training_size: 19132
train:
  seed: 1588147245
  deterministic: False
dataset:
  name: semantic_kitti
  root: ./data/SemanticKITTI/dataset/sequences
  num_points: 80000
  voxel_size: 0.05
num_epochs: 25
batch_size: 2
model:
  cr: 1.0
  name: linkunet
  base_op: cos_x
  r: 2
  s: 3
  groups: 1
criterion:
  name: lovasz_softmax
  ignore_index: 0
optimizer:
  name: sgd
  lr: 0.24
  weight_decay: 0.0001
  momentum: 0.9
  nesterov: True
scheduler:
  name: cosine_warmup

Shixiaomeng7 commented 8 months ago

I have also already completed training once, and it never hung during training.

inspirelt commented 8 months ago

Well, I've run into this before and found several possible causes. You can check: 1. whether the process gets stuck loading data (due to a wrong data path); 2. whether CUDA_VISIBLE_DEVICES is set correctly.
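Both checks can be run before launching the evaluation. The sketch below is an illustration rather than code from the repository: list_sequences and visible_gpu_ids are hypothetical helper names, and the root path is the value printed in the log earlier in this thread. It only inspects the filesystem and the environment, so it does not confirm that the GPUs themselves are usable.

```python
import os
from typing import List, Optional


def list_sequences(root: str) -> List[str]:
    """Return the sorted sub-directory names under root, raising if root is missing."""
    if not os.path.isdir(root):
        raise FileNotFoundError(
            f"dataset root does not exist: {os.path.abspath(root)}"
        )
    return sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )


def visible_gpu_ids(env_value: Optional[str]) -> Optional[List[str]]:
    """Parse a CUDA_VISIBLE_DEVICES value.

    None means the variable is unset (all GPUs visible);
    an empty list means it is set but hides every GPU.
    """
    if env_value is None:
        return None
    return [tok.strip() for tok in env_value.split(",") if tok.strip()]


if __name__ == "__main__":
    root = "./data/SemanticKITTI/dataset/sequences"  # value from the log above
    try:
        print("sequences found:", list_sequences(root))
    except FileNotFoundError as err:
        print("data path problem:", err)
    print(
        "CUDA_VISIBLE_DEVICES:",
        visible_gpu_ids(os.environ.get("CUDA_VISIBLE_DEVICES")),
    )
```

An empty sequence list or a missing root points to the data path, while an empty or unexpected device list points to the CUDA_VISIBLE_DEVICES setting.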