mit-han-lab / bevfusion

[ICRA'23] BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation
https://bevfusion.mit.edu
Apache License 2.0
2.29k stars 413 forks

Something wrong with MPI_Init #336

Closed: YaqinLong closed this issue 1 year ago

YaqinLong commented 1 year ago

Dear author: I want to deploy BEVFusion on a Jetson Xavier NX, but the project's dependencies cannot be satisfied there. For example, the Jetson Xavier NX has CUDA 11.4 and can only install PyTorch >= 1.11.0 (Python 3.8). After running the code, it reported: "It looks like opal_init failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during opal_init; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer)"

kentang-mit commented 1 year ago

If you want to deploy the model in an environment with only one GPU, I would highly recommend slightly refactoring the codebase to remove the dependency on torchpack. torchpack is used in this codebase mainly for multi-GPU training, not for single-GPU inference. Without the torchpack dependency, there should be no OpenMPI-related problems.
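As a rough illustration of that refactor, here is a minimal sketch of a single-process stand-in for the distributed helper. It assumes the scripts only ask the helper for the rank, local rank, and world size; the class name `SingleProcessDist` and its method set are hypothetical and are not taken from the repository, so check which calls your copy of the code actually makes before swapping anything in.

```python
class SingleProcessDist:
    """Hypothetical stand-in for a multi-GPU distributed helper.

    Every query answers as if exactly one process were running,
    so no MPI library is loaded or initialised.
    """

    def init(self) -> None:
        # Nothing to set up: no MPI, no process group.
        return None

    def size(self) -> int:
        # World size of a single-process run.
        return 1

    def rank(self) -> int:
        # Global rank of the only process.
        return 0

    def local_rank(self) -> int:
        # Index of the only GPU on this machine.
        return 0

    def is_master(self) -> bool:
        # The only process is always the master.
        return True


# Use the stand-in where the script would otherwise import a distributed module.
dist = SingleProcessDist()
dist.init()

# Device string that a single-GPU inference script could pass on to its framework.
device = f"cuda:{dist.local_rank()}"
```

With a stand-in like this, the inference script no longer imports anything that initialises Open MPI. Any config loading or launcher logic that also goes through torchpack would still need to be replaced separately.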

kentang-mit commented 1 year ago

Closed due to inactivity. Please feel free to reopen if you feel it necessary.