Open guragamb opened 3 years ago
Hi, thanks for the information. It seems that some dependencies now conflict. I have updated the requirements file and some of the code, and tested on a new machine. Everything seems to work now.
Meanwhile, this code needs a lot of GPU memory. I train V321 with a mini-batch of 2 per GPU (24 GB) and V351 with a mini-batch of 1 per GPU (24 GB). The networks are trained on 8 GPUs with syncBN. If you don't have much GPU memory, I highly recommend using a smaller width setting, e.g., 1024 or 512, and fine-tuning from my pretrained models instead of training from scratch. We found that this approach gets close to the results of training from scratch with width 2048.
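For reference, the per-GPU settings above translate into the following global batch sizes. This is only an arithmetic sketch: the helper name `global_batch_size` is made up for illustration and is not part of the repo. With synchronized batch norm, the normalization statistics are computed over the batch across all GPUs, so the global figure is the relevant one if you train on fewer GPUs.

```python
def global_batch_size(per_gpu_batch: int, num_gpus: int) -> int:
    """Total number of samples per optimizer step across all GPUs."""
    return per_gpu_batch * num_gpus


# Per-GPU mini-batch sizes quoted above (24 GB GPUs, 8 GPUs in total).
per_gpu_batches = {"V321": 2, "V351": 1}
NUM_GPUS = 8

for model_name, per_gpu in per_gpu_batches.items():
    print(model_name, global_batch_size(per_gpu, NUM_GPUS))
# V321 16
# V351 8
```

On a single GPU with the same per-GPU batch, the global batch drops to 2 or 1, which is one reason starting from the pretrained models may be the more practical route.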
Hi! I was trying to get the repo working, but it gets stuck right before training begins (all the files are in the correct directories, and the Python packages are exactly the ones listed in the requirements.txt file).
It gets stuck after the last line and doesn't do anything (nothing gets written to the logs either). I know you mentioned that you referenced the RangeNet++ project when developing SqueezeSegV3, and they have a similar issue where training gets stuck (I was able to reproduce it on their repo as well): https://github.com/PRBonn/lidar-bonnetal/issues/39. I would really appreciate any thoughts on why this might be happening!
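One generic way to see where a silently hung Python process is stuck is the standard-library `faulthandler` module, which can dump the stack of every thread after a timeout. The sketch below is not specific to this repo, and it does not assume any particular cause for the hang. The helper names `arm_hang_watchdog` and `disarm_hang_watchdog` and the 300-second default are illustrative choices.

```python
import faulthandler
import sys


def arm_hang_watchdog(timeout_s: float = 300.0, log_file=sys.stderr) -> None:
    """Dump the traceback of every thread each time timeout_s seconds elapse.

    log_file must stay open and have a real file descriptor, because
    faulthandler writes to it directly from a watchdog thread.
    """
    faulthandler.dump_traceback_later(
        timeout_s, repeat=True, file=log_file, exit=False
    )


def disarm_hang_watchdog() -> None:
    """Cancel the pending dump, e.g. once training is confirmed to be running."""
    faulthandler.cancel_dump_traceback_later()
```

Calling `arm_hang_watchdog()` at the top of the training script would print, every five minutes, the Python line each thread is currently executing, which should at least show which call the main process is blocked in.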