When I run torchpack dist-run -np 1 python evaluate.py configs/semantic_kitti/default.yaml --name SemanticKITTI_val_SPVNAS@65GMACs
I get the following error messages:
[2021-08-30 17:23:50.180] /usr/local/anaconda3/envs/torch/bin/python evaluate.py configs/semantic_kitti/default.yaml --name SemanticKITTI_val_SPVNAS@65GMACs
[2021-08-30 17:23:50.181] Experiment started: "runs/run-98ebafa2-a0dc3bdc".
workers_per_gpu: 8
data:
  num_classes: 19
  ignore_label: 255
  training_size: 19132
train:
  seed: 1588147245
  deterministic: False
dataset:
  name: semantic_kitti
  root: /dataset/semantic-kitti
  num_points: 80000
  voxel_size: 0.05
num_epochs: 15
batch_size: 2
criterion:
  name: cross_entropy
  ignore_index: 255
optimizer:
  name: sgd
  lr: 0.24
  weight_decay: 0.0001
  momentum: 0.9
  nesterov: True
scheduler:
  name: cosine_warmup
Traceback (most recent call last):
  File "evaluate.py", line 130, in <module>
  File "evaluate.py", line 62, in main
    model = spvnas_specialized(args.name)
  File "/home/pjy/spvnas/model_zoo.py", line 51, in spvnas_specialized
    if torch.cuda.is_available() else 'cpu')['model']
  File "/usr/local/anaconda3/envs/torch/lib/python3.6/site-packages/torch/serialization.py", line 587, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "/usr/local/anaconda3/envs/torch/lib/python3.6/site-packages/torch/serialization.py", line 242, in __init__
    super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: [enforce fail at inline_container.cc:145] . PytorchStreamReader failed reading zip archive: failed finding central directory
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[59182,1],0]
Exit code: 1
However, torchpack dist-run -np [num_of_gpus] python train.py configs/semantic_kitti/spvcnn/cr0p5.yaml runs successfully, and I got a best mIoU of 59.466 on one GTX 1080Ti GPU.
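
Since the failure happens inside torch.load while opening the pretrained checkpoint as a zip archive, one thing that may be worth ruling out is an incomplete or corrupted download of that file. This is only a guess; the log above does not confirm it. A minimal sketch of such a check, using only the Python standard library, is below. The function name diagnose_checkpoint is hypothetical, and the path to pass in is whatever file model_zoo.py hands to torch.load on your machine.

```python
import os
import zipfile

# Local-file-header signature found at the start of a zip archive.
ZIP_MAGIC = b"PK\x03\x04"


def diagnose_checkpoint(path):
    """Classify a checkpoint file (hypothetical helper, not part of spvnas).

    Returns one of: "missing", "complete zip archive",
    "truncated zip archive", "not a zip archive".
    """
    if not os.path.isfile(path):
        return "missing"
    with open(path, "rb") as f:
        head = f.read(len(ZIP_MAGIC))
    # is_zipfile looks for the end-of-central-directory record,
    # which sits at the end of the file and is lost if the file is cut short.
    if zipfile.is_zipfile(path):
        return "complete zip archive"
    if head == ZIP_MAGIC:
        return "truncated zip archive"
    return "not a zip archive"
```

If this reports "truncated zip archive", deleting the cached file and downloading it again would be the obvious next step. A result of "not a zip archive" is not necessarily a problem on its own, because checkpoints saved in the older non-zip format also land in that category.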