Multiple errors trying to run SalsaNext

jfhauris commented 3 years ago

1. Opening arch config file from /home/arl/SalsaNext/TrainEvalResources/SalsaNextTrainedModel
[Errno 20] Not a directory: '/home/arl/SalsaNext/TrainEvalResources/SalsaNextTrainedModel/arch_cfg.yaml'
Error opening arch yaml file.

I downloaded the trained model, named it "SalsaNextTrainedModel" and put it in "/home/arl/SalsaNext/TrainEvalResources/".

I do not understand why it is looking for arch_cfg.yaml there. Also, arch_cfg.yaml does not exist anywhere in the repo that I cloned.

2. I have the following, not sure this is correct:

./eval.sh
-d /home/arl/SalsaNext/TrainEvalResources/Eval-dataset # location=folder of images to test
-p /home/arl/SalsaNext/TrainEvalResources/SaveLabelPredictions # folder of where to put prediction results
-m /home/arl/SalsaNext/TrainEvalResources/SalsaNextTrainedModel # folder/name of downloaded trained model
-s valid # eval on validation set, using ```-s validation``` as per instructions does not work
-n salsanextExp1 # name of experiment
-c 0 # required to avoid error 3. below

3. ./infer.py: error: argument --monte-carlo/-c: invalid int value: '' Fixed by setting -c 0 in arg list

4. ModuleNotFoundError: No module named 'tasks.semantic.modules.SalsaNextUncertainty' Fixed by changing SalsaNextUncertainty to SalsaNextAdf at line 20 of user.py
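The `invalid int value: ''` message is what Python's argparse prints when an option declared with `type=int` receives an empty string, which can happen when a wrapper script forwards an unset shell variable. A minimal sketch of that behaviour, assuming a hypothetical parser that models only the `--monte-carlo/-c` option named in the error (the default of 30 is borrowed from the "Monte Carlo Sampling 30" line printed later in this thread; how infer.py really declares the option is not shown here):

```python
import argparse


def build_parser():
    # Hypothetical stand-in for infer.py's parser; only -c is modelled.
    parser = argparse.ArgumentParser(prog="infer.py")
    parser.add_argument("--monte-carlo", "-c", type=int, default=30)
    return parser


def parses_ok(argv):
    """Return True if argparse accepts argv, False if it exits with an error."""
    try:
        build_parser().parse_args(argv)
        return True
    except SystemExit:
        # argparse prints its error message to stderr and then exits.
        return False


if __name__ == "__main__":
    print(parses_ok(["-c", ""]))   # empty value, as from an unset variable
    print(parses_ok(["-c", "0"]))  # explicit value, the workaround used above
```

Passing an explicit `-c 0`, as done above, avoids the empty value without touching the parser.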

TiagoCortinhal commented 3 years ago

Hello! @jfhauris!

For the pretrained model, arch_cfg.yaml is the salsanext.yml file. Sorry for not having added the file (or this information to the README).

As for items 2 through 4, I will fix these small issues ASAP, but your fixes seem OK!

jfhauris commented 3 years ago

Thanks for the quick response, and sorry for being a pain here. I renamed salsanext.yml to arch_cfg.ymal and that [Errno 20] went away. However, a new [Errno 2] popped up:

Opening arch config file from /home/arl/SalsaNext/TrainEvalResources
Opening data config file from /home/arl/SalsaNext/TrainEvalResources
[Errno 2] No such file or directory: '/home/arl/SalsaNext/TrainEvalResources/data_cfg.yaml'
Error opening data yaml file.

Note: this says data_cfg.yaml and NOT data_cfg.ymal. Anyways I cannot find either? Also, can I rename the pre-trained model? How does it know the name of the model then?
It seems like "-m etc" is just a path name and does not include the model name? Is that correct? Thanks, Jon

TiagoCortinhal commented 3 years ago

Hey @jfhauris, data_cfg.yaml is also in the folder train/tasks/semantic/config (I will update the README shortly, along with the defaults of those options, so that they pick up the provided YAML files).

The pretrained model name should not be changed, no. But that can easily be updated in tasks/trainer.py, line 179 (I kept it like this because I was thinking of adding another option to quickly resume/test the best validation models, but never did).

jfhauris commented 3 years ago

Hi @TiagoCortinhal In the repo I cloned there is no data_cfg.yaml. I have:

train/tasks/semantic/config/labels/semantic-kitti.yaml and /semantic-kitti-all.yaml

Not sure where to go from here? Jon

TiagoCortinhal commented 3 years ago

Hello @jfhauris. Yes, it should be semantic-kitti.yaml (the semantic-kitti-all.yaml variant is for training the model with the moving classes, as described on the SemanticKITTI website).

Best, Tiago

jfhauris commented 3 years ago

Thanks, @TiagoCortinhal Do you mean make data_cfg.yaml = semantic-kitti.yaml ?

TiagoCortinhal commented 3 years ago

Yes, exactly! When training from scratch, the code creates a data_cfg.yaml based on semantic-kitti.yaml, which I forgot to add to the Google Drive. But yes!
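The two renames discussed so far (salsanext.yml to arch_cfg.yaml, semantic-kitti.yaml to data_cfg.yaml) can be done by copying, so the originals stay untouched. A small sketch, assuming the loader looks for both names directly inside the folder passed with `-m`; the helper name `stage_configs` is made up for illustration:

```python
import shutil
from pathlib import Path


def stage_configs(model_dir, arch_src, data_src):
    """Copy both YAML files into model_dir under the names the loader asks for."""
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        # "[Errno 20] Not a directory" appears when -m points at a file.
        raise NotADirectoryError(f"{model_dir} must be a directory")
    targets = {"arch_cfg.yaml": Path(arch_src), "data_cfg.yaml": Path(data_src)}
    for name, src in targets.items():
        shutil.copyfile(src, model_dir / name)
    # Report which config files are now present in the model folder.
    return sorted(p.name for p in model_dir.glob("*_cfg.yaml"))
```

With the paths from this thread, the call would look like `stage_configs("/home/arl/SalsaNext/TrainEvalResources", "salsanext.yml", "train/tasks/semantic/config/labels/semantic-kitti.yaml")`, where the location of salsanext.yml depends on where the download was unpacked.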

jfhauris commented 3 years ago

@TiagoCortinhal Thanks, now it is not finding the Sequences folder. I am not sure what that is, but I found this in parser.py, line 65, and I am not sure how to use it:

sequences, # sequences for this data (e.g. [1,3,4,6])

model folder exists! Using model from /home/arl/SalsaNext/TrainEvalResources
Traceback (most recent call last):
  File "./infer.py", line 143, in <module>
    user = User(ARCH, DATA, FLAGS.dataset, FLAGS.log, FLAGS.model,FLAGS.split,FLAGS.uncertainty,FLAGS.monte_carlo)
  File "../../tasks/semantic/modules/user.py", line 54, in __init__
    shuffle_train=False)
  File "../..//tasks/semantic/dataset/kitti/parser.py", line 327, in __init__
    gt=self.gt)
  File "../..//tasks/semantic/dataset/kitti/parser.py", line 105, in __init__
    raise ValueError("Sequences folder doesn't exist! Exiting...")
ValueError: Sequences folder doesn't exist! Exiting...
finishing infering.

TiagoCortinhal commented 3 years ago

I am glad that issue was fixed!

The sequences folder is the dataset itself. It should follow the structure shown here.
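A quick way to see whether a dataset folder matches the expected layout before launching inference is to walk it. The sketch below assumes the usual SemanticKITTI arrangement of `<root>/sequences/<NN>/velodyne/*.bin` with an optional `labels/*.label` folder next to it; the function name `check_sequences` is hypothetical and not part of the repository:

```python
from pathlib import Path


def check_sequences(dataset_root, sequences, need_labels=False):
    """Return a list of problems; an empty list means the layout looks fine."""
    root = Path(dataset_root) / "sequences"
    if not root.is_dir():
        return [f"missing folder: {root}"]
    problems = []
    for seq in sequences:
        seq_dir = root / f"{int(seq):02d}"
        scan_dir = seq_dir / "velodyne"
        scans = sorted(scan_dir.glob("*.bin")) if scan_dir.is_dir() else []
        if not scans:
            problems.append(f"no .bin scans in {scan_dir}")
        if need_labels:
            label_dir = seq_dir / "labels"
            labels = sorted(label_dir.glob("*.label")) if label_dir.is_dir() else []
            if len(labels) != len(scans):
                problems.append(
                    f"seq {int(seq):02d}: {len(scans)} scans but {len(labels)} labels"
                )
    return problems
```

For example, `check_sequences("/home/arl/Documents/dataset", [8], need_labels=True)` would list anything missing for the validation sequence.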

jfhauris commented 3 years ago

@TiagoCortinhal OK I re-arranged the data to conform to the picture of the data structure on the link and got WAY further!!! But I ran into this error much further down

Opening data config file config/labels/semantic-kitti.yaml
Ignoring xentropy class 0 in IoU evaluation
[IOU EVAL] IGNORE: tensor([0])
[IOU EVAL] INCLUDE: tensor([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
Traceback (most recent call last):
  File "./evaluate_iou.py", line 237, in <module>
    eval(DATA["split"][FLAGS.split],splits,FLAGS.predictions)
  File "./evaluate_iou.py", line 67, in eval
    len(label_names) == len(pred_names))
AssertionError

labels: 4071 predictions: 0

What is "eval.sh" actually trying to do? I am trying to infer a semantic image from LIDAR measurements. Will this command do that? Thanks, Jon

TiagoCortinhal commented 3 years ago

The issue is that eval.sh is trying to find the GT labels (I assume it is running inference on the validation set). If you are using your own data, I would recommend checking both the infer.py and user.py files. A quick fix is to edit semantic-kitti.yaml, define your sequence numbers as part of the test split, and pass the -s test option to eval.sh.
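The quick fix of moving sequences into the test split amounts to editing the `split` mapping of the data config. A sketch of that edit on an already loaded config, assuming the YAML has a top-level `split` key with `train`, `valid` and `test` lists (this matches the `DATA["split"][FLAGS.split]` lookup in the traceback above, but the helper `assign_to_split` itself is hypothetical); loading and saving the YAML file is left out:

```python
import copy


def assign_to_split(data_cfg, sequences, target="test"):
    """Return a copy of data_cfg in which `sequences` belong only to `target`."""
    cfg = copy.deepcopy(data_cfg)
    wanted = {int(s) for s in sequences}
    split = cfg["split"]
    # Remove the sequences from every split so none is listed twice.
    for name in list(split):
        split[name] = [s for s in split[name] if int(s) not in wanted]
    split[target] = sorted({int(s) for s in split.get(target, [])} | wanted)
    return cfg


if __name__ == "__main__":
    # Illustrative split: sequence 08 for validation, the rest as in KITTI.
    data_cfg = {"split": {
        "train": [0, 1, 2, 3, 4, 5, 6, 7, 9, 10],
        "valid": [8],
        "test": list(range(11, 22)),
    }}
    print(assign_to_split(data_cfg, [8])["split"])
```

The function returns a new dictionary and leaves the input untouched, so the original config can still be used for training.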

jfhauris commented 3 years ago

@TiagoCortinhal What are "sequence numbers" and where/how are they defined and fed into the system?

If all I wanted to do was to infer on a velodyne folder that contained a bunch of xxxxxxx.bin files, how would I do that with your code?

TiagoCortinhal commented 3 years ago

@jfhauris Did the previous run create any files in the folder you chose to save your predictions?

jfhauris commented 3 years ago

@TiagoCortinhal Yes, it did. It created the following folder structure, but the last folder (i.e. the "predictions" folder) was empty:

/home/arl/SalsaNext/TrainEvalResources/SaveLabelPredictions/sequences/xx/predictions/

where xx = 00, 01, ... , 21

TiagoCortinhal commented 3 years ago

You mentioned you ran into the error further down.

Did it show it was running inference? I mistook the above error for another one. This one would appear after running the inference and when the evaluation is trying to use the GT files.

jfhauris commented 3 years ago

It says the following, I do not know if this failed or not:

Infering in device:  cpu
Illegal instruction (core dumped)
finishing infering.
 Starting evaluating

The entire output is:


```
----------
INTERFACE:
dataset /home/arl/Documents/dataset
log /home/arl/SalsaNext/TrainEvalResources/SaveLabelPredictions
model /home/arl/SalsaNext/TrainEvalResources
Uncertainty False
Monte Carlo Sampling 30
infering valid
----------

----------

Opening arch config file from /home/arl/SalsaNext/TrainEvalResources
Opening data config file from /home/arl/SalsaNext/TrainEvalResources
train 00
train 01
train 02
train 03
train 04
train 05
train 06
train 07
train 09
train 10
valid 08
test 11
test 12
test 13
test 14
test 15
test 16
test 17
test 18
test 19
test 20
test 21
model folder exists! Using model from /home/arl/SalsaNext/TrainEvalResources
Sequences folder exists! Using sequences from /home/arl/Documents/dataset/sequences
parsing seq 00
parsing seq 01
parsing seq 02
parsing seq 03
parsing seq 04
parsing seq 05
parsing seq 06
parsing seq 07
parsing seq 09
parsing seq 10
Using 19130 scans from sequences [0, 1, 2, 3, 4, 5, 6, 7, 9, 10]
Sequences folder exists! Using sequences from /home/arl/Documents/dataset/sequences
parsing seq 08
Using 4071 scans from sequences [8]
Sequences folder exists! Using sequences from /home/arl/Documents/dataset/sequences
parsing seq 11
parsing seq 12
parsing seq 13
parsing seq 14
parsing seq 15
parsing seq 16
parsing seq 17
parsing seq 18
parsing seq 19
parsing seq 20
parsing seq 21
Using 20351 scans from sequences [11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21]
********************************************************************************
Cleaning point-clouds with kNN post-processing
kNN parameters:
knn: 5
search: 5
sigma: 1.0
cutoff: 1.0
nclasses: 20
********************************************************************************
Infering in device:  cpu
Illegal instruction (core dumped)
finishing infering.
 Starting evaluating
********************************************************************************
INTERFACE:
Data:  /home/arl/Documents/dataset
Predictions:  /home/arl/SalsaNext/TrainEvalResources/SaveLabelPredictions
Split:  valid
Config:  config/labels/semantic-kitti.yaml
Limit:  None
********************************************************************************
Opening data config file config/labels/semantic-kitti.yaml
Ignoring xentropy class  0  in IoU evaluation
[IOU EVAL] IGNORE:  tensor([0])
[IOU EVAL] INCLUDE:  tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
        19])
Traceback (most recent call last):
  File "./evaluate_iou.py", line 237, in <module>
    eval(DATA["split"][FLAGS.split],splits,FLAGS.predictions)
  File "./evaluate_iou.py", line 67, in eval
    len(label_names) == len(pred_names))
AssertionError
```

jfhauris commented 3 years ago

@TiagoCortinhal @The0nix Going through the code, it seems the problem is that it is looking for "predictions" and comparing the number of predictions to the number of label names. I have verified that it is looking at the correct folders. However, I am trying to determine the predictions in the first place, so I am apparently using this code incorrectly. I have 2 questions at this point:

1. When using eval.sh with split='valid', why is it looking for predictions, and how do I get these values?
2. All I want to do (at this point) is run inference on a velodyne input file and get out a prediction of semantic segments (hopefully as an image).

How do I do this? Jon

TiagoCortinhal commented 3 years ago

You have an Illegal instruction (core dumped), which means the inference failed. Why that is, I cannot say. You should try to run inference without the help of eval.sh to pinpoint where it is coming from.

The .sh file is simply meant to save me time, and it does not stop if any errors occur during inference: it will simply try to run the evaluation and fail.

If you use valid, the ground truth labels of the sequence are expected to exist. The test split will not have them, since we usually do not have access to them in the publicly available datasets. I cannot answer how you can get those values because I do not know what data you are using with the model. If you simply want to infer without evaluating the mIoU of the predictions, use split=test.

Keep in mind the eval.sh will still try to evaluate for the test, because as I stated before, it is a simple script meant to save me time.

And could you please tell me why there is another user tagged in this issue? Is he also facing the same problems?
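Because eval.sh carries on to the evaluation even when inference crashed, a wrapper that stops at the first failing step makes the real error easier to spot. A minimal sketch using only the Python standard library; the step commands are placeholders, and the real infer.py and evaluate_iou.py command lines would have to be filled in by hand:

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Translate a subprocess return code into a short message."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by that signal.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


def run_steps(steps):
    """Run each (name, command) pair in order and stop at the first failure."""
    for name, cmd in steps:
        result = subprocess.run(cmd)
        status = describe_exit(result.returncode)
        print(f"{name}: {status}")
        if result.returncode != 0:
            return name, status
    return None, "ok"


if __name__ == "__main__":
    # Placeholder commands standing in for inference and evaluation.
    steps = [
        ("infer", [sys.executable, "-c", "import sys; sys.exit(3)"]),
        ("evaluate", [sys.executable, "-c", "print('evaluating')"]),
    ]
    print(run_steps(steps))
```

An "Illegal instruction" crash is delivered as SIGILL, so when the inference script is launched directly this way it would be reported as killed by that signal rather than being followed by a misleading AssertionError from the evaluation.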

jfhauris commented 3 years ago

If I run with -s test, I get the exact same error. The error is that the eval method in evaluate_iou.py is failing the assertion: len(label_names) == len(scan_names) and len(label_names) == len(pred_names) So even with -s test or -s valid, it is still checking for predictions. Why?

Why do you say it is doing a core dump? It is failing in the above assertion.

Can you please provide an example of the command to run a simple inference on a KITTI sequence velodyne folder like /home/arl/Documents/dataset/sequences/09/velodyne, which holds files 000000.bin to 001590.bin (which I am assuming are LIDAR scans)?
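The failing assertion compares three counts: scans, labels and predictions. Counting them by hand before running the evaluation shows which one is off (earlier in the thread it was "labels: 4071 predictions: 0"). A sketch under the assumption that predictions are written to `<predictions_root>/sequences/<NN>/predictions/` as seen above, and that label and prediction files end in `.label`; the suffix is a parameter because the thread also mentions `xxxxxx.labels`:

```python
from pathlib import Path


def count_files(folder, suffix):
    """Count files with the given suffix; a missing folder counts as zero."""
    folder = Path(folder)
    if not folder.is_dir():
        return 0
    return sum(1 for p in folder.iterdir() if p.suffix == suffix)


def precheck(dataset_root, predictions_root, seq, pred_suffix=".label"):
    """Mirror the three-way length comparison made before IoU evaluation."""
    seq_name = f"{int(seq):02d}"
    data_seq = Path(dataset_root) / "sequences" / seq_name
    pred_seq = Path(predictions_root) / "sequences" / seq_name
    counts = {
        "scans": count_files(data_seq / "velodyne", ".bin"),
        "labels": count_files(data_seq / "labels", ".label"),
        "predictions": count_files(pred_seq / "predictions", pred_suffix),
    }
    counts["consistent"] = (
        counts["scans"] == counts["labels"] == counts["predictions"]
    )
    return counts
```

A result with `predictions` equal to 0 means the inference step produced nothing, which points back at the inference run rather than at the evaluation script.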

jfhauris commented 3 years ago

There appears to be a huge inconsistency in the code. Depending on whether I use eval.sh … or infer.py …, and whether I use -p or -l, I get very different results. Also, if I use -u True, I get different results and an “UnboundLocalError” in some cases.

Case 1: eval.sh -p … “log” is as defined by -p, and I get a prediction folder but NO values in the folder

Case 2: eval.sh -l … or with -p … & -l ... Illegal option -l Options not found

Case 3: infer.py -p … “log” is defined by some default value!!! (For example: .../logs/2020-12-18-18:41/sequences/08/predictions/xxxxxx.labels) And it has a prediction folder with prediction/label values!!! (xxxxxx.labels) !!! How do I access these xxxxxx.labels so that eval.sh uses them?

Case 4: infer.py -p … and -l … “log” is as defined by -l, and I get NO prediction folder and NO values in any folder

Case 5: infer.py -l … “log” is as defined by -l, and I get NO prediction folder and NO values in the folder
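One possible explanation for Case 3 is that infer.py only knows `--log/-l` and silently ignores `-p`, so the log folder falls back to a timestamped default. The sketch below reproduces that behaviour with argparse's `parse_known_args`; that infer.py parses its arguments this way is an assumption, and the parser here is hypothetical, modelling only the log option with a default shaped like the `/home/arl/logs/2020-12-18-17:59/` path shown in this thread:

```python
import argparse
import datetime
import os


def build_parser():
    # Hypothetical parser: only --log/-l is modelled.
    default_log = os.path.join(
        os.path.expanduser("~"), "logs",
        datetime.datetime.now().strftime("%Y-%m-%d-%H:%M"), "")
    parser = argparse.ArgumentParser(prog="infer.py")
    parser.add_argument("--log", "-l", type=str, default=default_log)
    return parser


def resolve_log(argv):
    """Return (log_dir, ignored_args) as parse_known_args sees them."""
    flags, unknown = build_parser().parse_known_args(argv)
    return flags.log, unknown


if __name__ == "__main__":
    print(resolve_log(["-p", "/tmp/preds"]))  # -p is not recognised
    print(resolve_log(["-l", "/tmp/preds"]))  # -l sets the log folder
```

If this is what happens, calling infer.py directly needs `-l` for the output folder, while eval.sh accepts `-p` and presumably forwards it as the log folder.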

===========================================================================
Full details of the output follow. If I run the following cases, I get the associated results.

Case 1: eval.sh -p ... Notice that “log” is as defined by -p, and I get a prediction folder but NO values in the folder


----------
INTERFACE:
dataset /home/arl/Documents/dataset
log /home/arl/SalsaNext/TrainEvalResources/SaveLabelPredictions
model /home/arl/SalsaNext/TrainEvalResources
Uncertainty False
Monte Carlo Sampling 30
infering train
----------

Case 2: eval.sh -l …  or with -p … & -l ...
```
./eval.sh -d /home/arl/Documents/dataset -p /home/arl/SalsaNext/TrainEvalResources/SaveLabelPredictions -m /home/arl/SalsaNext/TrainEvalResources/ -s train -n salsanextExp1 -c 30 -l /home/arl/SalsaNext/TrainEvalResources/SaveLabelPredictions-EXP
Illegal option -l
Options not found
```

Case 3: infer.py -p …
Notice that “log” is defined by some default value!!! 
(For example:
.../logs/2020-12-18-18:41/sequences/08/predictions/xxxxxx.labels)
And it has a prediction folder with prediction/label values!!!
```
(salsanext) arl@irb3-5571I:~/SalsaNext/train/tasks/semantic$ python ./infer.py -d /home/arl/Documents/dataset -p /home/arl/SalsaNext/TrainEvalResources/SaveLabelPredictions -m /home/arl/SalsaNext/TrainEvalResources/ -s valid  -n salsanextExp1 -c 30
----------
INTERFACE:
dataset /home/arl/Documents/dataset
log /home/arl/logs/2020-12-18-17:59/
model /home/arl/SalsaNext/TrainEvalResources/
Uncertainty False
Monte Carlo Sampling 30
infering valid
----------
```

Case 4: infer.py -p … and -l …
Notice that “log” is as defined by -l, and I get NO prediction folder and NO values in any folder
```
(salsanext) arl@irb3-5571I:~/SalsaNext/train/tasks/semantic$ python ./infer.py -d /home/arl/Documents/dataset -p /home/arl/SalsaNext/TrainEvalResources/SaveLabelPredictions -m /home/arl/SalsaNext/TrainEvalResources/ -s valid  -n salsanextExp1 -c 30 -l /home/arl/SalsaNext/TrainEvalResources/SaveLabelPredictions-EXP
----------
INTERFACE:
dataset /home/arl/Documents/dataset
log /home/arl/SalsaNext/TrainEvalResources/SaveLabelPredictions-EXP
model /home/arl/SalsaNext/TrainEvalResources/
Uncertainty False
Monte Carlo Sampling 30
infering valid
----------
```

Case 5: infer.py -l …
“log” is defined by -l, but I get no prediction folder and no data
```
(salsanext) arl@irb3-5571I:~/SalsaNext/train/tasks/semantic$ python ./infer.py -d /home/arl/Documents/dataset  -m /home/arl/SalsaNext/TrainEvalResources/ -s valid  -n salsanextExp1 -c 30 -l /home/arl/SalsaNext/TrainEvalResources/SaveLabelPredictions-EXP
----------
INTERFACE:
dataset /home/arl/Documents/dataset
log /home/arl/SalsaNext/TrainEvalResources/SaveLabelPredictions-EXP
model /home/arl/SalsaNext/TrainEvalResources/
Uncertainty False
Monte Carlo Sampling 30
infering valid
----------
```

Case 6: infer.py -l …    -u True
“log” is defined by -l and get some weird “UnboundLocalError”
```
(salsanext) arl@irb3-5571I:~/SalsaNext/train/tasks/semantic$ python ./infer.py -d /home/arl/Documents/dataset  -m /home/arl/SalsaNext/TrainEvalResources/ -s valid  -n salsanextExp1 -c 30 -l /home/arl/SalsaNext/TrainEvalResources/SaveLabelPredictions-EXP -u True
----------
INTERFACE:
dataset /home/arl/Documents/dataset
log /home/arl/SalsaNext/TrainEvalResources/SaveLabelPredictions-EXP
model /home/arl/SalsaNext/TrainEvalResources/
Uncertainty True
Monte Carlo Sampling 30
infering valid
----------
```

```
Traceback (most recent call last):
  File "./infer.py", line 146, in <module>
    user.infer()
  File "../../tasks/semantic/modules/user.py", line 107, in infer
    to_orig_fn=self.parser.to_original, cnn=cnn, knn=knn)
  File "../../tasks/semantic/modules/user.py", line 165, in infer_subset
    proj_argmax,
UnboundLocalError: local variable 'proj_argmax' referenced before assignment
```
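An `UnboundLocalError` of this kind typically means a local variable is assigned on only one code path and then used on every path. The following illustration is not the code from user.py: it borrows the name `proj_argmax` from the traceback, and the branch on an `uncertainty` flag is a guess based on the error appearing only with `-u True`:

```python
def pick_prediction(uncertainty):
    # Illustrative only: the branch structure is assumed, not taken from user.py.
    if not uncertainty:
        proj_argmax = [0, 1, 2]  # assigned only on this path
    return proj_argmax  # fails when the branch above was skipped


def pick_prediction_fixed(uncertainty):
    # Assign on every path so the later use is always defined.
    if uncertainty:
        proj_argmax = [2, 1, 0]  # placeholder for the uncertainty branch result
    else:
        proj_argmax = [0, 1, 2]
    return proj_argmax


if __name__ == "__main__":
    try:
        pick_prediction(True)
    except UnboundLocalError as err:
        print("reproduced:", err)
    print(pick_prediction_fixed(True))
```

Checking which branch of `infer_subset` assigns `proj_argmax` when uncertainty is enabled would be the place to start.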

TiagoCortinhal / SalsaNext

Multiple errors trying to run SalsaNext #42