ZhouYC-X closed this issue 9 months ago
Sorry, I could not reproduce this problem.
OpenLane-V2 Score - 0.35960268760767244
DET_l - 0.2435305118560791
DET_t - 0.5265672206878662
TOP_ll - 0.06714055449349493
TOP_lt - 0.1674430632156414
F-Score for 3D Lane - 0.13894095268823822
{'OpenLane-V2 Score': 0.35960268760767244, 'DET_l': 0.24353051, 'DET_t': 0.5265672, 'TOP_ll': 0.06714055449349493, 'TOP_lt': 0.1674430632156414}
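As a sanity check on the numbers above, the overall score appears to be the mean of DET_l, DET_t and the square roots of TOP_ll and TOP_lt. That formula is an assumption inferred from the logged values, and the function name below is hypothetical, not part of the openlanev2 package. A minimal sketch:

```python
import math


def openlane_v2_score(det_l, det_t, top_ll, top_lt):
    """Average the two detection scores with the square-rooted topology scores.

    Hypothetical helper: assumes the overall score is
    (DET_l + DET_t + sqrt(TOP_ll) + sqrt(TOP_lt)) / 4.
    """
    return (det_l + det_t + math.sqrt(top_ll) + math.sqrt(top_lt)) / 4


# Values copied from the evaluation output above.
reference = openlane_v2_score(
    det_l=0.2435305118560791,
    det_t=0.5265672206878662,
    top_ll=0.06714055449349493,
    top_lt=0.1674430632156414,
)
print(round(reference, 4))
```

Under that assumption the result rounds to 0.3596, which agrees with the logged OpenLane-V2 Score, so recomputing it from the four sub-metrics is a quick way to confirm a log has not been mixed up.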
Can you provide the full log of the evaluation procedure?
Thank you for your prompt response.
This is the log of the first evaluation failure.
WARNING:__main__:*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
*****************************************
load checkpoint from local path: ckpt/toponet_r50_8x1_24e_olv2_subset_B.pth
load checkpoint from local path: ckpt/toponet_r50_8x1_24e_olv2_subset_B.pth
load checkpoint from local path: ckpt/toponet_r50_8x1_24e_olv2_subset_B.pth
load checkpoint from local path: ckpt/toponet_r50_8x1_24e_olv2_subset_B.pth
[ ] 0/6019, elapsed: 0s, ETA:/opt/anaconda3/lib/python3.8/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at ../aten/src/ATen/native/BinaryOps.cpp:467.)
return torch.floor_divide(self, other)
[ ] 1/6019, 0.0 task/s, elapsed: 72s, ETA: 435457s
[ ] 2/6019, 0.0 task/s, elapsed: 72s, ETA: 217694s
[ ] 3/6019, 0.0 task/s, elapsed: 72s, ETA: 145105s
[ ] 4/6019, 0.1 task/s, elapsed: 72s, ETA: 108811s/opt/anaconda3/lib/python3.8/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at ../aten/src/ATen/native/BinaryOps.cpp:467.)
return torch.floor_divide(self, other)
/opt/anaconda3/lib/python3.8/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at ../aten/src/ATen/native/BinaryOps.cpp:467.)
return torch.floor_divide(self, other)
[ ] 5/6019, 0.1 task/s, elapsed: 73s, ETA: 87223s
[ ] 6/6019, 0.1 task/s, elapsed: 73s, ETA: 72674s
[ ] 7/6019, 0.1 task/s, elapsed: 73s, ETA: 62281s
[ ] 8/6019, 0.1 task/s, elapsed: 73s, ETA: 54487s
[ ] 9/6019, 0.1 task/s, elapsed: 73s, ETA: 48508s
[ ] 10/6019, 0.1 task/s, elapsed: 73s, ETA: 43650s
[ ] 11/6019, 0.2 task/s, elapsed: 73s, ETA: 39675s
[ ] 12/6019, 0.2 task/s, elapsed: 73s, ETA: 36363s/opt/anaconda3/lib/python3.8/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at ../aten/src/ATen/native/BinaryOps.cpp:467.)
return torch.floor_divide(self, other)
[ ] 13/6019, 0.2 task/s, elapsed: 73s, ETA: 33621s
[ ] 14/6019, 0.2 task/s, elapsed: 73s, ETA: 31214s
[>>>>>>>>>>>>>>>>>>>>>>>>> ] 6014/6019, 25.3 task/s, elapsed: 238s, ETA: 0s
[>>>>>>>>>>>>>>>>>>>>>>>>> ] 6015/6019, 25.3 task/s, elapsed: 238s, ETA: 0s
[>>>>>>>>>>>>>>>>>>>>>>>>> ] 6016/6019, 25.3 task/s, elapsed: 238s, ETA: 0s
2024-01-23 14:31:20,166 - mmdet - INFO - Starting format results...
2024-01-23 14:39:55,273 - mmdet - INFO - Starting openlanev2 evaluate...
Traceback (most recent call last):
File "tools/test.py", line 266, in <module>
main()
File "tools/test.py", line 262, in main
print(dataset.evaluate(outputs, **eval_kwargs))
File "/code/TopoLane/TopoNet-main/projects/toponet/datasets/openlanev2_subset_A_dataset.py", line 364, in evaluate
metric_results = openlanev2_evaluate(gt_dict, pred_dict)
File "/code/TopoLane/OpenLane-V2/openlanev2/evaluation/evaluate.py", line 561, in evaluate
preds[token] = predictions[token]['predictions']
KeyError: ('val', '11149', '1542799760912460')
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 105818) of binary: /opt/anaconda3/bin/python
/opt/anaconda3/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py:367: UserWarning:
**********************************************************************
CHILD PROCESS FAILED WITH NO ERROR_FILE
**********************************************************************
CHILD PROCESS FAILED WITH NO ERROR_FILE
Child process 105818 (local_rank 0) FAILED (exitcode 1)
Error msg: Process failed with exitcode 1
Without writing an error file to <N/A>.
While this DOES NOT affect the correctness of your application,
no trace information about the error will be available for inspection.
Consider decorating your top level entrypoint function with
torch.distributed.elastic.multiprocessing.errors.record. Example:
from torch.distributed.elastic.multiprocessing.errors import record
@record
def trainer_main(args):
# do train
**********************************************************************
warnings.warn(_no_error_file_warning_msg(rank, failure))
Traceback (most recent call last):
File "/opt/anaconda3/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/opt/anaconda3/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/opt/anaconda3/lib/python3.8/site-packages/torch/distributed/run.py", line 702, in <module>
main()
File "/opt/anaconda3/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 361, in wrapper
return f(*args, **kwargs)
File "/opt/anaconda3/lib/python3.8/site-packages/torch/distributed/run.py", line 698, in main
run(args)
File "/opt/anaconda3/lib/python3.8/site-packages/torch/distributed/run.py", line 689, in run
elastic_launch(
File "/opt/anaconda3/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 116, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/anaconda3/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 244, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
***************************************
tools/test.py FAILED
=======================================
Root Cause:
[0]:
time: 2024-01-23_14:40:04
rank: 0 (local_rank: 0)
exitcode: 1 (pid: 105818)
error_file: <N/A>
msg: "Process failed with exitcode 1"
=======================================
Other Failures:
<NO_OTHER_FAILURES>
***************************************
Based on the error message in the log, I added the following check to OpenLane-V2/openlanev2/evaluation/evaluate.py:
for token in ground_truth.keys():
    if token not in predictions.keys():  # to fix line 561 error
        continue
    gts[token] = ground_truth[token]['annotation']
This produced the second evaluation log, shown below.
WARNING:__main__:*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
*****************************************
load checkpoint from local path: ckpt/toponet_r50_8x1_24e_olv2_subset_B.pth
load checkpoint from local path: ckpt/toponet_r50_8x1_24e_olv2_subset_B.pth
load checkpoint from local path: ckpt/toponet_r50_8x1_24e_olv2_subset_B.pth
load checkpoint from local path: ckpt/toponet_r50_8x1_24e_olv2_subset_B.pth
[ ] 0/6019, elapsed: 0s, ETA:/opt/anaconda3/lib/python3.8/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at ../aten/src/ATen/native/BinaryOps.cpp:467.)
return torch.floor_divide(self, other)
/opt/anaconda3/lib/python3.8/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at ../aten/src/ATen/native/BinaryOps.cpp:467.)
return torch.floor_divide(self, other)
/opt/anaconda3/lib/python3.8/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at ../aten/src/ATen/native/BinaryOps.cpp:467.)
return torch.floor_divide(self, other)
/opt/anaconda3/lib/python3.8/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at ../aten/src/ATen/native/BinaryOps.cpp:467.)
return torch.floor_divide(self, other)
[ ] 1/6019, 0.0 task/s, elapsed: 70s, ETA: 421610s
[ ] 2/6019, 0.0 task/s, elapsed: 70s, ETA: 210771s
[>>>>>>>>>>>>>>>>>>>>>>>>> ] 6015/6019, 25.8 task/s, elapsed: 233s, ETA: 0s
[>>>>>>>>>>>>>>>>>>>>>>>>> ] 6016/6019, 25.8 task/s, elapsed: 233s, ETA: 0s
2024-01-23 14:53:24,340 - mmdet - INFO - Starting format results...
2024-01-23 15:01:46,309 - mmdet - INFO - Starting openlanev2 evaluate...
len(gkeys):6019, len(pkeys):6016
len(gts.keys()):6016, len(preds.keys()):6016
calculating distances:: 0%| | 0/6016 [00:00<?, ?it/s]
calculating distances:: 0%| | 1/6016 [00:00<28:39, 3.50it/s]
calculating distances:: 100%|███████████████| 6016/6016 [26:23<00:00, 3.54it/s]
calculating distances:: 100%|███████████████| 6016/6016 [26:23<00:00, 3.80it/s]
/opt/anaconda3/lib/python3.8/site-packages/scipy/interpolate/_interpolate.py:641: RuntimeWarning: divide by zero encountered in true_divide
slope = (y_hi - y_lo) / (x_hi - x_lo)[:, None]
/opt/anaconda3/lib/python3.8/site-packages/scipy/interpolate/_interpolate.py:641: RuntimeWarning: invalid value encountered in true_divide
slope = (y_hi - y_lo) / (x_hi - x_lo)[:, None]
OpenLane-V2 Score - 0.19636668752454212
DET_l - 0.13499410450458527
DET_t - 0.1379755288362503
TOP_ll - 0.043555243510227305
TOP_lt - 0.09229333809205244
F-Score for 3D Lane - 0.10671664825889586
{'OpenLane-V2 Score': 0.19636668752454212, 'DET_l': 0.1349941, 'DET_t': 0.13797553, 'TOP_ll': 0.043555243510227305, 'TOP_lt': 0.09229333809205244}
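The key counts in the log above (6019 ground-truth tokens against 6016 predicted ones) mean that skipping missing tokens lets the evaluation finish while hiding which frames have no prediction. A minimal sketch of a diagnostic that lists them instead is shown here. The function name and the toy dictionaries are hypothetical, and it only assumes that both inputs are dictionaries keyed by token tuples like the one in the KeyError.

```python
def compare_tokens(ground_truth, predictions):
    """Return (missing, unexpected) token lists, both sorted.

    missing:    tokens present in ground_truth but absent from predictions
    unexpected: tokens present in predictions but absent from ground_truth
    """
    gt_tokens = set(ground_truth)
    pred_tokens = set(predictions)
    missing = sorted(gt_tokens - pred_tokens)
    unexpected = sorted(pred_tokens - gt_tokens)
    return missing, unexpected


# Toy data for illustration only; the first token is the one from the KeyError,
# the second is made up.
toy_ground_truth = {
    ('val', '11149', '1542799760912460'): {'annotation': {}},
    ('val', 'toy_segment', 'toy_timestamp'): {'annotation': {}},
}
toy_predictions = {
    ('val', 'toy_segment', 'toy_timestamp'): {'predictions': {}},
}

missing, unexpected = compare_tokens(toy_ground_truth, toy_predictions)
print(f'{len(missing)} missing, {len(unexpected)} unexpected')
for token in missing:
    print('no prediction for', token)
```

Printing the missing tokens before evaluation makes it easier to tell whether a few frames were dropped during inference or whether the predictions and the ground truth come from different splits.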
I downloaded the OpenLane-V2 data from OpenDataLab and set up the environment following the TopoNet and OpenLane-V2 repositories. I also just ran the subset_A val set, and there is a performance gap there too, which is very strange.
I wonder if you could provide your conda environment information so that I can reconfigure and run the tests again?
I checked the first failure log and I am using the correct config files (I had actually modified the log to hide some personal information).
File "/code/TopoLane/TopoNet-main/projects/toponet/datasets/openlanev2_subset_A_dataset.py", line 364
I set up a clean environment again and reproduced the results reported in the repo. I suspect the issue was caused by conflicts between some dependent packages or by multiple installations of openlanev2.
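For anyone hitting the same symptom, one way to check for a duplicate or stale installation is to ask Python which file it would import and which distributions are registered under that name. This is a minimal sketch using only the standard library. The helper names are hypothetical, and it assumes the package is distributed under the name openlanev2, which may not match your setup.

```python
import importlib.util
from importlib import metadata


def locate_module(name):
    """Return the file Python would import for a top-level module, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def installed_distributions(name):
    """List (name, version) for every installed distribution matching `name`."""
    wanted = name.lower().replace('_', '-')
    found = []
    for dist in metadata.distributions():
        dist_name = dist.metadata.get('Name') or ''
        if dist_name.lower().replace('_', '-') == wanted:
            found.append((dist_name, dist.version))
    return found


# 'openlanev2' as the distribution name is an assumption.
print(locate_module('openlanev2'))
print(installed_distributions('openlanev2'))
```

If the printed path points somewhere other than the checkout you are editing, or more than one distribution is listed, the evaluation may be running code from a different copy than the one you expect.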
Thank you for your time.
Thank you for your great work. Recently, I downloaded the ckpt file for subset_B and evaluated it on the subset_B val set. However, the performance differs significantly from what is reported in the repo.
I checked the relevant dependencies for evaluation but have not found the cause. Could you please offer some advice? Looking forward to your reply. Thank you.