Closed liangjxiong closed 2 years ago
@liangjxiong Hi! Could you give some details, e.g., which model are you using and which layer/part of the network gives that error?
Thank you! I have solved the above problems, but I have encountered new problems. The run command I use is python main_landet.py --train --config=configs/lane_detection/scnn/resnet18_tusimple.py --mixed-precision. The following error occurred:
[1, 1] training loss: 1.9832
[1, 1] loss seg: 1.9147
[1, 1] loss exist: 0.6852
[1, 2] training loss: 1.9865
[1, 2] loss seg: 1.9177
[1, 2] loss exist: 0.6883
[1, 3] training loss: 1.3278
[1, 3] loss seg: 1.2590
[1, 3] loss exist: 0.6879
[1, 4] training loss: 0.6099
[1, 4] loss seg: 0.5421
[1, 4] loss exist: 0.6780
[1, 5] training loss: 0.9194
[1, 5] loss seg: 0.8511
[1, 5] loss exist: 0.6827
Traceback (most recent call last):
File "/data/disk1/liangjxiong/pycharm_project/pytorch-auto-drive/pytorch-auto-drive/main_landet.py", line 65, in
My labels cover rows 5 to 715, not rows 240 to 710. Can the model accept this change?
@liangjxiong It seems your dataset does not load tensors of the same dimensions for each image. As for 5-715, it should work correctly if you provide the start, ppl, and end info correctly in your Dataset class.
The resolution of image and label is 1280 * 720. But some pictures contain four lane lines, and some pictures have three lane lines. Will these have an impact?
FYI, tensors like these of course can't simply be batched. That is why all our non-segmentation methods use a dict to contain labels, apply dict_collate_fn, and stack GT in the loss class. However, if you are using SCNN, the segmentation labels should all be in the same format. Does your lane_existence GT have different shapes?
Sorry, I haven't studied this in depth. What is the format of the segmentation labels? What does "lane_existence GT have different shapes" mean? Our lanes are all straight lines. What should I do now?
@liangjxiong If you refer to the TuSimple/CULane/LLAMAS labeling, you will find that the labels have two parts for each image. 1. The segmentation mask, which is H x W x (C + 1), where C is the max possible number of lanes counted in pairs (we'll get to that later) and the extra 1 is for background; in practice it is often stored as H x W x 1. 2. The lane existence classification label, which marks the existence of each lane class and has length C. C is the same for every image; if an image doesn't have C lanes, it simply has 0 in the labels of the missing lanes.
Note that C is not simply the maximal number of lanes that appear in an image. Lanes are classified as ego lanes (2 of them), immediate left and right lanes (another 2), and you can add more pairs (for instance, TuSimple considers 6). For your case, it is probably 4.
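To make the fixed-length existence label concrete, here is a minimal sketch. The function name `existence_flags` and the 0-based lane class ids are assumptions for illustration only, not the repository's API. It shows how an image with fewer than C lanes still gets a label of length C, with 0 for the missing classes.

```python
def existence_flags(present_lane_ids, num_lane_classes):
    """Return a fixed-length 0/1 list: 1 where a lane class is present.

    Assumption: lane classes are numbered 0 .. num_lane_classes - 1.
    """
    flags = [0] * num_lane_classes
    for lane_id in present_lane_ids:
        if not 0 <= lane_id < num_lane_classes:
            raise ValueError(f"lane id {lane_id} outside 0..{num_lane_classes - 1}")
        flags[lane_id] = 1
    return flags


# With C = 4: one image with all four lanes, one image missing the last lane class
print(existence_flags([0, 1, 2, 3], 4))
print(existence_flags([0, 1, 2], 4))
```

Because both labels have length 4, they can be stacked into one batch tensor even though the images contain different numbers of lanes.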
train.txt file format: train / image / 345 1 1 1 0 0 0 Is that right?
@liangjxiong It does seem quite correct, as long as all your images have the same 6 existence flags at the end.
You can add some print or debug statements around here to see whether some line in train.txt has parsing issues or 7 flags.
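A quick way to run that check outside the training code is a small standalone script. This is only a sketch: the helper name `find_bad_lines` is hypothetical, and it assumes each line is an image name followed by whitespace-separated 0/1 flags, as in the example line above.

```python
def find_bad_lines(lines, num_flags=6):
    """Return (line_number, reason) pairs for lines that do not parse."""
    problems = []
    for number, line in enumerate(lines, start=1):
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        flags = parts[1:]  # assumption: the first token is the image name
        if len(flags) != num_flags:
            problems.append((number, f"expected {num_flags} flags, got {len(flags)}"))
        elif any(flag not in ("0", "1") for flag in flags):
            problems.append((number, "flags must be 0 or 1"))
    return problems


sample = [
    "train/image/345 1 1 1 0 0 0",
    "train/image/346 1 1 1 1 0 0 0",  # 7 flags
    "train/image/347 1 1 x 0 0 0",    # unparsable flag
]
for number, reason in find_bad_lines(sample):
    print(number, reason)
```

To check a real list file, pass an open file object instead of `sample`, since iterating over a file yields its lines.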
Thank you! I can use my dataset to train the model.
Sounds great! Since you resolved the problem, I'll close this issue. But do feel free to reopen.
Hello! I have a new problem. The number of rows in my own dataset is not 56, but the prediction output of the model has 56 rows, so a format error occurs when running the TuSimple test script. How can I modify the code so that the predicted output conforms to the number of rows I set? Thank you!
Now the output of the model has the number of rows I want, but it still starts from row 160. I want it to start from row 5 with an interval of 10. How should I modify it? Thank you!
@liangjxiong Do you mean 5, 160, and 10 in pixels?
You can search for things like 160, or if dataset == 'tusimple'.
For instance, they are set in these places for tusimple testing:
I'd suggest adding your own customized code (an elif branch) for your customized dataset, although modifying the tusimple code is also fine.
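The elif suggestion could look roughly like the sketch below. The function `row_anchors` and the dataset name `'my_dataset'` are hypothetical and do not come from the repository; the numbers come from this thread (TuSimple: 56 rows starting at 160 with a 10-pixel interval; the custom dataset: rows 5 to 715 with the same interval).

```python
def row_anchors(dataset):
    """Return the image rows (in pixels) at which lane points are sampled."""
    if dataset == 'tusimple':
        start, end, step = 160, 710, 10   # 56 rows
    elif dataset == 'my_dataset':         # hypothetical custom dataset name
        start, end, step = 5, 715, 10     # 72 rows
    else:
        raise ValueError(f'unknown dataset: {dataset}')
    return list(range(start, end + 1, step))


print(len(row_anchors('tusimple')), len(row_anchors('my_dataset')))
```

Keeping the custom values in their own branch leaves the TuSimple settings untouched, so the original evaluation still works.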
Thank you! The above problems have been solved. Can I test the inference speed of the model?
@liangjxiong You can refer to BENCHMARK.md for speed testing.
I ran: python tools/profiling.py --mode=simple --config=configs/lane_detection/scnn/resnet18_tusimple.py --times=3 --height=720 --width=1280
The following error occurred:
Traceback (most recent call last):
File "tools/profiling.py", line 39, in
    cfg = read_config(args.config)
  File "/data/disk1/liangjxiong/pycharm_project/pytorch-auto-drive/pytorch-auto-drive/utils/args.py", line 57, in read_config
    module = SourceFileLoader(module_name, config_path).load_module()
  File " ", line 399, in _check_name_wrapper
  File " ", line 823, in load_module
  File " ", line 682, in load_module
  File " ", line 265, in _load_module_shim
  File " ", line 684, in _load
  File " ", line 665, in _load_unlocked
  File " ", line 678, in exec_module
  File " ", line 219, in _call_with_frames_removed
  File "configs/lane_detection/scnn/resnet18_tusimple.py", line 2, in
    from configs.lane_detection.common.datasets.tusimple_seg import dataset
ModuleNotFoundError: No module named 'configs.lane_detection'
It seems you are the second one to have this issue; I will look into it again later on. In the meantime, try moving tools/profiling.py to profiling.py.
Thank you! I'm looking forward to it.
@liangjxiong I still can't figure out why loading Python files fails in certain environments; in fact, I built new envs and they all work fine. However, I do have a recommended solution (other than copying files out of ./tools):
export PYTHONPATH=$PWD:$PYTHONPATH
Execute that when you are in the pytorch-auto-drive folder and everything should be fine.
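For context, when a script is started as python tools/profiling.py, Python puts the script's own directory (tools) at the front of sys.path, so top-level packages in the repository root are only importable if the root reaches the search path some other way, such as through PYTHONPATH. The sketch below does a similar thing from inside Python. The helper `add_repo_root_to_path` is hypothetical and is not necessarily how the repository handles it.

```python
import os
import sys


def add_repo_root_to_path(script_file, levels_up=1):
    """Prepend the directory `levels_up` above the script's folder to sys.path."""
    root = os.path.dirname(os.path.abspath(script_file))
    for _ in range(levels_up):
        root = os.path.dirname(root)
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


# For a script at <repo>/tools/profiling.py this yields <repo>
root = add_repo_root_to_path(os.path.join("pytorch-auto-drive", "tools", "profiling.py"))
print(root)
```

Inside a real script one would pass `__file__` as `script_file`, before any import that depends on the repository root.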
If anyone figures this out, please post here and let everyone know.
This should be fixed by #86. Feel free to test and reopen if the problem persists.
Hello! After I ran the command export PYTHONPATH=$PWD:$PYTHONPATH, a new problem appeared!
(pad) xianjin@xianjin-W580-G20:/data/disk1/liangjxiong/pycharm_project/pytorch-auto-drive/pytorch-auto-drive$ python tools/profiling.py --mode=simple --config=configs/lane_detection/scnn/resnet18_tusimple.py --times=3 --height=720 --width=1280
Loaded torchvision ImageNet pre-trained weights V1.
cuda:0
torch.float32
Traceback (most recent call last):
File "tools/profiling.py", line 55, in
@liangjxiong Your profiling height and width may not be aligned with the model config. I think it should be h=360, w=640.
It runs now. The TuSimple dataset is h=720, w=1280; is it downscaled by half before being fed into the model?
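Yes. That is the default input size everyone uses.
I understand, thank you!
Hello! I just updated to your latest code, and the following error appeared.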
(pad) xianjin@xianjin-W580-G20:/data/disk1/liangjxiong/pycharm_project/pytorch-auto-drive/pytorch-auto-drive$ CUDA_VISIBLE_DEVICES=1 python main_landet.py --train --config=configs/lane_detection/scnn/resnet18_tusimple.py --mixed-precision
Loaded torchvision ImageNet pre-trained weights V1.
Not using distributed mode
cuda
Traceback (most recent call last):
File "main_landet.py", line 64, in
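@liangjxiong It seems I missed a commit. Could you try the current master?
Thank you so much! It runs normally!
Hello! With the same dataset as before, SCNN runs fine, but when I run python main_landet.py --train --config=configs/lane_detection/lstr/resnet18s_tusimple.py, the following error occurs: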
Not using distributed mode
cuda
Loading targets into memory...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 227/227 [00:00<00:00, 643.67it/s]
Traceback (most recent call last):
File "main_landet.py", line 65, in
    runner.run()
  File "/data/disk1/liangjxiong/pycharm_project/pytorch-auto-drive/pytorch-auto-drive/utils/runners/lane_det_trainer.py", line 52, in run
    self.model)
  File "/home/xianjin/anaconda3/envs/pad/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data/disk1/liangjxiong/pycharm_project/pytorch-auto-drive/pytorch-auto-drive/utils/losses/hungarian_loss.py", line 124, in forward
    loss, log_dict = self.calc_full_loss(outputs=outputs, targets=targets)
  File "/data/disk1/liangjxiong/pycharm_project/pytorch-auto-drive/pytorch-auto-drive/utils/losses/hungarian_loss.py", line 136, in calc_full_loss
    indices = self.matcher(outputs=outputs, targets=targets)
  File "/home/xianjin/anaconda3/envs/pad/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/xianjin/anaconda3/envs/pad/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "/data/disk1/liangjxiong/pycharm_project/pytorch-auto-drive/pytorch-auto-drive/utils/losses/hungarian_loss.py", line 71, in forward
    norm_weights, valid_points = lane_normalize_in_batch(target_keypoints)  # G, G x N
  File "/home/xianjin/anaconda3/envs/pad/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "/data/disk1/liangjxiong/pycharm_project/pytorch-auto-drive/pytorch-auto-drive/utils/losses/hungarian_loss.py", line 24, in lane_normalize_in_batch
    norm_weights /= norm_weights.max()
RuntimeError: operation does not have an ident
Let me verify tomorrow whether that is a bug.
Thank you! I have solved it! It was a problem with my dataset settings. Your code is fine.
-_- OK, then I'll close this.
May I ask how you solved it? I ran into the same problem; I get the error at this point when running LSTR.
Hello, I trained with my own dataset, and the following error occurred: Target size (torch.Size([20, 4])) must be the same as input size (torch.Size([20, 6])). How should I solve this problem? Thank you!