Double-zh / ByteTrack

ByteTrack ultra-detailed tutorial!!! Train on your own dataset (VOC format) && real-time camera detection and tracking
MIT License

Processing frame 0 error. #1

Open ImSuMyatNoe opened 3 years ago

ImSuMyatNoe commented 3 years ago

I trained YOLOX with the JSON annotation format and got a best_ckpt.pth file, which I then loaded into ByteTrack. It does not show any error, but it stalls at frame 0, and I do not know what to do next. Could you please take a look?

2021-11-15 15:09:45.537 | INFO | main:imageflow_demo:250 - Processing frame 0 (100000.00 fps)

199734 commented 3 years ago

Please check whether your camera is successfully detected and turned on.

ImSuMyatNoe commented 3 years ago

@199734 I am running on a video file, not a live camera.

Double-zh commented 3 years ago

Uncomment line 227 of the code, and then comment out line 228.

ImSuMyatNoe commented 3 years ago

cap = cv2.VideoCapture(args.path if args.demo == "video" else args.camid)
# cap = cv2.VideoCapture(1)

@Double-zh Yes, I already ran it with this setting. Could the error depend on whether the file is "best_ckpt.pth.tar" or "best_ckpt.pth"? I ran it with the "best_ckpt.pth" file output by YOLOX.

Double-zh commented 3 years ago

They are the same. You can choose the format to save during training.

Double-zh commented 3 years ago

@Double-zh I have attached a screenshot of the error (a PNG file) at the following link. Could you please take a look? https://drive.google.com/file/d/1Q9GhaPG5W0SNDsgXDE-peE-34kFTMbdO/view?usp=sharing

Please make sure the file name of your video and its path are correct.
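
A minimal sketch of that check, assuming the capture source is chosen the same way as in the `cv2.VideoCapture(args.path if args.demo == "video" else args.camid)` line quoted in this thread. The helper name `resolve_capture_source` is hypothetical and not part of the repo. It fails loudly on a bad path, since `cv2.VideoCapture` itself does not raise when the file is missing.

```python
import os


def resolve_capture_source(demo, path, camid):
    """Return the value to pass to cv2.VideoCapture (hypothetical helper)."""
    if demo == "video":
        # cv2.VideoCapture does not raise on a missing file, so check first.
        if not os.path.isfile(path):
            raise FileNotFoundError(f"Video file not found: {path}")
        return path
    # Any other mode falls back to the camera index, as in the quoted line.
    return camid
```

The returned value would then be passed to `cv2.VideoCapture(...)`; checking `cap.isOpened()` right afterwards is a further safeguard.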

ImSuMyatNoe commented 3 years ago

@Double-zh Yea, these are all correct.

Double-zh commented 3 years ago

I have tested on multiple computers without any problem. Please repeat the steps in the README.

ImSuMyatNoe commented 3 years ago

@Double-zh I found the problem: it was an error in my path. However, when I use the .pth.tar file, the tracking result is produced, but when I use the .pth file alone, it is not. Do you know of any difference between them? Is there a place in the code where I can change the .pth.tar format used during training? I looked but have not found it yet. Could you shed some light on this? Thank you very much for your awesome work. It is much easier than before. :)

Double-zh commented 3 years ago

You can find and change it on lines 40 and 43 of checkpoint.py (in yolox/utils/). Models saved in either format after training can be used for detection and tracking.
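
For orientation, a sketch of how a YOLOX-style checkpoint saver might derive its file names, which is where the suffix would be changed. This is an assumption about the general shape of `checkpoint.py`, not a copy of it; `checkpoint_paths` and its `suffix` parameter are hypothetical. The suffix only affects the file name: `torch.save` writes the same serialized data whichever extension is used.

```python
import os


def checkpoint_paths(save_dir, model_name, suffix=".pth.tar"):
    """Build (latest, best) checkpoint paths; hypothetical helper."""
    # Assumed naming scheme: "<model_name>_ckpt<suffix>" and "best_ckpt<suffix>".
    latest = os.path.join(save_dir, model_name + "_ckpt" + suffix)
    best = os.path.join(save_dir, "best_ckpt" + suffix)
    return latest, best
```

Calling `checkpoint_paths("out", "latest", suffix=".pth")` would yield names ending in `.pth` instead of `.pth.tar`.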

NaifahNurya commented 3 years ago

Hi @ImSuMyatNoe & @Double-zh, could you give details on how you managed to train on a custom dataset in COCO format with an annotation.json file? I read the instructions but did not manage to train. My dataset is in COCO format, with no frame ID or video ID, as in yours.

Could you share the steps you followed and which files/lines you modified?

When I train, I get the following error.

  File "C:\Users\admin\Desktop\Tracking\Byte-DoubleZH\exps/example/custom\yolox_market_ZZH.py", line 46, in get_data_loader
    dataset = VOCDetection(
              └ <class 'yolox.data.datasets.voc.VOCDetection'>

  File "C:\Users\admin\Desktop\Tracking\Byte-DoubleZH\yolox\data\datasets\voc.py", line 126, in __init__
    for line in open(

FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\admin\\Desktop\\Tracking\\Byte-DoubleZH\\datasets\\VOCdevkit\\VOC2012\\ImageSets\\Main\\market.txt'

ImSuMyatNoe commented 3 years ago

@NaifahNurya We don't need to change anything. I can run the code successfully without the frame ID and video ID. The only things we need to change are the best checkpoint file (best_ckpt.pth) and the input video path.

NaifahNurya commented 3 years ago

@ImSuMyatNoe Thank you for the reply, but my question is how to arrange and train on a custom dataset. Which files/lines do we need to modify in order to train on a custom dataset (my own dataset) and get the best checkpoint file?

NaifahNurya commented 3 years ago

I followed the instructions in this repo but got an error.

  File "C:\Users\admin\Desktop\Tracking\Byte-DoubleZH\exps/example/custom\yolox_market_ZZH.py", line 46, in get_data_loader
    dataset = VOCDetection(
              └ <class 'yolox.data.datasets.voc.VOCDetection'>

  File "C:\Users\admin\Desktop\Tracking\Byte-DoubleZH\yolox\data\datasets\voc.py", line 126, in __init__
    for line in open(

FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\admin\\Desktop\\Tracking\\Byte-DoubleZH\\datasets\\VOCdevkit\\VOC2012\\ImageSets\\Main\\market.txt'

I have the JSON file and the images in COCO format. How did you arrange the data, and which files/lines did you modify to point to the path of your custom dataset?

Double-zh commented 3 years ago

Hello, you need to convert the COCO format to VOC format and then put the data into the corresponding folders as described in the README. After that you can run the training and testing code.
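
A minimal sketch of such a conversion, assuming a standard COCO annotation file with `images`, `annotations`, and `categories` keys and `bbox` given as `[x, y, width, height]`. The function name `coco_to_voc` is hypothetical, and the XML contains only the commonly used VOC fields; whether the repo's loader needs more than these is not confirmed by the thread.

```python
import os
import xml.etree.ElementTree as ET
from collections import defaultdict


def coco_to_voc(coco):
    """Map image stem -> VOC-style XML string for a loaded COCO dict."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    anns = defaultdict(list)
    for ann in coco["annotations"]:
        anns[ann["image_id"]].append(ann)

    out = {}
    for img in coco["images"]:
        root = ET.Element("annotation")
        ET.SubElement(root, "filename").text = img["file_name"]
        size = ET.SubElement(root, "size")
        ET.SubElement(size, "width").text = str(img["width"])
        ET.SubElement(size, "height").text = str(img["height"])
        ET.SubElement(size, "depth").text = "3"  # assumption: RGB images
        for ann in anns[img["id"]]:
            x, y, w, h = ann["bbox"]  # COCO: top-left corner plus size
            obj = ET.SubElement(root, "object")
            ET.SubElement(obj, "name").text = names[ann["category_id"]]
            ET.SubElement(obj, "difficult").text = "0"
            box = ET.SubElement(obj, "bndbox")
            # VOC stores corner coordinates; no +1 pixel offset is applied here.
            for tag, value in (("xmin", x), ("ymin", y),
                               ("xmax", x + w), ("ymax", y + h)):
                ET.SubElement(box, tag).text = str(int(round(value)))
        stem = os.path.splitext(os.path.basename(img["file_name"]))[0]
        out[stem] = ET.tostring(root, encoding="unicode")
    return out
```

Each XML string would be written to one file per image, and the image stems, one per line, to the image-set file that the FileNotFoundError in this thread reports as missing (`ImageSets\Main\market.txt` in that setup). The exact folder layout should be taken from the README.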

NaifahNurya commented 3 years ago

@ImSuMyatNoe, thank you for the reply. Can you share the script you used to convert COCO format to VOC format?

NaifahNurya commented 3 years ago

@ImSuMyatNoe and @Double-zh, I managed to solve the dataset issue, but I hit another error when running the first epoch of training:

Traceback (most recent call last):
  File "C:\Users\admin\Desktop\Tracking\ByteZH\train.py", line 10, in <module>
    from yolox.core import Trainer, launch
  File "C:\Users\admin\Desktop\Tracking\ByteZH\yolox\__init__.py", line 4, in <module>
    from .utils import configure_module
  File "C:\Users\admin\Desktop\Tracking\ByteZH\yolox\utils\__init__.py", line 15, in <module>
    from .setup_env import *
  File "C:\Users\admin\Desktop\Tracking\ByteZH\yolox\utils\setup_env.py", line 5, in <module>
    sys.path.remove('/opt/ros/kinetic/lib/python2.7/dist-packages')
ValueError: list.remove(x): x not in list

I have tried several solutions but have not managed to solve it.

Has anyone encountered this kind of problem?

Any suggested solution?

Double-zh commented 3 years ago

Hello, delete this line to fix the error (line 5 of setup_env.py: sys.path.remove('/opt/ros/kinetic/lib/python2.7/dist-packages')).
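
If deleting the line is undesirable, for example to keep the file working on machines that do have that ROS path, a guarded removal is a small alternative. This is a sketch, not the repo's code; the helper name `remove_from_sys_path` is hypothetical.

```python
import sys

ROS_PATH = "/opt/ros/kinetic/lib/python2.7/dist-packages"


def remove_from_sys_path(entry):
    """Remove entry from sys.path if present; return True if it was removed."""
    if entry in sys.path:
        sys.path.remove(entry)
        return True
    return False


# Unlike a bare sys.path.remove(...), this does not raise ValueError
# when the ROS path is absent (as on the Windows machine in this thread).
remove_from_sys_path(ROS_PATH)
```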

NaifahNurya commented 3 years ago

@Double-zh Thank you for the reply. I had tried this before: training starts the first epoch and then fails with a cuDNN error, as shown below.

return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
2021-12-03 19:37:20 | INFO | yolox.core.trainer:130 - Model Summary: Params: 8.94M, Gflops: 26.64
2021-12-03 19:37:20 | INFO | yolox.core.trainer:289 - loading checkpoint for fine tuning
2021-12-03 19:37:20 | WARNING | yolox.utils.checkpoint:25 - Shape of head.cls_preds.0.weight in checkpoint is torch.Size([80, 128, 1, 1]), while shape of head.cls_preds.0.weight in model is torch.Size([2, 128, 1, 1]).
2021-12-03 19:37:20 | WARNING | yolox.utils.checkpoint:25 - Shape of head.cls_preds.0.bias in checkpoint is torch.Size([80]), while shape of head.cls_preds.0.bias in model is torch.Size([2]).
2021-12-03 19:37:20 | WARNING | yolox.utils.checkpoint:25 - Shape of head.cls_preds.1.weight in checkpoint is torch.Size([80, 128, 1, 1]), while shape of head.cls_preds.1.weight in model is torch.Size([2, 128, 1, 1]).
2021-12-03 19:37:20 | WARNING | yolox.utils.checkpoint:25 - Shape of head.cls_preds.1.bias in checkpoint is torch.Size([80]), while shape of head.cls_preds.1.bias in model is torch.Size([2]).
2021-12-03 19:37:20 | WARNING | yolox.utils.checkpoint:25 - Shape of head.cls_preds.2.weight in checkpoint is torch.Size([80, 128, 1, 1]), while shape of head.cls_preds.2.weight in model is torch.Size([2, 128, 1, 1]).
2021-12-03 19:37:20 | WARNING | yolox.utils.checkpoint:25 - Shape of head.cls_preds.2.bias in checkpoint is torch.Size([80]), while shape of head.cls_preds.2.bias in model is torch.Size([2]).
2021-12-03 19:37:22 | INFO | yolox.core.trainer:148 - init prefetcher, this might take one minute or less...
2021-12-03 19:37:28 | INFO | yolox.core.trainer:176 - Training start...
2021-12-03 19:37:28 | INFO | yolox.core.trainer:187 - ---> start train epoch1
2021-12-03 19:37:29 | INFO | yolox.core.trainer:180 - Training of experiment is done and the best AP is 0.00
2021-12-03 19:37:29 | ERROR | yolox.core.launch:90 - An error has been caught in function 'launch', process 'MainProcess' (12976), thread 'MainThread' (9988):

RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR
You can try to repro this exception using the following code snippet. If that doesn't trigger the error, please include your original repro script when reporting this issue.

import torch
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.benchmark = True
torch.backends.cudnn.deterministic = False
torch.backends.cudnn.allow_tf32 = True
data = torch.randn([1, 12, 320, 320], dtype=torch.half, device='cuda', requires_grad=True)
net = torch.nn.Conv2d(12, 32, kernel_size=[3, 3], padding=[1, 1], stride=[1, 1], dilation=[1, 1], groups=1)
net = net.cuda().half()
out = net(data)
out.backward(torch.randn_like(out))
torch.cuda.synchronize()

ConvolutionParams
    data_type = CUDNN_DATA_HALF
    padding = [1, 1, 0]
    stride = [1, 1, 0]
    dilation = [1, 1, 0]
    groups = 1
    deterministic = false
    allow_tf32 = true
input: TensorDescriptor 0000012EE6167630
    type = CUDNN_DATA_HALF
    nbDims = 4
    dimA = 1, 12, 320, 320,
    strideA = 1228800, 102400, 320, 1,
output: TensorDescriptor 0000012EE61676A0
    type = CUDNN_DATA_HALF
    nbDims = 4
    dimA = 1, 32, 320, 320,
    strideA = 3276800, 102400, 320, 1,
weight: FilterDescriptor 0000012EE0679EE0
    type = CUDNN_DATA_HALF
    tensor_format = CUDNN_TENSOR_NCHW
    nbDims = 4
    dimA = 32, 12, 3, 3,
Pointer addresses:
    input: 0000000B10718000
    output: 0000000B10970000
    weight: 0000000B113FDC00

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\admin\anaconda3\envs\ByteTZH\lib\site-packages\loguru\_handler.py", line 177, in emit
    self._sink.write(str_record)
  File "C:\Users\admin\anaconda3\envs\ByteTZH\lib\site-packages\loguru\_file_sink.py", line 176, in write
    self._file.write(message)
UnicodeEncodeError: 'cp949' codec can't encode character '\u2552' in position 475: illegal multibyte sequence
--- End of logging error ---

The following is the collected environment information.

python -m torch.utils.collect_env
Collecting environment information...
PyTorch version: 1.8.0
Is debug build: False
CUDA used to build PyTorch: 11.1
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 10 Proffessional
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect

Python version: 3.6 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: 10.2.89
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3080
Nvidia driver version: 471.41
cuDNN version: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin\cudnn64_8.dll
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.5
[pip3] torch==1.8.0
[pip3] torchaudio==0.8.0
[pip3] torchvision==0.9.0
[conda] blas 2.112 mkl conda-forge
[conda] blas-devel 3.9.0 12_win64_mkl conda-forge
[conda] cudatoolkit 11.1.1 heb2d755_9 conda-forge
[conda] libblas 3.9.0 12_win64_mkl conda-forge
[conda] libcblas 3.9.0 12_win64_mkl conda-forge
[conda] liblapack 3.9.0 12_win64_mkl conda-forge
[conda] liblapacke 3.9.0 12_win64_mkl conda-forge
[conda] mkl 2021.4.0 h0e2418a_729 conda-forge
[conda] mkl-devel 2021.4.0 h57928b3_730 conda-forge
[conda] mkl-include 2021.4.0 h0e2418a_729 conda-forge
[conda] numpy 1.19.5 py36h4b40d73_2 conda-forge
[conda] pytorch 1.8.0 py3.6_cuda11.1_cudnn8_0 pytorch
[conda] torchaudio 0.8.0 py36 pytorch
[conda] torchvision 0.9.0 py36_cu111 pytorch

Any information or suggestions would be helpful.

Thank you