stevenwudi / Kaggle_PKU_Baidu

Apache License 2.0

Single-GPU training produces an error #64

Open enhenghengheng opened 4 years ago

enhenghengheng commented 4 years ago

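Hello, sorry to trouble you again. I have run into two more problems.

  1. I have completed the Configurations and Dataset setup (using the kaggle_apollo_combined_6691_origin.json you provided). Since I only have a single GPU, I ran the single-GPU training code (`python train_kaggle_pku.py`), but it produced an error: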

    
    2020-03-14 23:07:57,375 - INFO - Distributed training: False
    16%|█▋        | 13/79 [00:00<00:00, 127.31it/s]Loading Car model files...
    100%|██████████| 79/79 [00:00<00:00, 122.13it/s]
    Totaly corrupted count is: 1240, clean count: 74029
    Total images: 6691, car num sum: 74029, minmin: 1, max: 43, mean: 11
    x min: -851, max: 4116, mean: 1535
    y min: 1482, max: 3427, mean: 1821
    x min: -79, max: 79, mean: -3, std: 13.622
    y min: 1, max: 42, mean: 9, std: 4.747
    z min: 3, max: 150, mean: 50, std: 29.950
    Car model: max: 76, min: 2, total: 74029
    Unique car models:
    [ 2  6  7  8  9 12 14 16 18 19 20 23 25 27 28 31 32 35 37 40 43 46 47 48
    50 51 54 56 60 61 66 70 71 76]
    Number of unique car models: 34
    0%|          | 0/79 [00:00<?, ?it/s]validation_images
    Loading Car model files...
    100%|██████████| 79/79 [00:00<00:00, 119.03it/s]
    2020-03-14 23:08:03,864 - INFO - Start running, host: shi@shi-Lenovo-Legion-Y7000P-1060, work_dir: /home/shi/Kaggle/checkpoints/Mar14-23-07
    2020-03-14 23:08:03,864 - INFO - workflow: [('train', 1)], max: 200 epochs
    /home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
    _np_qint8 = np.dtype([("qint8", np.int8, 1)])
    /home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
    _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
    /home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
    _np_qint16 = np.dtype([("qint16", np.int16, 1)])
    /home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
    _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
    /home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
    _np_qint32 = np.dtype([("qint32", np.int32, 1)])
    /home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
    np_resource = np.dtype([("resource", np.ubyte, 1)])
    Traceback (most recent call last):
    File "<ipython-input-1-3c5e8e6cf921>", line 1, in <module>
    runfile('/home/shi/Kaggle/tools/train_kaggle_pku.py', wdir='/home/shi/Kaggle/tools')
    File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 827, in runfile
    execfile(filename, namespace)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 110, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
    File "/home/shi/Kaggle/tools/train_kaggle_pku.py", line 100, in <module>
    main()
    File "/home/shi/Kaggle/tools/train_kaggle_pku.py", line 96, in main
    logger=logger)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+d3ca926-py3.7-linux-x86_64.egg/mmdet/apis/train.py", line 79, in train_detector
    _non_dist_train(model, dataset, cfg, validate=validate)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+d3ca926-py3.7-linux-x86_64.egg/mmdet/apis/train.py", line 257, in _non_dist_train
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/runner.py", line 351, in run
    self.call_hook('before_run')
    
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/runner.py", line 238, in call_hook
    getattr(hook, fn_name)(self)
    
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/dist_utils.py", line 74, in wrapper
    return func(*args, **kwargs)
    
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/hooks/logger/tensorboard.py", line 28, in before_run
    'Please run "pip install future tensorboard" to install '
    ImportError: Please run "pip install future tensorboard" to install the dependencies to use torch.utils.tensorboard (applicable to PyTorch 1.1 or higher)



I followed the prompt and ran `pip install future tensorboard`, but the same error still appears, so I am out of options and have to ask you for help.
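A quick way to narrow this down is to check whether the interpreter that runs the training script can actually import the packages that `pip` installed. The sketch below is only a diagnostic aid; `check_imports` is a hypothetical helper name, and it assumes nothing about mmcv beyond the module names that appear in the error message.

```python
import importlib
import sys


def check_imports(module_names):
    """Try to import each module; return {name: (ok, detail)}.

    detail is the module's file location on success, or the error on failure.
    """
    results = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
            results[name] = (True, getattr(module, "__file__", "<built-in>"))
        except Exception as exc:  # ImportError, or any error raised during import
            results[name] = (False, f"{type(exc).__name__}: {exc}")
    return results


if __name__ == "__main__":
    # The interpreter path shows which Python the packages must be installed into.
    print("interpreter:", sys.executable)
    modules = ["future", "tensorboard", "torch.utils.tensorboard"]
    for name, (ok, detail) in check_imports(modules).items():
        print(f"{name:28s} {'OK' if ok else 'FAILED'}  {detail}")
```

Running this from the same environment as `train_kaggle_pku.py` (for example, the same Spyder console) shows whether the import fails there even though `pip install` reported success, which would point to `pip` and the training script using different Python installations.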

  2. I have another question. The single-GPU training code `python train_kaggle_pku.py` does not support validation. If I want to use the multi-GPU training code `CUDA_VISIBLE_DEVICES=0,1,2,3,4,5 python -m torch.distributed.launch --nproc_per_node=6 train_kaggle_pku.py --launcher pytorch` on a machine with only a single GPU, and run validation as well, how should I change this command?

I hope you can find time to answer. Thank you.

stevenwudi commented 4 years ago

  1. For the tensorboard installation error, you will probably need to check your mmdet or PyTorch installation yourself, or search Stack Overflow. We have not encountered this error.

  2. Validation is only implemented for distributed training. If you only have a single GPU, you can do something like: `CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 train_kaggle_pku.py --launcher pytorch`
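The two commands in this thread differ only in the GPU list and the process count, so the relationship can be sketched as a small helper that derives both from one list of GPU ids. `build_launch_command` is a hypothetical name; the flags are exactly those quoted above, and nothing else about the training script is assumed.

```python
import os
import sys


def build_launch_command(gpu_ids, script="train_kaggle_pku.py"):
    """Return (env, argv) for a torch.distributed.launch run, one process per listed GPU."""
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    # Restrict the visible devices to the requested GPUs.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=",".join(str(g) for g in gpu_ids))
    argv = [
        sys.executable, "-m", "torch.distributed.launch",
        f"--nproc_per_node={len(gpu_ids)}",  # one worker process per GPU
        script, "--launcher", "pytorch",
    ]
    return env, argv


# Usage (not executed here): a single-GPU distributed run.
#   import subprocess
#   env, argv = build_launch_command([0])
#   subprocess.run(argv, env=env, check=True)
```

With `gpu_ids=[0]` this reproduces the single-GPU command suggested above; with `gpu_ids=[0, 1, 2, 3, 4, 5]` it reproduces the six-GPU one.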

enhenghengheng commented 4 years ago

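OK, thank you very much for the answer. This error may be related to the mmdet version I installed. I used the setup.py you provided and ran `python setup.py install` directly, after which the version shows as mmdet 1.0rc0+unknown. Do you have any suggestions for correctly installing mmdet 1.0rc0+d3ca926?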

cyh1112 commented 4 years ago

@enhenghengheng Could you provide your environment information, including Linux, Python, PyTorch, mmcv, CUDA, cuDNN, gcc, etc.?
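For reference, most of the requested information can be gathered in one go. The following is a minimal sketch, not a tool from this repository: `collect_env` is a hypothetical helper, and any package that cannot be imported is reported as such rather than raising.

```python
import importlib
import platform
import subprocess


def collect_env():
    """Collect OS, Python, package, CUDA/cuDNN and gcc versions into a dict."""
    info = {"platform": platform.platform(), "python": platform.python_version()}
    for name in ("torch", "torchvision", "mmcv", "mmdet"):
        try:
            module = importlib.import_module(name)
            info[name] = getattr(module, "__version__", "unknown")
        except Exception as exc:
            info[name] = f"not importable ({type(exc).__name__})"
    try:
        import torch
        info["cuda"] = torch.version.cuda  # CUDA version PyTorch was built with
        info["cudnn"] = torch.backends.cudnn.version()
    except Exception:
        info["cuda"] = info["cudnn"] = "unavailable"
    try:
        out = subprocess.run(["gcc", "--version"], capture_output=True,
                             text=True, check=True).stdout
        info["gcc"] = out.splitlines()[0] if out else "unknown"
    except (OSError, subprocess.CalledProcessError):
        info["gcc"] = "not found"
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```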

enhenghengheng commented 4 years ago

Hi, my environment is as follows: Linux 16.04, Python 3.7, Torch 1.3.1, Torchvision 0.4.2, mmcv 0.4.0, CUDA 10.0, cuDNN 7.5.0, GCC 7.3.0. I am not sure whether `python train_kaggle_pku.py` can run correctly in this environment.

cyh1112 commented 4 years ago

@enhenghengheng Switch mmcv to 0.2.14, then reinstall future and tensorboard and try again.

enhenghengheng commented 4 years ago

OK, I will give it a try. Thank you.

enhenghengheng commented 4 years ago

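@cyh1112 Hello, I have installed mmcv==0.2.14 and reinstalled future and tensorboard, and I used the weight file released yesterday, Jan29-00-02_epoch_261_serialized_ssd-4094ffb2.pth. However, the mmdet version problem persists: `pip list` still shows mmdet-1.0rc0+unknown, and this time a different error occurred: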

    2020-03-18 15:19:40,343 - INFO - Distributed training: False
    14%|█▍        | 11/79 [00:00<00:00, 74.52it/s]Loading Car model files...
    100%|██████████| 79/79 [00:00<00:00, 108.36it/s]
    Totaly corrupted count is: 1240, clean count: 74029
    Total images: 6691, car num sum: 74029, minmin: 1, max: 43, mean: 11
    x min: -851, max: 4116, mean: 1535
    y min: 1482, max: 3427, mean: 1821
    x min: -79, max: 79, mean: -3, std: 13.622
    y min: 1, max: 42, mean: 9, std: 4.747
    z min: 3, max: 150, mean: 50, std: 29.950
    Car model: max: 76, min: 2, total: 74029
    Unique car models:
    [ 2  6  7  8  9 12 14 16 18 19 20 23 25 27 28 31 32 35 37 40 43 46 47 48
    50 51 54 56 60 61 66 70 71 76]
    Number of unique car models: 34
    0%|          | 0/79 [00:00<?, ?it/s]validation_images
    Loading Car model files...
    100%|██████████| 79/79 [00:00<00:00, 110.98it/s]
    2020-03-18 15:19:46,944 - INFO - load checkpoint from /home/shi/data/Kaggle_pku/checkpoints/Jan29-00-02_epoch_261_serialized_ssd-4094ffb2.pth
    2020-03-18 15:19:47,796 - WARNING - The model and loaded state dict do not match exactly

    these keys have mismatched shape:
    +------------------------------------+----------------------+-------------------------+
    | key                                | expected shape       | loaded shape            |
    +------------------------------------+----------------------+-------------------------+
    | translation_head.trans_pred.weight | torch.Size([3, 200]) | torch.Size([1629, 200]) |
    | translation_head.trans_pred.bias   | torch.Size([3])      | torch.Size([1629])      |
    +------------------------------------+----------------------+-------------------------+
    2020-03-18 15:19:47,797 - INFO - resumed epoch 261, iter 324280
    2020-03-18 15:19:47,799 - INFO - Start running, host: shi@shi-Lenovo-Legion-Y7000P-1060, work_dir: /home/shi/data/Kaggle_pku/checkpoints/Mar18-15-19
    2020-03-18 15:19:47,799 - INFO - workflow: [('train', 1)], max: 300 epochs
    /home/shi/.local/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
    _np_qint8 = np.dtype([("qint8", np.int8, 1)])
    /home/shi/.local/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
    _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
    /home/shi/.local/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
    _np_qint16 = np.dtype([("qint16", np.int16, 1)])
    /home/shi/.local/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:529: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
    _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
    /home/shi/.local/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:530: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
    _np_qint32 = np.dtype([("qint32", np.int32, 1)])
    /home/shi/.local/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:535: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
    np_resource = np.dtype([("resource", np.ubyte, 1)])
    Traceback (most recent call last):
    File "", line 1, in <module>
    runfile('/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py', wdir='/home/shi/data/Kaggle_pku/tools')
    File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 827, in runfile
    execfile(filename, namespace)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 110, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
    File "/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py", line 100, in <module>
    main()
    File "/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py", line 96, in main
    logger=logger)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/apis/train.py", line 79, in train_detector
    _non_dist_train(model, dataset, cfg, validate=validate)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/apis/train.py", line 257, in _non_dist_train
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/runner.py", line 358, in run
    epoch_runner(data_loaders[i], **kwargs)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/runner.py", line 260, in train
    for i, data_batch in enumerate(data_loader):
    File "/home/shi/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
    return self._process_data(data)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
    data.reraise()
    File "/home/shi/anaconda3/lib/python3.7/site-packages/torch/_utils.py", line 385, in reraise
    raise self.exc_type(msg)
    FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
    Original Traceback (most recent call last):
    File "/home/shi/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
    File "/home/shi/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/custom.py", line 130, in __getitem__
    data = self.prepare_train_img(idx)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/custom.py", line 145, in prepare_train_img
    return self.pipeline(results)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/pipelines/compose.py", line 24, in __call__
    data = t(data)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/pipelines/loading.py", line 23, in __call__
    img = mmcv.imread(filename)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/image/io.py", line 41, in imread
    'img file does not exist: {}'.format(img_or_path))
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/utils/path.py", line 32, in check_file_exist
    raise FileNotFoundError(msg_tmpl.format(filename))
    FileNotFoundError: img file does not exist: /data/Kaggle/pku-autonomous-driving/train_images/ID_8815c1b0b.jpg

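I checked the /data/Kaggle/pku-autonomous-driving/train_images/ folder, and it does contain ID_8815c1b0b.jpg, so the image is not actually missing as the error claims. I hope you can help with this. Congratulations to your team on publishing the paper on arXiv; I will read it carefully.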

cyh1112 commented 4 years ago

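@enhenghengheng Judging from this error message, the environment should be fine now. The new error is simply that an image cannot be found, so please check your paths carefully.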

enhenghengheng commented 4 years ago

Is the path you mean the image loading path set in configs/htc/htc_hrnetv2p_w48_20e_kaggle_pku_no_semantic_translation_wudi.py?

enhenghengheng commented 4 years ago

Also, the image reported as missing is different each time I run train_kaggle_pku.py. I just ran it again, and this time the missing image is: FileNotFoundError: img file does not exist: /data/Kaggle/ApolloScape_3D_car/train/images/180118_070826217_Camera_5.jpg

enhenghengheng commented 4 years ago

I suspect the error is related to the kaggle_apollo_combined_6691_origin.json file used in this part of the config:

    data_root = '/home/shi/data/Kaggle/pku-autonomous-driving/'
    # data_root = '/data/Kaggle/ApolloScape_3D_car/train/'
    data = dict(
        imgs_per_gpu=1,
        workers_per_gpu=2,
        train=dict(
            type=dataset_type,
            data_root=data_root,
            # ann_file='/data/cyh/kaggle/kaggle_apollo_combine_6692.json',
            # ann_file=data_root + 'apollo_kaggle_combined_6725_wudi.json',
            ann_file='/home/shi/data/Kaggle/kaggle_apollo_combined_6691_origin.json',  # 6691 means the final cleaned data
            img_prefix=data_root + 'train_images/',
            pipeline=train_pipeline,
            rotation_augmenation=True),

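This is the only place in the config file that involves the training file paths. I also wanted to try running without kaggle_apollo_combined_6691_origin.json, but I cannot find a train.json file in the data Kaggle provides. If you have that file, could you share it so that I can try again?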

cyh1112 commented 4 years ago

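@enhenghengheng
apollo_kaggle_combined_6725_wudi.json was generated by us. The json stores absolute image paths: /data/Kaggle/pku-autonomous-driving/..../X.jpg, whereas your data is actually stored at /home/shi/data/Kaggle/pku-autonomous-driving/..../X.jpg, so the images certainly cannot be found. You can fix this in one of three ways: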
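As an illustration of the path mismatch described above (and not necessarily one of the maintainers' three methods), the absolute prefix stored in an annotation json can be rewritten without knowing its exact schema by walking the whole structure. This is a sketch under that assumption; `rewrite_prefix` and `rewrite_annotation_file` are hypothetical helper names.

```python
import json


def rewrite_prefix(node, old_prefix, new_prefix):
    """Recursively replace old_prefix at the start of every string in a JSON-like structure."""
    if isinstance(node, str):
        if node.startswith(old_prefix):
            return new_prefix + node[len(old_prefix):]
        return node
    if isinstance(node, list):
        return [rewrite_prefix(item, old_prefix, new_prefix) for item in node]
    if isinstance(node, dict):
        return {key: rewrite_prefix(value, old_prefix, new_prefix)
                for key, value in node.items()}
    return node  # numbers, booleans and None are left untouched


def rewrite_annotation_file(src_path, dst_path, old_prefix, new_prefix):
    """Load a json file, rewrite path prefixes, and save the result to dst_path."""
    with open(src_path, "r") as f:
        annotations = json.load(f)
    with open(dst_path, "w") as f:
        json.dump(rewrite_prefix(annotations, old_prefix, new_prefix), f)
```

For the paths in this thread, `old_prefix` would be `'/data/Kaggle/'` and `new_prefix` would be `'/home/shi/data/Kaggle/'`. Writing to a new file rather than overwriting the original keeps the provided annotation file intact.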

enhenghengheng commented 4 years ago

OK, thank you. I have finally found where the problem is, and I will try to fix it.

enhenghengheng commented 4 years ago

@cyh1112 Hello, thank you very much for patiently helping me solve this. Of the three methods you provided, I tried the second and the third. With the third method, after changing the paths, I got `RuntimeError: CUDA out of memory`; after changing the image size, training ran normally.

For the second method, as you described, in '../configs/htc/htc_hrnetv2p_w48_20e_kaggle_pku_no_semantic_translation_wudi.py' I replaced data_root = '/data/Kaggle/pku-autonomous-driving/' with my own path, data_root = '/home/shi/data/Kaggle/pku-autonomous-driving/', and removed the json file name from ann_file, changing ann_file='/home/shi/data/Kaggle/kaggle_apollo_combined_6691_origin.json' to ann_file='/home/shi/data/Kaggle/':

    data = dict(
        imgs_per_gpu=1,
        workers_per_gpu=2,
        train=dict(
            type=dataset_type,
            data_root=data_root,
            ann_file='/home/shi/data/Kaggle/',
            img_prefix=data_root + 'train_images/',
            pipeline=train_pipeline,
            rotation_augmenation=True)

Then I re-ran 'train_kaggle_pku.py', but my own json file was not generated, and I got this error: `ParserError: Error tokenizing data. C error: Calling read(nbytes) on source failed. Try engine='python'.`

In 'mmdet/dataset/kaggle_pku.py' I have also changed the path to my own, as follows:

    @DATASETS.register_module
    class KagglePKUDataset(CustomDataset):
        CLASSES = ('car',)

        def load_annotations(self, ann_file, outdir='/home/shi/data/Kaggle/pku-autonomous-driving/'):

But my own json file is still not generated. I do not know what I got wrong, so I need your help once more. Many thanks.

cyh1112 commented 4 years ago

'ParserError: Error tokenizing data. C error: Calling read(nbytes) on source failed. Try engine=python'

Please post the complete error message. @enhenghengheng

enhenghengheng commented 4 years ago

OK, this is the error from running train_kaggle_pku.py:

    2020-03-19 17:33:36,177 - INFO - Distributed training: False
    14%|█▍        | 11/79 [00:00<00:00, 74.35it/s]Loading Car model files...
    100%|██████████| 79/79 [00:00<00:00, 108.54it/s]
    Traceback (most recent call last):
    File "", line 1, in <module>
    runfile('/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py', wdir='/home/shi/data/Kaggle_pku/tools')
    File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 827, in runfile
    execfile(filename, namespace)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 110, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
    File "/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py", line 100, in <module>
    main()
    File "/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py", line 78, in main
    datasets = [build_dataset(cfg.data.train)]
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/builder.py", line 39, in build_dataset
    dataset = build_from_cfg(cfg, DATASETS, default_args)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/utils/registry.py", line 76, in build_from_cfg
    return obj_cls(**args)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/custom.py", line 66, in __init__
    self.img_infos = self.load_annotations(self.ann_file)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/kaggle_pku.py", line 79, in load_annotations
    train = pd.read_csv(ann_file)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 685, in parser_f
    return _read(filepath_or_buffer, kwds)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 457, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 895, in __init__
    self._make_engine(self.engine)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 1135, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 1917, in __init__
    self._reader = parsers.TextReader(src, **kwds)
    File "pandas/_libs/parsers.pyx", line 542, in pandas._libs.parsers.TextReader.__cinit__
    File "pandas/_libs/parsers.pyx", line 735, in pandas._libs.parsers.TextReader._get_header
    File "pandas/_libs/parsers.pyx", line 937, in pandas._libs.parsers.TextReader._tokenize_rows
    File "pandas/_libs/parsers.pyx", line 2132, in pandas._libs.parsers.raise_parser_error
    ParserError: Error tokenizing data. C error: Calling read(nbytes) on source failed. Try engine='python'.

cyh1112 commented 4 years ago

@enhenghengheng

    val=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file='/data/Kaggle/pku-autonomous-driving/validation.csv',
        img_prefix='/data/Kaggle/pku-autonomous-driving/validation_images/',
        pipeline=test_pipeline),

Please modify your config.data.train following the example above, where ann_file is the corresponding csv and img_prefix is the corresponding image path. For details, see htc_hrnetv2p_w48_20e_kaggle_pku_no_semantic_translation_wudi.config
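The traceback shows `ann_file` being passed straight to `pd.read_csv`, and in the earlier attempt `ann_file` pointed at a directory rather than a csv file, which is one plausible cause of the `ParserError`. A small pre-flight check along these lines can catch that before training starts; `check_ann_file` is a hypothetical helper and not part of mmdet.

```python
import os


def check_ann_file(ann_file):
    """Return a description of the problem with ann_file, or None if it looks like a usable csv path."""
    if os.path.isdir(ann_file):
        return f"{ann_file!r} is a directory, not a CSV file"
    if not os.path.isfile(ann_file):
        return f"{ann_file!r} does not exist"
    if not ann_file.lower().endswith(".csv"):
        return f"{ann_file!r} does not have a .csv extension"
    return None


# Usage (paths taken from this thread):
#   check_ann_file('/home/shi/data/Kaggle/')  # reports a directory if that path exists
#   check_ann_file('/home/shi/data/Kaggle/pku-autonomous-driving/train.csv')
```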

enhenghengheng commented 4 years ago

OK, I will change it right away.

enhenghengheng commented 4 years ago

@cyh1112 Hello, I changed ann_file to the corresponding csv: ann_file='/home/shi/data/Kaggle/pku-autonomous-driving/train.csv', and following htc_hrnetv2p_w48_20e_kaggle_pku_no_semantic_translation_wudi.config I also changed img_prefix to: img_prefix=data_root + 'train_images/'. But another error occurred:

    2020-03-19 17:50:15,088 - INFO - Distributed training: False
    14%|█▍        | 11/79 [00:00<00:00, 74.80it/s]Loading Car model files...
    100%|██████████| 79/79 [00:00<00:00, 108.18it/s]
    Traceback (most recent call last):
    File "", line 1, in <module>
    runfile('/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py', wdir='/home/shi/data/Kaggle_pku/tools')
    File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 827, in runfile
    execfile(filename, namespace)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 110, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
    File "/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py", line 100, in <module>
    main()
    File "/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py", line 78, in main
    datasets = [build_dataset(cfg.data.train)]
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/builder.py", line 39, in build_dataset
    dataset = build_from_cfg(cfg, DATASETS, default_args)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/utils/registry.py", line 76, in build_from_cfg
    return obj_cls(**args)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/custom.py", line 66, in __init__
    self.img_infos = self.load_annotations(self.ann_file)
    File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/kaggle_pku.py", line 75, in load_annotations
    annotations = json.load(open(outfile, 'r'))
    File "/home/shi/anaconda3/lib/python3.7/json/__init__.py", line 296, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
    File "/home/shi/anaconda3/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
    File "/home/shi/anaconda3/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    File "/home/shi/anaconda3/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
    JSONDecodeError: Expecting value
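The traceback shows `json.load` failing on a file named `outfile` inside `load_annotations`. One possible explanation, stated here only as an assumption, is that an empty or truncated json file was left behind by an earlier failed run and is now being read back. The sketch below checks whether a given file parses as JSON; `is_valid_json_file` is a hypothetical helper, and which path `outfile` refers to has to be read from `kaggle_pku.py`.

```python
import json


def is_valid_json_file(path):
    """Return True if path can be opened and parsed as JSON, False otherwise."""
    try:
        with open(path, "r") as f:
            json.load(f)
        return True
    except (OSError, ValueError):  # json.JSONDecodeError is a subclass of ValueError
        return False
```

If the check returns False for the cached file, moving it aside and re-running the script would show whether a fresh file is then generated from the csv.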