enhenghengheng commented 4 years ago

Hello, sorry to trouble you again. I have run into two more problems.

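1. I have completed the Configurations and Dataset setup steps (using the kaggle_apollo_combined_6691_origin.json you provided). Since I only have a single GPU, I ran the single-GPU training code (`python train_kaggle_pku.py`), but it failed with the following error: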


2020-03-14 23:07:57,375 - INFO - Distributed training: False
16%|█▋        | 13/79 [00:00<00:00, 127.31it/s]Loading Car model files...
100%|██████████| 79/79 [00:00<00:00, 122.13it/s]
Totaly corrupted count is: 1240, clean count: 74029
Total images: 6691, car num sum: 74029, minmin: 1, max: 43, mean: 11
x min: -851, max: 4116, mean: 1535
y min: 1482, max: 3427, mean: 1821
x min: -79, max: 79, mean: -3, std: 13.622
y min: 1, max: 42, mean: 9, std: 4.747
z min: 3, max: 150, mean: 50, std: 29.950
Car model: max: 76, min: 2, total: 74029
Unique car models:
[ 2  6  7  8  9 12 14 16 18 19 20 23 25 27 28 31 32 35 37 40 43 46 47 48
50 51 54 56 60 61 66 70 71 76]
Number of unique car models: 34
0%|          | 0/79 [00:00<?, ?it/s]validation_images
Loading Car model files...
100%|██████████| 79/79 [00:00<00:00, 119.03it/s]
2020-03-14 23:08:03,864 - INFO - Start running, host: shi@shi-Lenovo-Legion-Y7000P-1060, work_dir: /home/shi/Kaggle/checkpoints/Mar14-23-07
2020-03-14 23:08:03,864 - INFO - workflow: [('train', 1)], max: 200 epochs
/home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
Traceback (most recent call last):
File "<ipython-input-1-3c5e8e6cf921>", line 1, in <module>
runfile('/home/shi/Kaggle/tools/train_kaggle_pku.py', wdir='/home/shi/Kaggle/tools')
File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 827, in runfile
execfile(filename, namespace)
File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "/home/shi/Kaggle/tools/train_kaggle_pku.py", line 100, in <module>
main()
File "/home/shi/Kaggle/tools/train_kaggle_pku.py", line 96, in main
logger=logger)
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+d3ca926-py3.7-linux-x86_64.egg/mmdet/apis/train.py", line 79, in train_detector
_non_dist_train(model, dataset, cfg, validate=validate)
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+d3ca926-py3.7-linux-x86_64.egg/mmdet/apis/train.py", line 257, in _non_dist_train
runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/runner.py", line 351, in run
self.call_hook('before_run')

File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/runner.py", line 238, in call_hook
getattr(hook, fn_name)(self)

File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/dist_utils.py", line 74, in wrapper
return func(*args, **kwargs)

File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/hooks/logger/tensorboard.py", line 28, in before_run
'Please run "pip install future tensorboard" to install '

ImportError: Please run "pip install future tensorboard" to install the dependencies to use torch.utils.tensorboard (applicable to PyTorch 1.1 or higher)



I followed the hint and ran `pip install future tensorboard`, but the same error still appears. I am out of ideas, so I have to ask for your help.

2. My second question: the single-GPU training command `python train_kaggle_pku.py` does not support validation. If I want to use the multi-GPU training command
`CUDA_VISIBLE_DEVICES=0,1,2,3,4,5 python -m torch.distributed.launch --nproc_per_node=6 train_kaggle_pku.py --launcher pytorch`
on a machine with only one GPU and still run validation, how should I change that command?

I would appreciate an answer whenever you have time. Thank you.

stevenwudi commented 4 years ago

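1. For the tensorboard installation error, you will probably need to check your mmdet or PyTorch installation yourself, or search Stack Overflow. We have not encountered this error.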

2. Validation is only implemented for distributed training. If you only have a single GPU, you can do something like: `CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 train_kaggle_pku.py --launcher pytorch`

enhenghengheng commented 4 years ago

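OK, thank you very much for the answer. This error may be related to the mmdet version I installed. I used the setup.py you provided and ran `python setup.py install` directly, and the installed version shows up as mmdet 1.0rc0+unknown. Do you have any suggestions for installing the correct mmdet 1.0rc0+d3ca926?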

cyh1112 commented 4 years ago

@enhenghengheng Could you share your environment details, including Linux, Python, PyTorch, mmcv, CUDA, cuDNN, gcc, etc.?

enhenghengheng commented 4 years ago

Hi, here is my environment: Linux 16.04, Python 3.7, torch 1.3.1, torchvision 0.4.2, mmcv 0.4.0, CUDA 10.0, cuDNN 7.5.0, gcc 7.3.0. I am not sure whether this environment can run `python train_kaggle_pku.py` correctly.

cyh1112 commented 4 years ago

@enhenghengheng Switch mmcv to 0.2.14, then reinstall future and tensorboard and try again.
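Before and after reinstalling, it can help to see which interpreter is running and what the import failure really is. The hook in the traceback above appears to replace the original exception with a generic "pip install" hint. The following is a minimal sketch and not part of the repository. `probe_import` is a hypothetical helper name, and the module list is only an assumption about what is worth checking.

```python
import importlib
import sys


def probe_import(module_name):
    """Try to import a module and report what actually happened.

    Returns (ok, detail): detail is the module version on success,
    or the original exception text on failure.
    """
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # keep the real cause instead of a generic hint
        return False, "{}: {}".format(type(exc).__name__, exc)
    return True, str(getattr(module, "__version__", "unknown"))


if __name__ == "__main__":
    # pip must have installed into this same interpreter for the fix to apply.
    print("interpreter:", sys.executable)
    for name in ("torch", "mmcv", "mmdet", "future", "tensorboard",
                 "torch.utils.tensorboard"):
        ok, detail = probe_import(name)
        print("{:<26} {:<5} {}".format(name, "ok" if ok else "FAIL", detail))
```

Running it from the same environment that launches training (for example the Spyder console used above) shows whether `pip install` targeted a different Python than the one that raises the error.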

enhenghengheng commented 4 years ago

OK, I will give it a try. Thank you.

enhenghengheng commented 4 years ago

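@cyh1112 Hello, I have installed mmcv==0.2.14 and reinstalled future and tensorboard, and I am using the weights file released yesterday, Jan29-00-02_epoch_261_serialized_ssd-4094ffb2.pth. However, the mmdet version issue remains: `pip list` still shows mmdet-1.0rc0+unknown, and this time a different error occurred: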

`2020-03-18 15:19:40,343 - INFO - Distributed training: False 14%|█▍ | 11/79 [00:00<00:00, 74.52it/s]Loading Car model files... 100%|██████████| 79/79 [00:00<00:00, 108.36it/s] Totaly corrupted count is: 1240, clean count: 74029 Total images: 6691, car num sum: 74029, minmin: 1, max: 43, mean: 11 x min: -851, max: 4116, mean: 1535 y min: 1482, max: 3427, mean: 1821 x min: -79, max: 79, mean: -3, std: 13.622 y min: 1, max: 42, mean: 9, std: 4.747 z min: 3, max: 150, mean: 50, std: 29.950 Car model: max: 76, min: 2, total: 74029 Unique car models: [ 2 6 7 8 9 12 14 16 18 19 20 23 25 27 28 31 32 35 37 40 43 46 47 48 50 51 54 56 60 61 66 70 71 76] Number of unique car models: 34 0%| | 0/79 [00:00<?, ?it/s]validation_images Loading Car model files... 100%|██████████| 79/79 [00:00<00:00, 110.98it/s] 2020-03-18 15:19:46,944 - INFO - load checkpoint from /home/shi/data/Kaggle_pku/checkpoints/Jan29-00-02_epoch_261_serialized_ssd-4094ffb2.pth 2020-03-18 15:19:47,796 - WARNING - The model and loaded state dict do not match exactly

these keys have mismatched shape: +------------------------------------+----------------------+-------------------------+ | key | expected shape | loaded shape | +------------------------------------+----------------------+-------------------------+ | translation_head.trans_pred.weight | torch.Size([3, 200]) | torch.Size([1629, 200]) | | translation_head.trans_pred.bias | torch.Size([3]) | torch.Size([1629]) | +------------------------------------+----------------------+-------------------------+ 2020-03-18 15:19:47,797 - INFO - resumed epoch 261, iter 324280 2020-03-18 15:19:47,799 - INFO - Start running, host: shi@shi-Lenovo-Legion-Y7000P-1060, work_dir: /home/shi/data/Kaggle_pku/checkpoints/Mar18-15-19 2020-03-18 15:19:47,799 - INFO - workflow: [('train', 1)], max: 300 epochs /home/shi/.local/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_qint8 = np.dtype([("qint8", np.int8, 1)]) /home/shi/.local/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_quint8 = np.dtype([("quint8", np.uint8, 1)]) /home/shi/.local/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_qint16 = np.dtype([("qint16", np.int16, 1)]) /home/shi/.local/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:529: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. 
_np_quint16 = np.dtype([("quint16", np.uint16, 1)]) /home/shi/.local/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:530: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_qint32 = np.dtype([("qint32", np.int32, 1)]) /home/shi/.local/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:535: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. np_resource = np.dtype([("resource", np.ubyte, 1)]) Traceback (most recent call last):

File "", line 1, in runfile('/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py', wdir='/home/shi/data/Kaggle_pku/tools')

File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 827, in runfile execfile(filename, namespace)

File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 110, in execfile exec(compile(f.read(), filename, 'exec'), namespace)

File "/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py", line 100, in main()

File "/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py", line 96, in main logger=logger)

File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/apis/train.py", line 79, in train_detector _non_dist_train(model, dataset, cfg, validate=validate)

File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/apis/train.py", line 257, in _non_dist_train runner.run(data_loaders, cfg.workflow, cfg.total_epochs)

File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/runner.py", line 358, in run epoch_runner(data_loaders[i], **kwargs)

File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/runner.py", line 260, in train for i, data_batch in enumerate(data_loader):

File "/home/shi/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 819, in next return self._process_data(data)

File "/home/shi/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data data.reraise()

File "/home/shi/anaconda3/lib/python3.7/site-packages/torch/_utils.py", line 385, in reraise raise self.exc_type(msg)

FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/shi/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/home/shi/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/shi/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/custom.py", line 130, in __getitem__
data = self.prepare_train_img(idx)
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/custom.py", line 145, in prepare_train_img
return self.pipeline(results)
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/pipelines/compose.py", line 24, in __call__
data = t(data)
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/pipelines/loading.py", line 23, in __call__
img = mmcv.imread(filename)
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/image/io.py", line 41, in imread
'img file does not exist: {}'.format(img_or_path))
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/utils/path.py", line 32, in check_file_exist
raise FileNotFoundError(msg_tmpl.format(filename))
FileNotFoundError: img file does not exist: /data/Kaggle/pku-autonomous-driving/train_images/ID_8815c1b0b.jpg`

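I checked the /data/Kaggle/pku-autonomous-driving/train_images/ folder, and ID_8815c1b0b.jpg is there, so it is not actually missing as the error claims. I hope you can help with this. Also, congratulations to your team on publishing the paper on arXiv; I will read it carefully.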

cyh1112 commented 4 years ago

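@enhenghengheng Judging from this error message, the environment should be fine now. The new error simply means an image cannot be found, so please double-check your paths.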
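One way to check the paths is to test exactly the strings the loader reports, rather than browsing the folder by hand. This is a minimal sketch, not code from the repository. `report_missing` is a hypothetical helper, and the sample path is copied from the error message above.

```python
import os
from collections import Counter


def report_missing(paths):
    """Split paths into existing and missing files, and count misses per directory."""
    found = [p for p in paths if os.path.isfile(p)]
    missing = [p for p in paths if not os.path.isfile(p)]
    per_dir = Counter(os.path.dirname(p) for p in missing)
    return found, missing, per_dir


if __name__ == "__main__":
    # Path copied verbatim from the FileNotFoundError above.
    sample = [
        "/data/Kaggle/pku-autonomous-driving/train_images/ID_8815c1b0b.jpg",
    ]
    found, missing, per_dir = report_missing(sample)
    print("found:", len(found), "missing:", len(missing))
    for directory, count in per_dir.items():
        print("missing under", directory, "->", count)
```

If every miss falls under the same directory prefix, the prefix itself is probably wrong on this machine rather than individual files being absent.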

enhenghengheng commented 4 years ago

Do you mean the image paths set in configs/htc/htc_hrnetv2p_w48_20e_kaggle_pku_no_semantic_translation_wudi.py?

enhenghengheng commented 4 years ago

Also, each time I run train_kaggle_pku.py, a different image is reported as missing. I just ran it again, and this time the missing image is: FileNotFoundError: img file does not exist: /data/Kaggle/ApolloScape_3D_car/train/images/180118_070826217_Camera_5.jpg

enhenghengheng commented 4 years ago

I suspect the error is related to the kaggle_apollo_combined_6691_origin.json file referenced in this part of the config:

`data_root = '/home/shi/data/Kaggle/pku-autonomous-driving/'
# data_root = '/data/Kaggle/ApolloScape_3D_car/train/'
data = dict(
    imgs_per_gpu=1,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        # ann_file='/data/cyh/kaggle/kaggle_apollo_combine_6692.json',
        # ann_file=data_root + 'apollo_kaggle_combined_6725_wudi.json',
        ann_file='/home/shi/data/Kaggle/kaggle_apollo_combined_6691_origin.json',  # 6691 means the final cleaned data
        img_prefix=data_root + 'train_images/',
        pipeline=train_pipeline,
        rotation_augmenation=True),`

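This is the only place in the config that refers to the training file paths. I would also like to try without kaggle_apollo_combined_6691_origin.json, but I cannot find a train.json file in the data provided by Kaggle. If you have that file, could you share it so I can try again?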

cyh1112 commented 4 years ago

@enhenghengheng
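apollo_kaggle_combined_6725_wudi.json was generated by us. The json stores the absolute path of every image: /data/Kaggle/pku-autonomous-driving/..../X.jpg, whereas your data actually lives at /home/shi/data/Kaggle/pku-autonomous-driving/..../X.jpg, so the images cannot be found. You can fix this in one of three ways:

1. Go through the json file and change every image path to your correct path.
2. Delete the json file, change data_root in the config file to your correct path, and run again. This regenerates the json file. Generate the Kaggle and Apollo json files separately this way, then merge the two into one.
3. Move your data to the paths stored in the json file. Kaggle: /data/Kaggle/pku-autonomous-driving/ Apollo: /data/Kaggle/ApolloScape_3D_car/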
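The first option can be sketched without knowing the exact layout of the json file, by rewriting the prefix of every string value that starts with the old root. This is only an illustration under that assumption. `rewrite_prefix` and `rewrite_json_file` are hypothetical helpers, not functions from the repository, and the real annotation schema has not been checked.

```python
import json


def rewrite_prefix(obj, old_prefix, new_prefix):
    """Return a copy of a JSON-like object with old_prefix swapped for
    new_prefix in every string that starts with old_prefix."""
    if isinstance(obj, str):
        if obj.startswith(old_prefix):
            return new_prefix + obj[len(old_prefix):]
        return obj
    if isinstance(obj, list):
        return [rewrite_prefix(item, old_prefix, new_prefix) for item in obj]
    if isinstance(obj, dict):
        return {key: rewrite_prefix(value, old_prefix, new_prefix)
                for key, value in obj.items()}
    return obj  # numbers, booleans and None are left untouched


def rewrite_json_file(src, dst, old_prefix, new_prefix):
    """Load src, rewrite path prefixes, and write the result to dst."""
    with open(src) as f:
        data = json.load(f)
    with open(dst, "w") as f:
        json.dump(rewrite_prefix(data, old_prefix, new_prefix), f)
```

For the paths in this thread, a call might look like `rewrite_json_file('kaggle_apollo_combined_6691_origin.json', 'kaggle_apollo_combined_local.json', '/data/Kaggle/', '/home/shi/data/Kaggle/')`, where the output file name is made up and the new root assumes both datasets sit under /home/shi/data/Kaggle/. Writing to a new file keeps the original json intact in case the rewrite needs to be repeated.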

enhenghengheng commented 4 years ago

OK, thank you. The root cause is finally clear. I will try to fix it.

enhenghengheng commented 4 years ago

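@cyh1112 Hello, thank you very much for your patient help. Of the three methods you suggested, I tried the second and the third. With the third method, after changing the paths I got `RuntimeError: CUDA out of memory`; after changing the image size, training runs normally.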

For the second method, I followed your instructions. In ../configs/htc/htc_hrnetv2p_w48_20e_kaggle_pku_no_semantic_translation_wudi.py I replaced data_root = '/data/Kaggle/pku-autonomous-driving/' with my own path, data_root = '/home/shi/data/Kaggle/pku-autonomous-driving/', and removed the json file name from ann_file:

`data = dict(
    imgs_per_gpu=1,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        # before: ann_file='/home/shi/data/Kaggle/kaggle_apollo_combined_6691_origin.json',
        # after removing the json file:
        ann_file='/home/shi/data/Kaggle/',
        img_prefix=data_root + 'train_images/',
        pipeline=train_pipeline,
        rotation_augmenation=True)`

Then I re-ran train_kaggle_pku.py, but my own json file was not generated, and I got this error: `ParserError: Error tokenizing data. C error: Calling read(nbytes) on source failed. Try engine='python'.`

I have also changed the path in mmdet/datasets/kaggle_pku.py to my own, as follows:

`@DATASETS.register_module
class KagglePKUDataset(CustomDataset):
    CLASSES = ('car',)

    def load_annotations(self, ann_file, outdir='/home/shi/data/Kaggle/pku-autonomous-driving/'):`

But my own json file is still not generated. I do not know what I did wrong, so I would appreciate your help once more. Many thanks.

cyh1112 commented 4 years ago

'ParserError: Error tokenizing data. C error: Calling read(nbytes) on source failed. Try engine='python'.'

Please post the complete error message. @enhenghengheng

enhenghengheng commented 4 years ago

OK, this is the error from running train_kaggle_pku.py: `2020-03-19 17:33:36,177 - INFO - Distributed training: False 14%|█▍ | 11/79 [00:00<00:00, 74.35it/s]Loading Car model files... 100%|██████████| 79/79 [00:00<00:00, 108.54it/s] Traceback (most recent call last):

File "", line 1, in runfile('/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py', wdir='/home/shi/data/Kaggle_pku/tools')

File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 827, in runfile execfile(filename, namespace)

File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 110, in execfile exec(compile(f.read(), filename, 'exec'), namespace)

File "/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py", line 100, in main()

File "/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py", line 78, in main datasets = [build_dataset(cfg.data.train)]

File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/builder.py", line 39, in build_dataset dataset = build_from_cfg(cfg, DATASETS, default_args)

File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/utils/registry.py", line 76, in build_from_cfg return obj_cls(**args)

File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/custom.py", line 66, in init self.img_infos = self.load_annotations(self.ann_file)

File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/kaggle_pku.py", line 79, in load_annotations train = pd.read_csv(ann_file)

File "/home/shi/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 685, in parser_f return _read(filepath_or_buffer, kwds)

File "/home/shi/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 457, in _read parser = TextFileReader(fp_or_buf, **kwds)

File "/home/shi/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 895, in init self._make_engine(self.engine)

File "/home/shi/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 1135, in _make_engine self._engine = CParserWrapper(self.f, **self.options)

File "/home/shi/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 1917, in init self._reader = parsers.TextReader(src, **kwds)

File "pandas/_libs/parsers.pyx", line 542, in pandas._libs.parsers.TextReader.cinit

File "pandas/_libs/parsers.pyx", line 735, in pandas._libs.parsers.TextReader._get_header

File "pandas/_libs/parsers.pyx", line 937, in pandas._libs.parsers.TextReader._tokenize_rows

File "pandas/_libs/parsers.pyx", line 2132, in pandas._libs.parsers.raise_parser_error

ParserError: Error tokenizing data. C error: Calling read(nbytes) on source failed. Try engine='python'.`

cyh1112 commented 4 years ago

@enhenghengheng

val=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file='/data/Kaggle/pku-autonomous-driving/validation.csv',
        img_prefix='/data/Kaggle/pku-autonomous-driving/validation_images/',
        pipeline=test_pipeline),

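Please modify your config.data.train following the example above: ann_file should be the corresponding csv file, and img_prefix the corresponding image directory. For details, see htc_hrnetv2p_w48_20e_kaggle_pku_no_semantic_translation_wudi.config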
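The traceback shows `load_annotations` passing ann_file straight to `pd.read_csv`, and the config posted earlier set ann_file to '/home/shi/data/Kaggle/', which is a directory. That is a plausible cause of the ParserError, though it has not been confirmed here. A small pre-flight check along these lines would catch it before training starts. `check_ann_file` is a hypothetical helper and is not part of mmdet.

```python
import os


def check_ann_file(path):
    """Return a short diagnosis of whether path looks usable as a csv annotation file."""
    if os.path.isdir(path):
        return "directory"  # a folder cannot be parsed as csv
    if not os.path.isfile(path):
        return "missing"
    if not path.lower().endswith(".csv"):
        return "not-csv"
    return "ok"


if __name__ == "__main__":
    # Values taken from this thread: the directory that was used, and a csv candidate.
    for candidate in ("/home/shi/data/Kaggle/",
                      "/home/shi/data/Kaggle/pku-autonomous-driving/train.csv"):
        print(candidate, "->", check_ann_file(candidate))
```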

enhenghengheng commented 4 years ago

OK, I will change it right away.

enhenghengheng commented 4 years ago

@cyh1112 Hello, I changed ann_file to the corresponding csv: ann_file='/home/shi/data/Kaggle/pku-autonomous-driving/train.csv', and, following htc_hrnetv2p_w48_20e_kaggle_pku_no_semantic_translation_wudi.config, set img_prefix to: img_prefix=data_root + 'train_images/'. But I got another error: ’2020-03-19 17:50:15,088 - INFO - Distributed training: False 14%|█▍ | 11/79 [00:00<00:00, 74.80it/s]Loading Car model files... 100%|██████████| 79/79 [00:00<00:00, 108.18it/s] Traceback (most recent call last):