shedy-pub / hlagcn-jittor

Jittor implementation of the paper "Hierarchical Layout-Aware Graph Convolutional Network for Unified Aesthetics Assessment"

No such file or directory: './preprocess/preload_AVA.txt' #1

Open axmav opened 2 years ago

axmav commented 2 years ago

Hello! I cannot train the model on the AVA dataset. I get these errors:

error [Errno 2] No such file or directory: '/home/alex/project/datasets/AVA_dataset/images/430454.jpg' /home/alex/project/datasets/AVA_dataset/images/430454.jpg
error [Errno 2] No such file or directory: '/home/alex/project/datasets/AVA_dataset/images/430454.jpg' /home/alex/project/datasets/AVA_dataset/images/430454.jpg
error [Errno 2] No such file or directory: '/home/alex/project/datasets/AVA_dataset/images/430454.jpg' /home/alex/project/datasets/AVA_dataset/images/430454.jpg
error [Errno 2] No such file or directory: '/home/alex/project/datasets/AVA_dataset/images/430454.jpg' /home/alex/project/datasets/AVA_dataset/images/430454.jpg
     AVA dataset info preloaded in ./preprocess/!: #229955 trainval #25553 test
     AVA dataset info preloaded in ./preprocess/!: #229955 trainval #25553 test
     AVA dataset info preloaded in ./preprocess/!: #229955 trainval #25553 test
     AVA dataset info preloaded in ./preprocess/!: #229955 trainval #25553 test
Traceback (most recent call last):
  File "/home/alex/anaconda3/envs/menv/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/alex/anaconda3/envs/menv/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/alex/project/hlagcn-jittor/utils_jittor/train_jittor.py", line 237, in <module>
    main()
  File "/home/alex/project/hlagcn-jittor/utils_jittor/train_jittor.py", line 40, in main
    main_worker(args)
  File "/home/alex/project/hlagcn-jittor/utils_jittor/train_jittor.py", line 62, in main_worker
    img_paths_test, img_rates_test, img_cls_test, img_ratios_test = preload_img(params_ava, args.dataroot)
  File "/home/alex/project/hlagcn-jittor/utils_jittor/dataset.py", line 211, in preload_img
    a = open(os.path.join(buffer_root, filename),"w")
FileNotFoundError: [Errno 2] No such file or directory: './preprocess/preload_AVA.txt'
Downloading https://cg.cs.tsinghua.edu.cn/jittor/assets/build/checkpoints/resnet50.pkl to /home/alex/.cache/jittor/jt1.3.1/g++9.3.0/py3.8.12/Linux-5.11.0-3xee/AMDRyzen95900Xx4a/default/cu11.5.50_sm_86/checkpoints/resnet50.pkl
97.7MB [00:23, 4.44MB/s]                            [w 1028 18:07:41.315364 12 __init__.py:1075] load parameter fc.weight failed ...
[w 1028 18:07:41.315427 12 __init__.py:1075] load parameter fc.bias failed ...
[w 1028 18:07:41.315451 12 __init__.py:1093] load total 267 params, 2 failed
[w 1028 18:07:41.322322 00 __init__.py:1075] load parameter fc.weight failed ...
[w 1028 18:07:41.322371 00 __init__.py:1075] load parameter fc.bias failed ...
[w 1028 18:07:41.322398 00 __init__.py:1093] load total 267 params, 2 failed
[w 1028 18:07:41.324841 68 __init__.py:1075] load parameter fc.weight failed ...
[w 1028 18:07:41.324902 68 __init__.py:1075] load parameter fc.bias failed ...
[w 1028 18:07:41.324946 68 __init__.py:1093] load total 267 params, 2 failed

Also, some images listed in the split do not exist on disk. I chose the AVA2 data split. Thank you!
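For anyone hitting the same two errors, here is a minimal sketch of how they could be guarded against. It is not the repository's code. The traceback shows `open(os.path.join(buffer_root, filename), "w")` failing, which is what happens when the `./preprocess` directory itself is missing, so the first helper creates the directory before opening the file. The second helper separates image paths that exist from those that do not, so missing files can be reported once instead of failing at load time. The helper names `open_buffer_file` and `split_existing` are hypothetical; only `buffer_root` and `filename` come from the traceback.

```python
import os
import tempfile


def open_buffer_file(buffer_root, filename):
    """Open filename inside buffer_root for writing, creating the directory first.

    Hypothetical helper: open(..., "w") creates the file but not its parent
    directory, so the directory has to exist beforehand.
    """
    os.makedirs(buffer_root, exist_ok=True)
    return open(os.path.join(buffer_root, filename), "w")


def split_existing(img_paths):
    """Return (existing, missing) lists of paths, preserving input order.

    Hypothetical helper: lets the caller skip or report images that are
    listed in the split file but absent on disk.
    """
    existing, missing = [], []
    for path in img_paths:
        if os.path.isfile(path):
            existing.append(path)
        else:
            missing.append(path)
    return existing, missing


if __name__ == "__main__":
    # Demo in a throwaway directory so nothing is written to the project tree.
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "preprocess")
        with open_buffer_file(root, "preload_AVA.txt") as handle:
            handle.write("demo\n")
        candidates = [
            os.path.join(root, "preload_AVA.txt"),  # exists
            os.path.join(tmp, "430454.jpg"),        # does not exist
        ]
        existing, missing = split_existing(candidates)
        print("existing:", len(existing), "missing:", len(missing))
```

Whether missing images should be skipped or downloaded again depends on the experiment. Skipping them changes the effective size of the train and test splits, so the counts are worth logging.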

axmav commented 2 years ago

I created the preprocess directory manually, and preload_AVA.txt now seems to be created, but I got another error:

Downloading https://cg.cs.tsinghua.edu.cn/jittor/assets/build/checkpoints/resnet50.pkl to /home/alex/.cache/jittor/jt1.3.1/g++9.3.0/py3.8.12/Linux-5.11.0-3xee/AMDRyzen95900Xx4a/default/cu11.5.50_sm_86/checkpoints/resnet50.pkl
97.7MB [00:17, 6.02MB/s]                            [w 1028 19:44:09.687694 80 __init__.py:1075] load parameter fc.weight failed ...
[w 1028 19:44:09.687741 80 __init__.py:1075] load parameter fc.bias failed ...
[w 1028 19:44:09.687763 80 __init__.py:1093] load total 267 params, 2 failed

=> Start training #Ep 1 /20
Traceback (most recent call last):
  File "/home/alex/anaconda3/envs/menv/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/alex/anaconda3/envs/menv/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/alex/project/hlagcn-jittor/utils_jittor/train_jittor.py", line 237, in <module>
    main()
  File "/home/alex/project/hlagcn-jittor/utils_jittor/train_jittor.py", line 40, in main
    main_worker(args)
  File "/home/alex/project/hlagcn-jittor/utils_jittor/train_jittor.py", line 142, in main_worker
    loss = train(train_loader, model, criterions, optimizer, epoch, args)
  File "/home/alex/project/hlagcn-jittor/utils_jittor/train_jittor.py", line 184, in train
    losses.update(loss.item(), input.size(0))
RuntimeError: Wrong inputs arguments, Please refer to examples(help(jt.item)).

Types of your inputs are:
 self   = Var,
 args   = (),

The function declarations are:
 ItemData item()

Failed reason:[f 1028 19:25:44.090026 56 helper_cuda.h:126] CUDA error at /home/alex/anaconda3/envs/menv/lib/python3.8/site-packages/jittor/src/var_holder.cc:155  code=700( cudaErrorIllegalAddress ) cudaMemcpy(&data.data, var->mem_ptr, dsize, cudaMemcpyDeviceToHost)
19:25:31->Ep:[1][     0/227954] - Net:5.9 - Load:2.1 - loss_avg:0.180
[e 1028 19:25:44.351267 56 helper_cuda.h:115] Peek CUDA error at /home/alex/anaconda3/envs/menv/lib/python3.8/site-packages/jittor/src/mem/allocator/cuda_dual_allocator.h:101  code=700( cudaErrorIllegalAddress ) _cudaLaunchHostFunc(0, &to_free_allocation, 0)
Exception ignored in: <function Dataset.__del__ at 0x7fcf97857b80>
Traceback (most recent call last):
  File "/home/alex/anaconda3/envs/menv/lib/python3.8/site-packages/jittor/dataset/dataset.py", line 409, in __del__
  File "/home/alex/anaconda3/envs/menv/lib/python3.8/site-packages/jittor/dataset/dataset.py", line 211, in terminate
  File "/home/alex/anaconda3/envs/menv/lib/python3.8/multiprocessing/process.py", line 133, in terminate
  File "/home/alex/anaconda3/envs/menv/lib/python3.8/multiprocessing/popen_fork.py", line 61, in terminate
AttributeError: 'NoneType' object has no attribute 'SIGTERM'
terminate called after throwing an instance of 'std::runtime_error'
  what():  [f 1028 19:25:44.661416 56 helper_cuda.h:126] CUDA error at /home/alex/anaconda3/envs/menv/lib/python3.8/site-packages/jittor/extern/cuda/cudnn/src/cudnn_warper.cc:34  code=4( CUDNN_STATUS_INTERNAL_ERROR ) cudnnDestroy(cudnn_handle)
Aborted (core dumped)