zhihou7 / HOI-CL

Series of work (ECCV2020, CVPR2021, CVPR2021, ECCV2022) about Compositional Learning for Human-Object Interaction Exploration
https://sites.google.com/view/hoi-cl
MIT License

File Not Found error #12

Open anjugopinath opened 3 years ago

anjugopinath commented 3 years ago

I was training ATL on my own dataset. I stopped training midway and tried to resume it, but I now get the error below:

Traceback (most recent call last):
  File "affordance/HOI-CL/tools/Train_ATL_HICO.py", line 210, in <module>
    sw.train_model(sess, args.max_iters)
  File "affordance/HOI-CL/tools/../lib/models/train_Solver_HICO_MultiBatch.py", line 144, in train_model
    self.from_snapshot(sess)
  File "affordance/HOI-CL/tools/../lib/models/train_Solver_HICO.py", line 174, in from_snapshot
    saver.restore(sess, self.switch_checkpoint_path(ckpt.model_checkpoint_path))

tensorflow.python.framework.errors_impl.NotFoundError: affordance/HOI-CL/Data/Weights/ATL_union_batch1_semi_l2_def4_vloss2_rew2_aug5_3_x5new_coco_res101;

anjugopinath commented 3 years ago

1st Question)

The Weights folder does not exist inside the Data folder, although the logs folder does.

The Weights folder is located at HOI-CL/Weights/.

Inside train_Solver_HICO.py, when I print "ckpt.model_checkpoint_path" before line 17, the output is "HOI-CL/Weights/ATL_union_batch1_semi_l2_def4_vloss2_rew2_aug5_3_x5new_coco_res101".

But, an error is thrown at line 174: "tensorflow.python.framework.errors_impl.NotFoundError: affordance/HOI-CL/Data/Weights/ATL_union_batch1_semi_l2_def4_vloss2_rew2_aug5_3_x5new_coco_res101"

The second path contains an additional "Data" component.

This is because the path is rewritten inside the following function:

def switch_checkpoint_path(self, model_checkpoint_path):
    head = model_checkpoint_path.split('Weights')[0]
    model_checkpoint_path = model_checkpoint_path.replace(head, cfg.LOCAL_DATA + '/')
    return model_checkpoint_path

Why is this done? The path returned by this function does not exist.
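To make the rewrite concrete, here is a minimal standalone reproduction of the function quoted above. It is only a sketch: the `self` argument is dropped, and `cfg.LOCAL_DATA` is replaced by a plain `local_data` parameter whose value ("affordance/HOI-CL/Data") is an assumption inferred from the error message, not read from the repository config.

```python
def switch_checkpoint_path(model_checkpoint_path, local_data):
    # Everything before the first 'Weights' is treated as the machine-specific prefix.
    head = model_checkpoint_path.split('Weights')[0]
    # That prefix is swapped for local_data + '/'.
    return model_checkpoint_path.replace(head, local_data + '/')


# Path as printed from ckpt.model_checkpoint_path in the question.
recorded = "HOI-CL/Weights/ATL_union_batch1_semi_l2_def4_vloss2_rew2_aug5_3_x5new_coco_res101"
# Assumed value of cfg.LOCAL_DATA (hypothetical, inferred from the error message).
local_data = "affordance/HOI-CL/Data"

rewritten = switch_checkpoint_path(recorded, local_data)
print(rewritten)
# affordance/HOI-CL/Data/Weights/ATL_union_batch1_semi_l2_def4_vloss2_rew2_aug5_3_x5new_coco_res101
```

Under that assumption, the prefix "HOI-CL/" is replaced by "affordance/HOI-CL/Data/", which yields the path reported in the NotFoundError.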

2nd Question)

When I train ATL for the first time, should the file "res101_faster_rcnn_iter_1190000.ckpt" also be present?

zhihou7 commented 3 years ago

Dear anjugopinath,

switch_checkpoint_path is only for my local environment: I trained the model on different machines, and the paths differ between them. You can remove it to suit your environment.
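If removing the function outright is inconvenient, one possible alternative is to rewrite the path only when the recorded one cannot be found. The sketch below is not part of HOI-CL: `resolve_checkpoint_path` and its `local_data_dir` parameter are hypothetical names, and it assumes the checkpoint prefix is accompanied by a `.index` file, as in the checkpoint files listed in this thread.

```python
import os


def resolve_checkpoint_path(recorded_path, local_data_dir):
    """Return recorded_path if it exists here, else relocate it under local_data_dir."""
    # A checkpoint prefix is not a file itself; check for its .index companion.
    if os.path.exists(recorded_path + ".index"):
        return recorded_path
    # Split around the first 'Weights' component, if there is one.
    _, sep, tail = recorded_path.partition("Weights")
    if not sep:
        # No 'Weights' component: nothing sensible to rewrite.
        return recorded_path
    return os.path.join(local_data_dir, sep + tail)
```

With this variant, a checkpoint that was written and resumed on the same machine is restored from the path recorded in the checkpoint state, and the relocation only applies when that path is missing.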

Yes. res101_faster_rcnn_iter_1190000.ckpt can be downloaded as follows:

mkdir Weights/
python lib/ult/Download_data.py 1IbR4kiWgLF8seaKjOMmwaHs0Bfwl5Dq1 Weights/res50_faster_rcnn_iter_1190000.ckpt.data-00000-of-00001
python lib/ult/Download_data.py 1-DbfEloN4c2JaCEMnexaWAsSc4MDlZJx Weights/res50_faster_rcnn_iter_1190000.ckpt.index
python lib/ult/Download_data.py 1vc5d3OwCtMtRgXq3Pj4_twpK4x3kjgT0 Weights/res50_faster_rcnn_iter_1190000.ckpt.meta

# 
python lib/ult/Download_data.py 0B1_fAEgxdnvJR1N3c1FYRGo1S1U Weights/coco_900-1190k.tgz

cd Weights
tar -xvf coco_900-1190k.tgz
mv coco_2014_train+coco_2014_valminusminival/res101* ./
cd ../
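After running the download commands above, a quick check such as the following can confirm that the checkpoint files are in place before training starts. This is a sketch only: `missing_checkpoint_files` is a hypothetical helper, and the suffix list assumes the res101 checkpoint uses the same three-file layout (`.index`, `.data-00000-of-00001`, `.meta`) as the file names in the download commands.

```python
import os

# Assumed suffixes, copied from the file names in the download commands.
CHECKPOINT_SUFFIXES = (".index", ".data-00000-of-00001", ".meta")


def missing_checkpoint_files(prefix):
    """Return the expected checkpoint files for this prefix that are not on disk."""
    return [prefix + s for s in CHECKPOINT_SUFFIXES if not os.path.exists(prefix + s)]


missing = missing_checkpoint_files("Weights/res101_faster_rcnn_iter_1190000.ckpt")
if missing:
    print("Missing checkpoint files:")
    for path in missing:
        print("  " + path)
else:
    print("All checkpoint files found.")
```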