er-muyue / DeFRCN

MIT License
ValueError: could not convert string to float #24

Closed zhengfang1997 closed 2 years ago

zhengfang1997 commented 2 years ago

I tried to run bash run_voc.sh and got this error:

Traceback (most recent call last):
  File "tools/extract_results.py", line 59, in <module>
    main()
  File "tools/extract_results.py", line 34, in main
    results.append([fid] + [float(x) for x in res_info.split(':')[-1].split(',')])
  File "tools/extract_results.py", line 34, in <listcomp>
    results.append([fid] + [float(x) for x in res_info.split(':')[-1].split(',')])
ValueError: could not convert string to float: ' [Checkpointer] Loading from checkpoints/voc/defrcn/defrcn_det_r101_base1/model_reset_remove.pth ...'

How can I solve this problem? Thanks.

zhengfang1997 commented 2 years ago

AssertionError: Checkpoint checkpoints/voc/111/defrcn_det_r101_base1/model_reset_remove.pth not found!

rm: cannot remove 'checkpoints/voc/111/defrcn_fsod_r101_novel1/tfa-like/10shot_seed9/model_final.pth': No such file or directory

salehnia commented 2 years ago

Hi, could you fix this error? I have the same error.

er-muyue commented 2 years ago

Hi, sorry for the late reply. The last line of run_*.sh is just a results extraction script. If the above error occurs, it means that an earlier step failed (for example, the model training was interrupted unexpectedly, the model was not saved properly, or the evaluation failed). Please check whether the earlier logs look normal, for both the base-training stage and the novel-ft stage.
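
To make the failure mode concrete, here is a minimal sketch (not code from the repository) of the parsing step that the traceback points at. It reuses the expression shown on line 34 of tools/extract_results.py. The function name parse_result_line and both sample lines are hypothetical, and the real log format may differ. A line that ends in comma-separated numbers parses cleanly, while a checkpoint-loading message does not, which is what happens when the evaluation never wrote its results.

```python
from typing import List, Optional


def parse_result_line(res_info: str) -> Optional[List[float]]:
    """Parse the comma-separated numbers after the last ':' in a log line.

    Returns None instead of raising when the line is not a results line.
    (Hypothetical helper for illustration; not part of DeFRCN.)
    """
    # Same split logic as the expression shown in the traceback.
    fields = res_info.split(':')[-1].split(',')
    try:
        return [float(x) for x in fields]
    except ValueError:
        # The text after the last ':' is not numeric, so this is not a
        # results line; the evaluation probably did not finish.
        return None


# Hypothetical sample lines; the real log format is an assumption.
good_line = "INFO: 50.1,80.2,55.3"
bad_line = ("INFO: [Checkpointer] Loading from "
            "checkpoints/voc/defrcn/defrcn_det_r101_base1/model_reset_remove.pth ...")

print(parse_result_line(good_line))
print(parse_result_line(bad_line))
```

If a check like this returns None for the last line of a log.txt, that run did not reach the end of evaluation, and the training and evaluation logs for that stage are the place to look.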

salehnia commented 2 years ago

Thanks for your reply! Yes, I checked the files. In log.txt, the results for the base model exist, and training and testing ran correctly on it. But in the few-shot novel stage, evaluation didn't start, and the last line in log.txt is: [01/16 13:43:17] d2.data.common INFO: Serialized dataset takes 0.00 MiB

I get a CUDA out of memory error. My Python version is 3.7, detectron2 is 0.3, and CUDA is 10.2.

Thanks

suryasid09 commented 1 year ago

Hello,
I have encountered the same problem, where CUDA runs out of memory. @er-muyue Could you please help?