lindawangg / COVID-Net

COVID-Net Open Source Initiative

KeyError: "The name 'norm_dense_1_target:0' refers to a Tensor which does not exist. #140

Closed manjitullal closed 3 years ago

manjitullal commented 3 years ago

I want to retrain COVID-Net, so I am using the train_tf.py script with the model.meta and checkpoint from the Drive link below. When I run the script, I get the error below. I suspect the cause is that the model.meta isn't right. Please advise how to fix this.

Error:

KeyError: "The name 'norm_dense_1_target:0' refers to a Tensor which does not exist. The operation, 'norm_dense_1_target', does not exist in the graph."

https://drive.google.com/drive/folders/1eNidqMyz3isLjGYN1evzQu--A-JVkzbk

! python train_tf.py \
    --weightspath /content/ \
    --metaname model.meta \
    --ckptname checkpoint \
    --trainfile train_COVIDx7A.txt \
    --testfile test_COVIDx7A.txt \

haydengunraj commented 3 years ago

Hi Manjit, would you be able to provide the full traceback of the error? It's difficult to reproduce and identify the problem based solely on the error message.

EDIT: I was able to reproduce the error and I will be looking into it. Thanks for bringing this to our attention!

haydengunraj commented 3 years ago

@manjitullal, the model.meta seems fine, but you'll need to change the tensor name arguments passed to train_tf.py. Specifically:

python train_tf.py \
    --weightspath /content/ \
    --metaname model.meta \
    --ckptname model-8485 \
    --trainfile train_COVIDx7A.txt \
    --testfile test_COVIDx7A.txt \
    --label_tensorname "dense_3_target:0" \
    --weights_tensorname "dense_3_sample_weights:0" \
    --logit_tensorname "dense_3/BiasAdd:0"

The default arguments for the tensor names are based on COVIDNet-CXR4-A, so they may not match other models. We'll be looking into updating the docs to make this clearer and to make it easier to use different models.
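To check whether a set of tensor names matches a given model before training, it can help to compare them against the operations actually present in the graph. The following is a minimal sketch, not part of the COVID-Net repository: `missing_tensors` and `list_op_names` are hypothetical helper names, and `list_op_names` assumes a TensorFlow install that provides the `tf.compat.v1` API and that the meta graph imports cleanly in that environment.

```python
def missing_tensors(tensor_names, op_names):
    """Return the requested tensor names whose producing op is not in the graph."""
    ops = set(op_names)
    missing = []
    for name in tensor_names:
        # A tensor name has the form "<op_name>:<output_index>",
        # e.g. "dense_3_target:0" is output 0 of the op "dense_3_target".
        op_name = name.rsplit(":", 1)[0]
        if op_name not in ops:
            missing.append(name)
    return missing


def list_op_names(meta_path):
    """Load a .meta file and return the names of all ops in its graph.

    Assumption: TensorFlow is installed and exposes tf.compat.v1.
    """
    import tensorflow as tf  # imported lazily so missing_tensors works without TF

    graph = tf.Graph()
    with graph.as_default():
        tf.compat.v1.train.import_meta_graph(meta_path)
    return [op.name for op in graph.get_operations()]


if __name__ == "__main__":
    # Op names taken from the tensor names quoted in this thread.
    ops = ["dense_3_target", "dense_3_sample_weights", "dense_3/BiasAdd"]
    print(missing_tensors(["norm_dense_1_target:0", "dense_3_target:0"], ops))
```

With a real model, the op list would come from something like `list_op_names("/content/model.meta")`. Any name reported as missing would trigger the same KeyError when passed to train_tf.py.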

Also note that --ckptname should point to the checkpoint name stem (i.e., model-8485) rather than the checkpoint file.
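A TensorFlow checkpoint is normally stored as several files sharing one stem (for example an `.index` file and one or more `.data-*` files), alongside a small text file named `checkpoint`. As a rough sketch, the stem can be recovered from a directory listing by looking for the `.index` files. The helper name `checkpoint_stems` is hypothetical, and the file list in the example is illustrative rather than the actual contents of the Drive folder.

```python
import os


def checkpoint_stems(filenames):
    """Return sorted checkpoint stems, identified by their '.index' files."""
    return sorted(
        os.path.splitext(name)[0]  # "model-8485.index" -> "model-8485"
        for name in filenames
        if name.endswith(".index")
    )


if __name__ == "__main__":
    # Illustrative listing; in practice use os.listdir("/content/").
    files = [
        "checkpoint",
        "model.meta",
        "model-8485.index",
        "model-8485.data-00000-of-00001",
    ]
    print(checkpoint_stems(files))
```

The value to pass as --ckptname is one of the returned stems, not the `checkpoint` text file.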