tensorflow / tpu

Reference models and tools for Cloud TPUs.
https://cloud.google.com/tpu/
Apache License 2.0

Unimplemented: DNN library is not found #970

Closed PallawiSinghal closed 2 years ago

PallawiSinghal commented 2 years ago

I am trying to train efficientnet-b7 with instance segmentation, and I am getting this error:

    Node: 'efficientnet-b7/model/stem/conv2d/Conv2D'
    2 root error(s) found.
      (0) UNIMPLEMENTED: DNN library is not found.
         [[{{node efficientnet-b7/model/stem/conv2d/Conv2D}}]]
         [[stack/_12437]]
      (1) UNIMPLEMENTED: DNN library is not found.
         [[{{node efficientnet-b7/model/stem/conv2d/Conv2D}}]]
    0 successful operations.
    0 derived errors ignored.

I ran the following command to start the training:

    MODEL_DIR="/tpu/models/official/detection/weights/"
    TRAIN_FILE_PATTERN="/tpu/tools/datasets/data/vexcel/tfrecords/train/train-"
    EVAL_FILE_PATTERN="/tpu/tools/datasets/data/vexcel/tfrecords/validation/val-"
    VAL_JSON_FILE="/tpu/tools/datasets/data/vexcel/instances_val2017.json"
    RESNET_CHECKPOINT="/tpu/models/cascade_maskrcnn_effb7_1280/model.ckpt-180000"
    nohup python3 /tpu/models/official/detection/main.py \
      --model_dir="${MODEL_DIR?}" \
      --mode=train \
      --eval_after_training=True \
      --use_tpu=False \
      --config_file="/tpu/models/official/detection/projects/copy_paste/configs/cascade_maskrcnn_effb7_1280.yaml" \
      --include_mask \
      --params_override="{
        train: {
          checkpoint: { path: ${RESNET_CHECKPOINT?} },
          train_file_pattern: ${TRAIN_FILE_PATTERN?}
        },
        eval: {
          val_json_file: ${VAL_JSON_FILE?},
          eval_file_pattern: ${EVAL_FILE_PATTERN?}
        }
      }" > train1.log &

I need help; I have been trying to get this running for the past 5 days.

PallawiSinghal commented 2 years ago

Installing TensorFlow 2.7.0 solved the problem.
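For anyone landing here later, a quick way to confirm which TensorFlow version is actually installed in the environment before launching training. This is only a minimal sketch: the helper names `installed_version` and `matches_expected` are made up for illustration, and it assumes the package was installed under the distribution name `tensorflow`. The thread does not say why 2.7.0 helped, so the script only checks the version string and nothing else.

```python
from importlib import metadata

# Version reported as working in this thread; adjust if your setup differs.
EXPECTED_TF_VERSION = "2.7.0"


def installed_version(package="tensorflow"):
    """Return the installed version string of `package`, or None if it is absent.

    Assumption: the distribution is named "tensorflow". If it was installed
    under another name (for example "tensorflow-gpu"), pass that name instead.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_expected(installed, expected=EXPECTED_TF_VERSION):
    """True only when a version is installed and equals `expected` exactly."""
    return installed is not None and installed == expected


if __name__ == "__main__":
    found = installed_version()
    if matches_expected(found):
        print(f"tensorflow {found} is installed")
    else:
        print(
            f"tensorflow version found: {found!r}; "
            f"this thread reports {EXPECTED_TF_VERSION} as working"
        )
```

Run it with the same python3 interpreter that launches main.py, since a different interpreter may see a different set of installed packages.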