tensorflow / tpu

Reference models and tools for Cloud TPUs.
https://cloud.google.com/tpu/
Apache License 2.0

Why does "train_and_eval" need annotations for the validation set in Mask R-CNN? #477

Open panfeng-hover opened 5 years ago

panfeng-hover commented 5 years ago

The tutorial at https://cloud.google.com/tpu/docs/tutorials/mask-rcnn gives the following command:

cd /usr/share/ && python tpu/models/official/mask_rcnn/mask_rcnn_main.py \
--use_tpu=True \
--tpu=${TPU_NAME:?} \
--model_dir=${GCS_MODEL_DIR:?} \
--num_cores=8 \
--mode="train_and_eval" \
--config_file="/usr/share/tpu/models/official/mask_rcnn/configs/cloud/${ACCELERATOR_TYPE}.yaml" \
--params_override="checkpoint=${CHECKPOINT},training_file_pattern=${PATH_GCS_MASKRCNN:?}/train-*,validation_file_pattern=${PATH_GCS_MASKRCNN:?}/val-*,val_json_file=${PATH_GCS_MASKRCNN:?}/instances_val2017.json"

The command requires val_json_file=${PATH_GCS_MASKRCNN:?}/instances_val2017.json as input. However, the annotations already exist in the TFRecord files, so they should not have to be supplied again as a separate input.

By the way, a lot of COCO-specific code is tangled into the whole Mask R-CNN pipeline, for example coco_utils.py and coco_metric.py. This code should be removed; otherwise the model looks as if it was built only to make COCO training work, not to support general datasets.

artyompal commented 5 years ago

My understanding is that the validation code does not parse the TFRecord files, so it needs this duplicated information in a more readable format.
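For a non-COCO dataset, one possible workaround is to write the validation annotations out as a COCO-style JSON file and pass that as val_json_file. The following is only a minimal sketch of the standard COCO ground-truth layout (images, annotations, categories). The function names `build_coco_groundtruth` and `write_coco_groundtruth` and the input tuple layouts are hypothetical, the rectangular polygon is a stand-in for a real instance mask, and I have not verified that the evaluation code in this repo accepts such a file unchanged. It also assumes the image ids match whatever ids are stored in the validation TFRecords.

```python
import json


def build_coco_groundtruth(images, boxes, category_names):
    """Build a COCO-style ground-truth dict (hypothetical helper).

    images: list of (image_id, width, height, file_name)
    boxes: list of (image_id, category_id, x, y, w, h), in pixels
    category_names: dict mapping category_id -> name
    """
    coco = {
        "images": [
            {"id": image_id, "width": width, "height": height,
             "file_name": file_name}
            for image_id, width, height, file_name in images
        ],
        "annotations": [],
        "categories": [
            {"id": cat_id, "name": name}
            for cat_id, name in sorted(category_names.items())
        ],
    }
    for ann_id, (image_id, cat_id, x, y, w, h) in enumerate(boxes, start=1):
        coco["annotations"].append({
            "id": ann_id,
            "image_id": image_id,
            "category_id": cat_id,
            "bbox": [x, y, w, h],  # COCO boxes are [x, y, width, height]
            "area": w * h,
            "iscrowd": 0,
            # Assumption: a rectangle polygon stands in for a real mask.
            "segmentation": [[x, y, x + w, y, x + w, y + h, x, y + h]],
        })
    return coco


def write_coco_groundtruth(coco, path):
    """Serialize the ground-truth dict to a JSON file at `path`."""
    with open(path, "w") as f:
        json.dump(coco, f)
```

Calling `build_coco_groundtruth([(1, 640, 480, "a.jpg")], [(1, 3, 10, 20, 30, 40)], {3: "widget"})` and writing the result with `write_coco_groundtruth` would produce a file with one image, one annotation, and one category.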