isl-org / ZoeDepth

Metric depth estimation from a single image
MIT License

What is the actual difference between ZoeD_N, ZoeD_K and ZoeD_NK? #77

Closed XinyueZ closed 1 year ago

XinyueZ commented 1 year ago

Hey guys,

Correct me if I'm wrong, but I cannot find an explanation of this in the paper.


42nick commented 1 year ago

Have a look at section 4.2 of the paper. It states the following:

> The models are named according to the following convention: ZoeD-{RDPT}-{MFT}, where ZoeD is the abbreviation for ZoeDepth, RDPT denotes the datasets used for relative depth pre-training (X denotes no pre-training) and MFT denotes the datasets used for metric depth fine-tuning. We train and evaluate the following models: ZoeD-X-N, ZoeD-X-K, ZoeD-M12-N, ZoeD-M12-K and ZoeD-M12-NK. All models use the BEiT384-L backbone from timm [44] that was pre-trained on ImageNet. The models ZoeD-X-N and ZoeD-X-K are directly fine-tuned for metric depth on NYU Depth v2 and KITTI respectively without any pre-training for relative depth estimation. ZoeD-M12-N and ZoeD-M12-K additionally include pre-training for relative depth estimation on the M12 dataset mix before the fine-tuning stage for metric depth. ZoeD-M12-NK is also pre-trained on M12, but has two separate heads fine-tuned on both NYU Depth v2 and KITTI. ZoeD-M12-NK† is a variant of this model with a single head but otherwise the same pre-training and fine-tuning procedure. In the supplement, we provide further results for models trained on additional dataset combinations in pre-training and fine-tuning.
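The naming convention can be made concrete with a small sketch. This is just an illustration of the ZoeD-{RDPT}-{MFT} scheme quoted above (the function name and return structure are my own, not from the repo):

```python
def parse_zoedepth_name(name: str) -> dict:
    """Decode a paper-style model name of the form ZoeD-{RDPT}-{MFT}.

    RDPT: relative-depth pre-training datasets ("X" means no pre-training).
    MFT:  metric fine-tuning datasets ("N" = NYU Depth v2, "K" = KITTI).
    """
    prefix, rdpt, mft = name.split("-")
    assert prefix == "ZoeD", f"not a ZoeDepth model name: {name}"
    finetune_sets = {
        "N": ["NYU Depth v2"],
        "K": ["KITTI"],
        "NK": ["NYU Depth v2", "KITTI"],
    }
    return {
        "pretraining": None if rdpt == "X" else rdpt,  # e.g. "M12"
        "finetuning": finetune_sets[mft],
    }
```

For example, `parse_zoedepth_name("ZoeD-M12-NK")` reports M12 pre-training and fine-tuning on both NYU Depth v2 and KITTI.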

In a nutshell:

- ZoeD_N (ZoeD-M12-N): pre-trained for relative depth on the M12 dataset mix, fine-tuned for metric depth on NYU Depth v2 (indoor).
- ZoeD_K (ZoeD-M12-K): same pre-training, fine-tuned on KITTI (outdoor).
- ZoeD_NK (ZoeD-M12-NK): same pre-training, with two separate heads fine-tuned on both NYU Depth v2 and KITTI.

As all three released models share the same M12 pre-training and differ only in the metric fine-tuning data, I would say that ZoeD-M12-NK is what "NK" refers to in the code.
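If I read the repo correctly, the short names are the checkpoint identifiers exposed in the code, which map onto the paper's names roughly like this (the dict is my own summary, not an artifact of the repo):

```python
# Assumed correspondence between the short names used in the code
# and the full names used in section 4.2 of the paper.
CODE_TO_PAPER = {
    "ZoeD_N": "ZoeD-M12-N",    # metric fine-tuning on NYU Depth v2
    "ZoeD_K": "ZoeD-M12-K",    # metric fine-tuning on KITTI
    "ZoeD_NK": "ZoeD-M12-NK",  # two heads, fine-tuned on both
}
```

The short names are also the torch.hub entrypoints, e.g. `torch.hub.load("isl-org/ZoeDepth", "ZoeD_NK", pretrained=True)`.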