The annotated frames in the labeled part of NYUDv2 are drawn from videos; the full dataset has the rest of the video frames without annotations. Here is the data as it was prepared in the FCN paper to train the NYUD segmentation model: http://dl.caffe.berkeleyvision.org/nyud.tar.gz
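In case it is useful, here is a minimal sketch for downloading and unpacking that archive with Python (the extraction path and the archive's internal directory layout are not verified here, so adjust as needed):

```python
import tarfile
import urllib.request

# URL of the prepared NYUD data from the FCN paper (see the link above).
url = "http://dl.caffe.berkeleyvision.org/nyud.tar.gz"
archive = "nyud.tar.gz"

urllib.request.urlretrieve(url, archive)   # download the prepared dataset
with tarfile.open(archive, "r:gz") as tar:
    tar.extractall(".")                    # unpack into the current directory
```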
Thanks a lot for the link to the tar file. I had the toolbox, but I am new to this and didn't know how to extract the images from it. The archive helped.
The paper "Clockwork Convnets for Video Semantic Segmentation" says: "The NYUDv2 dataset [6] collects short RGB-D clips and includes a segmentation benchmark with high-quality but temporally sparse pixel annotations (every tenth video frame is labeled). We run on video from the 'raw' clips subsampled 10X and evaluate on every labeled frame." But I am confused about how to get the ground truth for every 10th frame; maybe I have to annotate the frames myself, because I have not found a download link at http://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.
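For reference, once the labeled subset (nyu_depth_v2_labeled.mat) is downloaded from the NYU page, a rough sketch of reading its annotations might look like this; the key names below ("labels", "images") are assumptions based on that file's layout and should be checked against the actual file:

```python
import h5py
import numpy as np

# nyu_depth_v2_labeled.mat is a MATLAB v7.3 file, so h5py can open it directly.
with h5py.File("nyu_depth_v2_labeled.mat", "r") as f:
    labels = np.array(f["labels"])   # per-pixel class IDs for each labeled frame
    images = np.array(f["images"])   # the corresponding RGB frames

print(labels.shape, images.shape)
```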