Open linhaojia13 opened 8 months ago
The Scan2Cap task requires localizing instances of categories beyond the 18 pre-defined categories used in 3D detection.
To be specific, the categories in the original VoteNet implementation are:
{
'cabinet':0, 'bed':1, 'chair':2, 'sofa':3, 'table':4, 'door':5,
'window':6, 'bookshelf':7, 'picture':8, 'counter':9, 'desk':10, 'curtain':11,
'refrigerator':12, 'showercurtrain':13, 'toilet':14, 'sink':15, 'bathtub':16, 'garbagebin':17
}
However, both ScanRefer and Nr3D contain annotations for objects such as "shoes", "monitors", and "tvs", which are common in those 3D environments but fall outside the 18 categories above.
We follow the same category definition as Scan2Cap.
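To make the gap concrete, here is a minimal sketch of one way labels outside the fixed set could be handled: anything not in the VoteNet mapping falls back to a single catch-all id. The helper name `label_to_class` and the `OTHERS_ID` value are hypothetical, chosen only for illustration; they are not taken from the Scan2Cap or VoteNet code, and the actual category definition may differ.

```python
# The 18 categories from the original VoteNet implementation (copied from above,
# including the 'showercurtrain' spelling used there).
VOTENET_TYPE2CLASS = {
    'cabinet': 0, 'bed': 1, 'chair': 2, 'sofa': 3, 'table': 4, 'door': 5,
    'window': 6, 'bookshelf': 7, 'picture': 8, 'counter': 9, 'desk': 10, 'curtain': 11,
    'refrigerator': 12, 'showercurtrain': 13, 'toilet': 14, 'sink': 15,
    'bathtub': 16, 'garbagebin': 17,
}

# Assumption: every label outside the fixed set shares one catch-all id.
OTHERS_ID = len(VOTENET_TYPE2CLASS)


def label_to_class(label, type2class=VOTENET_TYPE2CLASS, others_id=OTHERS_ID):
    """Map a free-form object label to a class id, with a catch-all fallback."""
    # Normalize case and drop spaces so "Garbage Bin" matches 'garbagebin'.
    key = label.lower().replace(' ', '')
    return type2class.get(key, others_id)


if __name__ == '__main__':
    for name in ['chair', 'Garbage Bin', 'shoes', 'monitor', 'tv']:
        print(name, '->', label_to_class(name))
```

With this fallback, "shoes", "monitor", and "tv" all collapse into the same id, so such instances can still be localized even though the detector has no dedicated class for them.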