CVMI-Lab / PLA

(CVPR 2023) PLA: Language-Driven Open-Vocabulary 3D Scene Understanding & (CVPR 2024) RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding
Apache License 2.0

Reproducing scannet200 zeroshot #46

Closed · Outlying3720 closed this issue 4 months ago

Outlying3720 commented 5 months ago

Hi! I tried to process ScanNet200 by myself and evaluated with the provided checkpoints. However, I got the following results:

B170 hIoU/mIoU/IoU_base/IoU_novel: 16.89 / 20.28 / 21.48 / 13.91

B150 hIoU/mIoU/IoU_base/IoU_novel: 14.32 / 19.17 / 22.28 / 10.55

zeroshot 2024-06-20 10:57:30,169 INFO mIoU: 6.32 2024-06-20 10:57:30,169 INFO mAcc: 16.05

zeroshot+openscene 2024-06-20 11:06:15,668 INFO mIoU: 7.76 2024-06-20 11:06:15,668 INFO mAcc: 15.44

This is strange: the B150 and B170 eval results look normal, but the zero-shot results drop a lot.

I followed #42 and changed the .tsv file and the class remapper to generate the 200-class .pth files. I also tried using the official ScanNet processing code to generate them. (I found the difference is that the official code axis-aligns the point cloud.)

Could you please share your processing script, or give me more details on how to process ScanNet200? Thanks a lot!

jihanyang commented 5 months ago

I will check this and get back to you in a few days. Please stay tuned.

jihanyang commented 4 months ago

Hello @Outlying3720, thanks for your information. We found that this is caused by a missing ignore_idx for the ScanNet200 zero-shot setting. Since the paper reports foreground mIoU and mAcc by excluding background categories, these categories need to be added to ignore_idx.
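
For intuition, here is a minimal sketch of the effect (this is not the repository's actual evaluation code; NUM_CLASSES, IGNORE_IDX, and the BACKGROUND indices are illustrative):

import numpy as np

NUM_CLASSES = 200
IGNORE_IDX = -100            # points remapped to -100 are skipped entirely
BACKGROUND = {0, 2}          # hypothetical indices for background classes, e.g. wall/floor

def foreground_miou(pred, gt):
    """Mean IoU over foreground classes only, ignoring IGNORE_IDX points."""
    valid = gt != IGNORE_IDX
    ious = []
    for c in range(NUM_CLASSES):
        if c in BACKGROUND:  # the fix: background classes do not enter the mean
            continue
        inter = np.sum((pred == c) & (gt == c) & valid)
        union = np.sum(((pred == c) | (gt == c)) & valid)
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))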

We have fixed this problem in the new commit. Please check it out!

Outlying3720 commented 4 months ago

The numbers rose a little but are still low; maybe I preprocessed the data in the wrong way. Looking forward to your processing script.

zeroshot 2024-07-01 14:22:22,308 INFO mIoU: 7.69 2024-07-01 14:22:22,308 INFO mAcc: 17.28

zeroshot+openscene 2024-07-01 14:26:17,350 INFO mIoU: 8.13 2024-07-01 14:26:17,350 INFO mAcc: 17.68

jihanyang commented 4 months ago

Can you show me your command? I have just reproduced it on my local machine.

Outlying3720 commented 4 months ago

OK, I used the official ScanNet code to generate the preprocessed 200-class dataset, with the following modifications to match PointGroup's pipeline:

import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)

import sys
import os
import argparse
import glob
import json
from concurrent.futures import ProcessPoolExecutor
from itertools import repeat

import numpy as np
import pandas as pd
import torch

# Load external constants
from scannet200_constants import *
from scannet200_splits import *
from utils import *

CLOUD_FILE_PFIX = '_vh_clean_2'
SEGMENTS_FILE_PFIX = '.0.010000.segs.json'
AGGREGATIONS_FILE_PFIX = '.aggregation.json'
CLASS_IDs = VALID_CLASS_IDS_200

# Map valid ScanNet200 class IDs to contiguous indices {0, 1, ..., 199}; all other IDs become the ignore label -100
remapper = np.ones(1200) * (-100)
for i, x in enumerate(VALID_CLASS_IDS_200):
    remapper[x] = i
remapper = remapper.astype(int)


def handle_process(scene_path, output_path, labels_pd, train_scenes, val_scenes):

    scene_id = scene_path.split('/')[-1]
    mesh_path = os.path.join(scene_path, f'{scene_id}{CLOUD_FILE_PFIX}.ply')
    segments_file = os.path.join(scene_path, f'{scene_id}{CLOUD_FILE_PFIX}{SEGMENTS_FILE_PFIX}')
    aggregations_file = os.path.join(scene_path, f'{scene_id}{AGGREGATIONS_FILE_PFIX}')
    info_file = os.path.join(scene_path, f'{scene_id}.txt')

    if scene_id in train_scenes:
        output_file = os.path.join(output_path, 'train', f'{scene_id}.ply')
        split_name = 'train'
    elif scene_id in val_scenes:
        output_file = os.path.join(output_path, 'val', f'{scene_id}.ply')
        split_name = 'val'
    else:
        output_file = os.path.join(output_path, 'test', f'{scene_id}.ply')
        split_name = 'test'

    print('Processing: ', scene_id, 'in ', split_name)

    # Rotating the mesh to axis aligned
    info_dict = {}
    with open(info_file) as f:
        for line in f:
            (key, val) = line.split(" = ")
            info_dict[key] = np.fromstring(val, sep=' ')

    if 'axisAlignment' not in info_dict:
        rot_matrix = np.identity(4)
    else:
        rot_matrix = info_dict['axisAlignment'].reshape(4, 4)

    pointcloud, faces_array = read_plymesh(mesh_path)
    points = pointcloud[:, :3]
    colors = pointcloud[:, 3:6]
    alphas = pointcloud[:, -1]

    # points = np.array([list(x) for x in f.elements[0]])
    coords = np.ascontiguousarray(points - points.mean(0))
    colors = np.ascontiguousarray(colors) / 127.5 - 1
    pointcloud = np.append(coords, colors, axis=1)

    # Rotate PC to axis aligned
    # r_points = pointcloud[:, :3].transpose()
    # r_points = np.append(r_points, np.ones((1, r_points.shape[1])), axis=0)
    # r_points = np.dot(rot_matrix, r_points)
    # pointcloud = np.append(r_points.transpose()[:, :3], pointcloud[:, 3:], axis=1)

    # Load segments file
    with open(segments_file) as f:
        segments = json.load(f)
        seg_indices = np.array(segments['segIndices'])

    # Load Aggregations file
    with open(aggregations_file) as f:
        aggregation = json.load(f)
        seg_groups = np.array(aggregation['segGroups'])

    # Generate new labels
    labelled_pc = np.zeros((pointcloud.shape[0], 1))
    instance_ids = np.zeros((pointcloud.shape[0], 1))
    for group in seg_groups:
        segment_points, p_inds, label_id = point_indices_from_group(pointcloud, seg_indices, group, labels_pd, CLASS_IDs)

        labelled_pc[p_inds] = label_id
        instance_ids[p_inds] = group['id']

    labelled_pc = labelled_pc.astype(int)
    instance_ids = instance_ids.astype(int)

    # SAVE FOR REGIONPLC
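    # PointGroup-style per-scene .pth: a 4-tuple of centered xyz coords,
    # colors normalized to [-1, 1], semantic labels remapped to [0, 199]
    # (-100 = ignore), and per-point instance ids.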
    points = pointcloud[:, :3]
    sem_labels = remapper[labelled_pc].squeeze()
    instance_labels = instance_ids.squeeze()
    save_path = output_file.replace(".ply", ".pth")

    torch.save((points, colors, sem_labels, instance_labels), save_path)

    # # Concatenate with original cloud
    # processed_vertices = np.hstack((pointcloud[:, :6], labelled_pc, instance_ids))

    # if (np.any(np.isnan(processed_vertices)) or not np.all(np.isfinite(processed_vertices))):
    #     raise ValueError('nan')

    # # Save processed mesh
    # save_plymesh(processed_vertices, faces_array, output_file, with_label=True, verbose=False)

    # Uncomment the following lines if saving the output in voxelized point cloud
    # quantized_points, quantized_scene_colors, quantized_labels, quantized_instances = voxelize_pointcloud(points, colors, labelled_pc, instance_ids, faces_array)
    # quantized_pc = np.hstack((quantized_points, quantized_scene_colors, quantized_labels, quantized_instances))
    # save_plymesh(quantized_pc, faces=None, filename=output_file, with_label=True, verbose=False)

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--dataset_root', default="/opt/data/common/scannet_v2/scans", help='Path to the ScanNet dataset containing scene folders')
    parser.add_argument('--output_root', default="/root/private/dataset/pla/scannetv2_200/processed", help='Output path where train/val folders will be located')
    parser.add_argument('--label_map_file', default="/root/private/dataset/pla/scannetv2_200/scannetv2-labels.combined.tsv", help='path to scannetv2-labels.combined.tsv')
    parser.add_argument('--num_workers', default=32, type=int, help='The number of parallel workers')
    parser.add_argument('--train_val_splits_path', default='/root/private/dataset/pla/pla_scannetv2', help='Where the txt files with the train/val splits live')
    config = parser.parse_args()

    # Load label map
    labels_pd = pd.read_csv(config.label_map_file, sep='\t', header=0)

    # Load train/val splits
    with open(config.train_val_splits_path + '/scannetv2_train.txt') as train_file:
        train_scenes = train_file.read().splitlines()
    with open(config.train_val_splits_path + '/scannetv2_val.txt') as val_file:
        val_scenes = val_file.read().splitlines()

    # Create output directories ('test' included, since handle_process routes
    # scenes absent from both splits to a test folder)
    for split_name in ('train', 'val', 'test'):
        os.makedirs(os.path.join(config.output_root, split_name), exist_ok=True)

    # Load scene paths
    scene_paths = sorted(glob.glob(config.dataset_root + '/*'))

    # Preprocess data.
    pool = ProcessPoolExecutor(max_workers=config.num_workers)
    print('Processing scenes...')
    _ = list(pool.map(handle_process, scene_paths, repeat(config.output_root), repeat(labels_pd), repeat(train_scenes), repeat(val_scenes)))
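
For reference, I invoke it like this (the script name follows the original file linked below; the paths are my local argparse defaults from above):

python preprocess_scannet200.py \
    --dataset_root /opt/data/common/scannet_v2/scans \
    --output_root /root/private/dataset/pla/scannetv2_200/processed \
    --label_map_file /root/private/dataset/pla/scannetv2_200/scannetv2-labels.combined.tsv \
    --train_val_splits_path /root/private/dataset/pla/pla_scannetv2 \
    --num_workers 32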

The original file can be found at: https://github.com/ScanNet/ScanNet/blob/fcaa1773a9e186b22a4228df632b6999a1633b79/BenchmarkScripts/ScanNet200/preprocess_scannet200.py
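
As a quick sanity check, each saved file should load back as the 4-tuple written by torch.save above (the scene name here is just an example):

import torch

points, colors, sem_labels, instance_labels = torch.load('scene0000_00.pth')
print(points.shape, colors.shape, sem_labels.shape, instance_labels.shape)
# expected: (N, 3), (N, 3), (N,), (N,), with sem_labels in {-100, 0, ..., 199}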

jihanyang commented 4 months ago

Oh, sorry for the confusion. Actually, I meant your evaluation shell command. Since you get normal results for B150 and B170, I just want to make sure you selected the correct model during zero-shot evaluation.

Outlying3720 commented 4 months ago

I see, but I don't think there is anything wrong with the eval command:

python3 tools/test.py --cfg /opt/data/private/regionplc/tools/cfgs/scannet200_models/zs/spconv_clip_caption_openscene.yaml --extra_tag zs-author-op --ckpt output/ckpt/200-plc-op-s32.pth

and the checkpoint is the one you provided.

jihanyang commented 4 months ago

Maybe I can upload a processed version of the ScanNet200 dataset later. Please stay tuned.

jihanyang commented 4 months ago

Please check here: https://huggingface.co/datasets/jihanyang/RegionPLC_ScanNet200/tree/main.
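
If it helps, the whole dataset repo can be fetched with the Hugging Face CLI (assuming a recent huggingface_hub is installed; the target directory is just an example):

huggingface-cli download jihanyang/RegionPLC_ScanNet200 --repo-type dataset --local-dir ./scannet200_processed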