- The `run_prepare.sh` refers to `preprocess/run_prepare.sh`. Thanks for pointing out the link problem.
- The `version` is set to null by default; that's correct.
- You can find `scripts/inference.sh` in this repo, which was forked from Uni3D.
- It's a little tricky to install the open3d package. Since it is only used in their visualization code, we skipped installing it and ran the inference code directly (see the sketch below).
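If it helps, one way to do the same skip, assuming Uni3D lists its dependencies in a `requirements.txt` (the file name is an assumption; adjust to the actual repo layout):

```bash
# Install everything except open3d, which is only needed for visualization.
pip install -r <(grep -iv "open3d" requirements.txt)
```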
Thanks for your reply.
Is `Process data` performed after getting the instance segmentation of each scene using the pretrained Mask3D model? When I ran `bash preprocess/run_prepare.sh` using the `mask3d_inst_seg` you have provided, I met the following error:
`Process data` prepares the QA pairs for each task/dataset and transforms the GT IDs to the corresponding segmented IDs (based on the IoU between GT instances and segmented instances).

Hi @KaKa-101, we've updated the preprocessing code to rely only on the official annotations of each dataset.
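For illustration, here is a minimal sketch of that ID transform, assuming each instance is reduced to an axis-aligned box stored as (center, size) NumPy arrays; the function names and the 0.25 threshold are my own choices, not necessarily the repo's:

```python
import numpy as np

def box_iou_3d(center_a, size_a, center_b, size_b):
    """IoU of two axis-aligned 3D boxes given as (center, size) arrays."""
    min_a, max_a = center_a - size_a / 2, center_a + size_a / 2
    min_b, max_b = center_b - size_b / 2, center_b + size_b / 2
    inter = np.clip(np.minimum(max_a, max_b) - np.maximum(min_a, min_b), 0, None)
    inter_vol = inter.prod()
    union = size_a.prod() + size_b.prod() - inter_vol
    return inter_vol / union if union > 0 else 0.0

def map_gt_to_segments(gt_boxes, seg_boxes, iou_thresh=0.25):
    """Map each GT instance ID to the segmented instance ID with highest IoU."""
    mapping = {}
    for gt_id, (gc, gs) in gt_boxes.items():
        best_id, best_iou = None, iou_thresh
        for seg_id, (sc, ss) in seg_boxes.items():
            iou = box_iou_3d(gc, gs, sc, ss)
            if iou > best_iou:
                best_id, best_iou = seg_id, iou
        if best_id is not None:
            mapping[gt_id] = best_id
    return mapping
```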
Thanks for your great work. But when I prepared the environment for Uni3D and ran `pip install "git+git://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"` as guided, I got the following error:

Do you know how to fix it? I'd appreciate it very much.
I just followed the instructions of Uni3D to install the environment and got the same error as yours. The problem is related to the PyTorch version: their scripts install PyTorch 2.3.0, which is probably incompatible with pointnet2_ops for now. I switched to PyTorch 2.0.1 (the version we used before), and it works.

I recommend trying this:

```bash
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.8 -c pytorch -c nvidia -y
pip install "git+https://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"
```
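To confirm the rebuilt extension imports cleanly against the downgraded PyTorch, a quick check like this should succeed:

```bash
python -c "import torch, pointnet2_ops; print(torch.__version__)"
```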
Thanks for your advice; it works.
But I don't get the desired `segment_result_dir` when using pretrained Mask3D to do instance segmentation. First, I preprocessed the datasets as guided in Mask3D (`--scannet200` is set to False):

Then I ran validation on ScanNet like this:

But the `eval_output` folder is empty, and there are only logs in the `saved` folder:

So do you know how I should change the settings to get the instance segmentation results? Could you share what your `scannet_val.sh` looks like? Besides, I want to know why you chose `scannet200_val.pt` rather than `scannet.pt` as the pretrained checkpoint, since we are conducting experiments on the ScanNet dataset. Thanks very much for your help; I'd appreciate it a lot.

Also, could you show me what the `segment_result_dir` folder (the whole predicted results using the pretrained Mask3D checkpoint) looks like?
This is the script I used:

```bash
#!/bin/bash
export OMP_NUM_THREADS=3  # speeds up MinkowskiEngine

NUM_GPUS=1
CURR_DBSCAN=0.95
CURR_TOPK=750
CURR_QUERY=100

# TEST
python main_instance_segmentation.py \
general.gpus=${NUM_GPUS} \
general.experiment_name="test0416_scannet200" \
general.project_name="test0416" \
general.checkpoint='checkpoints/scannet200/scannet200_val.ckpt' \
data/datasets=scannet200 \
general.num_targets=201 \
data.num_labels=200 \
general.train_mode=false \
general.eval_on_segments=true \
general.train_on_segments=true \
model.num_queries=${CURR_QUERY} \
general.topk_per_image=${CURR_TOPK} \
general.use_dbscan=true \
general.dbscan_eps=${CURR_DBSCAN} \
general.save_visualizations=false \
general.export=true
```
Basically, you need to set `general.export=true` to let the model export predicted masks. The `eval_output` folder will look like this:

With the default setting, you can only get the result of the val split. To get the result of the train split, I directly changed this line to `self._load_yaml(database_path / f"train_database.yaml")`.

ScanNet and ScanNet200 share the same scene data. The only difference between them is the category annotation (the latter has 200 classes). In our experiments, we found that the model trained on ScanNet200 produces better instance segmentation results. Plus, we do not need the predicted class labels for our model, so using `scannet200_val.pt` should be OK.
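In case it's useful, here is a sketch of loading the exported predictions, assuming Mask3D exports in the standard ScanNet benchmark submission format (a per-scene `<scene_name>.txt` whose lines are `<relative_mask_path> <label_id> <score>`, plus a `pred_mask/` folder of per-point 0/1 masks); the exact layout may differ across versions, so treat this as an assumption:

```python
import numpy as np
from pathlib import Path

def load_exported_scene(export_dir, scene_name):
    """Parse one scene's exported instance predictions."""
    export_dir = Path(export_dir)
    instances = []
    for line in (export_dir / f"{scene_name}.txt").read_text().splitlines():
        mask_path, label_id, score = line.split()
        # Each mask file holds one 0/1 value per point of the scene.
        mask = np.loadtxt(export_dir / mask_path, dtype=np.int8).astype(bool)
        instances.append({"mask": mask, "label_id": int(label_id), "score": float(score)})
    return instances
```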
Thanks for your quick reply, which helps me a lot. I want to know: if I use `scannet200_val.pt` as the pretrained weight, do I need to set `--scannet200` to True in the dataset preprocessing step?
Yes, you need to set `--scannet200=true`.
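For reference, the ScanNet200 preprocessing call would then look roughly like this (paths are placeholders, and the entry point is the one from Mask3D's README as far as I remember, so double-check it):

```bash
python -m datasets.preprocessing.scannet_preprocessing preprocess \
--data_dir="PATH_TO_RAW_SCANNET" \
--save_dir="data/processed/scannet200" \
--git_repo="PATH_TO_SCANNET_GIT_REPO" \
--scannet200=true
```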
Thanks a lot. Besides, in the paper you mention a relation module designed to incorporate spatial information into a relation-aware token for each object in the scene-level alignment. Could you explain this part in detail, and point out which part of the provided code corresponds to the relation module?
We proposed the relation module to get a scene-aware token for each object here. However, we've discarded it in v2.1, since we found these scene-aware tokens do not help improve performance. We are still exploring better ways to incorporate position information into the model.
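Since the module is gone from v2.1, here is only an illustrative sketch of the general idea (pairwise center distances turned into per-head attention biases over the object tokens); this is my own reconstruction, not the authors' implementation:

```python
import torch
import torch.nn as nn

class RelationModule(nn.Module):
    """Illustrative relation module: fuses spatial information into each
    object token by biasing self-attention with pairwise center distances."""

    def __init__(self, dim, num_heads=8):
        super().__init__()
        self.num_heads = num_heads
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Maps a scalar pairwise distance to one additive bias per head.
        self.dist_bias = nn.Sequential(
            nn.Linear(1, num_heads), nn.ReLU(), nn.Linear(num_heads, num_heads)
        )

    def forward(self, obj_tokens, centers):
        # obj_tokens: (B, N, dim); centers: (B, N, 3)
        B, N, _ = obj_tokens.shape
        dist = torch.cdist(centers, centers).unsqueeze(-1)   # (B, N, N, 1)
        bias = self.dist_bias(dist).permute(0, 3, 1, 2)      # (B, H, N, N)
        bias = bias.reshape(B * self.num_heads, N, N)        # float attn mask
        out, _ = self.attn(obj_tokens, obj_tokens, obj_tokens, attn_mask=bias)
        return out  # one relation-aware token per object
```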
Thanks for your reply.
What's the difference between `scannet_train/val_attributes.pt` and `scannet_mask3d_train/val_attributes.pt`? Is `scannet_train/val_attributes.pt` used to provide spatial information between objects in the scene to the LLM, and `scannet_mask3d_train/val_attributes.pt` used to provide each object's attributes (class label, coordinates, segments, and so on) to the LLM?

`scannet_train/val_attributes.pt` saves the location (3D center and size, which can be transformed into a 3D bounding box) and class label of each GT instance, while `scannet_mask3d_train/val_attributes.pt` saves the location and class label of each segmented instance (from Mask3D). `scannet_train/val_attributes.pt` is only used for evaluation (to calculate the IoU between each GT instance's bbox and each segmented instance's bbox). We only input the segmented instances to the model (both for training and evaluation).
And I find that there is a file named scan2cap_val_corpus.json
in your provided annotations, but it seemed I can't generate this file using provided preprocessing code. So what's the function of this file and do I need it for subsequent training and inference?
It's used in the Scan2Cap evaluation. We referenced Scan2Cap's code to prepare this corpus. Basically, it converts each original description into "sos {description} eos".
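A minimal sketch of that conversion, assuming ScanRefer-style records with `scene_id`, `object_id`, and `description` fields; the key format and file paths here are hypothetical:

```python
import json

def build_corpus(annotations):
    """Group all GT descriptions per object, wrapped as 'sos ... eos'."""
    corpus = {}
    for ann in annotations:
        key = f"{ann['scene_id']}|{ann['object_id']}"
        corpus.setdefault(key, []).append(f"sos {ann['description']} eos")
    return corpus

with open("annotations/scanrefer_val.json") as f:       # hypothetical path
    corpus = build_corpus(json.load(f))
with open("annotations/scan2cap_val_corpus.json", "w") as f:
    json.dump(corpus, f, indent=2)
```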
Thanks for the provided Mask3D script. Do you know in which file I can see or configure all the parameters? For example, I want to know where the setting `general.export=true` takes effect; I haven't figured out how these parameters work in Mask3D's code. I would appreciate any help.
Thanks for your work. When I preprocessed the data as guided by the readme, I met the following two questions:

I can't find `inference.sh` yet, and I don't know whether other files in `preprocess` need to be updated or not.