lambert-x / ProLab

Official PyTorch implementation of the paper "A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties"
Apache License 2.0

The ../detection/ops directory and make.sh do not exist #1

Open 3202336152 opened 5 months ago

3202336152 commented 5 months ago

Hello, thank you for your excellent work. Where is the ../detection/ops directory referenced here? Execution fails at this step:

```
ln -s ../detection/ops ./
cd ops & sh make.sh
```

zzzqzhou commented 5 months ago

Hi, thanks for reaching out. I have just added ./ops to the repo; you can now follow the installation instructions and run the code.

3202336152 commented 5 months ago

Hello, thank you for your reply. After compiling ops, I still get the following error. How can I solve it?

```
Traceback (most recent call last):
  File "/home/sda/lyf/ProLab-main/test.py", line 11, in <module>
    import mmseg_custom  # noqa: F401,F403
  File "/home/sda/lyf/ProLab-main/mmseg_custom/__init__.py", line 3, in <module>
    from .models import *  # noqa: F401,F403
  File "/home/sda/lyf/ProLab-main/mmseg_custom/models/__init__.py", line 2, in <module>
    from .backbones import *  # noqa: F401,F403
  File "/home/sda/lyf/ProLab-main/mmseg_custom/models/backbones/__init__.py", line 2, in <module>
    from .beit_adapter import BEiTAdapter
  File "/home/sda/lyf/ProLab-main/mmseg_custom/models/backbones/beit_adapter.py", line 9, in <module>
    from mmseg.models.builder import BACKBONES
  File "/home/ys/anaconda3/envs/prolab/lib/python3.8/site-packages/mmseg/models/__init__.py", line 2, in <module>
    from .backbones import *  # noqa: F401,F403
  File "/home/ys/anaconda3/envs/prolab/lib/python3.8/site-packages/mmseg/models/backbones/__init__.py", line 2, in <module>
    from .beit import BEiT
  File "/home/ys/anaconda3/envs/prolab/lib/python3.8/site-packages/mmseg/models/backbones/beit.py", line 19, in <module>
    from .vit import TransformerEncoderLayer as VisionTransformerEncoderLayer
  File "/home/ys/anaconda3/envs/prolab/lib/python3.8/site-packages/mmseg/models/backbones/vit.py", line 9, in <module>
    from mmcv.cnn.bricks.transformer import FFN, MultiheadAttention
  File "/home/ys/anaconda3/envs/prolab/lib/python3.8/site-packages/mmcv/cnn/bricks/transformer.py", line 22, in <module>
    from mmcv.ops.multi_scale_deform_attn import \
  File "/home/ys/anaconda3/envs/prolab/lib/python3.8/site-packages/mmcv/ops/__init__.py", line 31, in <module>
    from .iou3d import boxes_iou_bev, nms_bev, nms_normal_bev
  File "/home/ys/anaconda3/envs/prolab/lib/python3.8/site-packages/mmcv/ops/iou3d.py", line 6, in <module>
    ext_module = ext_loader.load_ext('_ext', [
  File "/home/ys/anaconda3/envs/prolab/lib/python3.8/site-packages/mmcv/utils/ext_loader.py", line 15, in load_ext
    assert hasattr(ext, fun), f'{fun} miss in module {name}'
AssertionError: iou3d_boxes_iou_bev_forward miss in module _ext
```

zzzqzhou commented 5 months ago

Did you follow the installation instructions to create a conda environment, install all the requirements, and then compile the ops? I have tested all the steps on our server and they work. You may also take a look at this similar issue, https://github.com/open-mmlab/mmrazor/issues/487, and check whether you have both mmcv and mmcv-full installed in the same environment.
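A quick way to run that check is the minimal sketch below (not from the repo; it assumes the packages were installed via pip). Plain mmcv ships without the compiled C++/CUDA extensions, so if it coexists with or shadows mmcv-full, extension symbols like `iou3d_boxes_iou_bev_forward` can go missing, which matches the AssertionError above.

```python
# List which mmcv variants are installed in the current environment.
# Only mmcv-full should normally be present for this repo.
from importlib.metadata import version, PackageNotFoundError

for dist in ("mmcv", "mmcv-full"):
    try:
        print(f"{dist}: {version(dist)}")
    except PackageNotFoundError:
        print(f"{dist}: not installed")
```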

3202336152 commented 5 months ago

Hello, thank you for your reply. I have solved the above problem and successfully completed the test. I would like to ask whether your code provides visualized results, such as the segmented images.

zzzqzhou commented 5 months ago

Hi, if you want visualized results, you can add "--show-dir RESULT_DIR" to the evaluation script to generate them under the RESULT_DIR folder.
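For reference (an assumed mmsegmentation-style invocation with placeholder paths, not a command quoted from the repo docs), the full evaluation call would look something like `python test.py CONFIG_FILE CHECKPOINT_FILE --eval mIoU --show-dir RESULT_DIR`.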

3202336152 commented 5 months ago

Thank you for your enthusiastic reply. I would like to ask whether it is possible to segment an image given a text description and the picture, as shown in the example in the paper. Is there such a code implementation?

zzzqzhou commented 5 months ago

Hi, we have no plan to release this part of the code right now, sorry about that. But we may release some demos in the future. You can also implement this yourself: feed your language prompts/descriptions into a language embedding model (e.g., bge-base) to get language embeddings, then compute the cosine similarity between the per-pixel image embeddings (remember to resize them to the original image size; see https://github.com/lambert-x/ProLab/blob/aca12d7e597e1785829adb55576f3e754d1cf70c/mmseg_custom/models/segmentors/encoder_decoder_cluster_embed.py#L120) and the language embeddings to get a soft prediction. Then apply a threshold (usually 0.5; you may need to adjust it slightly) to get the binary segmentation map. For better segmentation, I recommend using descriptions/prompts from our provided descriptors.
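For illustration, a minimal sketch of this recipe might look like the following. It is not the official implementation: the function name `text_driven_mask`, the checkpoint id `BAAI/bge-base-en-v1.5`, and the use of sentence-transformers are assumptions for the example, and it presumes the segmentor's per-pixel embeddings live in the same space as the text embeddings (as in the paper).

```python
from typing import List

import torch
import torch.nn.functional as F
from sentence_transformers import SentenceTransformer


def text_driven_mask(image_embed: torch.Tensor,
                     descriptions: List[str],
                     threshold: float = 0.5) -> torch.Tensor:
    """Binary (H, W) mask of pixels matching the given descriptions.

    image_embed: (C, H, W) per-pixel embeddings from the segmentor,
    already resized (e.g., bilinearly) to the original image size.
    """
    # Embed the descriptions; normalize so dot products are cosine similarities.
    encoder = SentenceTransformer("BAAI/bge-base-en-v1.5")  # assumed checkpoint id
    text_embed = torch.from_numpy(
        encoder.encode(descriptions, normalize_embeddings=True))  # (N, D)

    C, H, W = image_embed.shape
    assert C == text_embed.shape[1], "image and text embeddings must share a space"

    # Flatten and unit-normalize the per-pixel embeddings.
    pixels = F.normalize(image_embed.permute(1, 2, 0).reshape(-1, C), dim=-1)
    sim = pixels @ text_embed.T            # cosine similarity, (H*W, N)
    soft = sim.mean(dim=-1).reshape(H, W)  # soft prediction over descriptions
    return soft > threshold                # binarize; 0.5 is a starting point
```

Averaging the similarities over the descriptions is just one simple way to combine multiple properties; max-pooling over them is another reasonable choice.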

3202336152 commented 5 months ago

Thank you for your reply. I'll try my best to implement your method, and I look forward to you posting the relevant demo code.