fudan-zvg / Semantic-Segment-Anything

Automated dense category annotation engine that serves as the initial semantic labeling for the Segment Anything dataset (SA-1B).
Apache License 2.0

Don't get any output after inference #12

Closed AvivSham closed 1 year ago

AvivSham commented 1 year ago

Hi All, Thank you for your amazing work and repo! I'm trying to run inference with the open-vocab model on some random image. I followed the installation instructions and completed them without any errors, then I ran inference as explained in the README file (see the attached screenshot). Inference completed without errors, just warnings, but the output path I provided when calling main.py was empty. What am I doing wrong?

Cheers,

Jiaqi-Chen-00 commented 1 year ago

This situation may occur when the data is not stored in the correct format, and the for loop exits before it even starts. Please set a breakpoint in the code (you can add "import ipdb; ipdb.set_trace()") to check whether the data is being read correctly. If this is indeed the problem, please strictly follow the file format in the Quick start and run the code again.
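A quick way to check this without a debugger is a small sanity script over the data directory. Note this is a hypothetical helper, not part of the repo, and the assumption that each image is paired with a same-stem JSON should be matched against the exact layout in the Quick start:

```python
import os

def find_unpaired_images(data_dir, json_suffix='.json'):
    """List images in data_dir that lack a matching SAM JSON.

    Hypothetical sanity check: assumes the Quick-start layout pairs each
    <name>.jpg/.png with a <name><json_suffix> file. If this returns a
    non-empty list, the inference loop will skip (or never visit) those
    images, which can leave the output directory empty.
    """
    unpaired = []
    for name in sorted(os.listdir(data_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() not in ('.jpg', '.jpeg', '.png'):
            continue
        if not os.path.isfile(os.path.join(data_dir, stem + json_suffix)):
            unpaired.append(name)
    return unpaired
```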

AvivSham commented 1 year ago

Not sure I follow, so let me explain my use case in detail. I have a folder containing multiple images without GT masks, e.g. img1.png, img2.png, img3.png, etc. I want to run inference on each image with the open-vocab model and get the corresponding segmentation map and the output JSON file described in the README. Do you support this use case? If you do, what are the steps I need to follow?

Jiaqi-Chen-00 commented 1 year ago

In this scenario, you need to first use SAM (https://github.com/facebookresearch/segment-anything) to generate a JSON and follow the file structure outlined in the Quick Start guide to run the Open Vocabulary labeling engine. To simplify your work, I have extracted some reference code from the SSA project. It may be helpful.

# the following code is in scripts/main_ssa.py
import os
import mmcv
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

sam = sam_model_registry["vit_h"](checkpoint=args.ckpt_path).to(rank)
mask_branch_model = SamAutomaticMaskGenerator(
    model=sam,
    points_per_side=64,
    pred_iou_thresh=0.86,
    stability_score_thresh=0.92,
    crop_n_layers=1,
    crop_n_points_downscale_factor=2,
    min_mask_region_area=100,  # requires opencv to run post-processing
    output_mode='coco_rle',
)
# modify this to load and pre-process your own data
img = img_load(args.data_dir, file_name, args.dataset)

# this code is in scripts/pipeline.py
anns = {'annotations': mask_branch_model.generate(img)}
mmcv.dump(anns, os.path.join(output_path, filename + '_semantic.json'))

AvivSham commented 1 year ago

So the flow for generating semantic maps is to first generate a JSON file per image using SAM and then run this repo? I followed the SAM repo and ran the amg.py file; for each image, 100 PNG masks were created (0.png, 1.png, ..., 99.png) along with a metadata.csv file, but I did not see a JSON file in the output dir. I think it's worth making the README clearer and providing an end-to-end inference pipeline instead of jumping between the two repos. WDYT?

AvivSham commented 1 year ago

Ok, so to get the JSON file you need to pass the --convert-to-rle flag when running the amg.py file in SAM.
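For reference, such an invocation might look like the following sketch; the checkpoint file and directories are placeholders you'd adjust to your own setup, and the flags are the ones exposed by scripts/amg.py in the segment-anything repo:

```shell
# paths are placeholders; adjust to your checkpoint and data layout
python scripts/amg.py \
    --checkpoint sam_vit_h_4b8939.pth \
    --model-type vit_h \
    --input /path/to/images \
    --output /path/to/output \
    --convert-to-rle   # write one COCO-RLE JSON per image instead of PNG masks
```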

deepbrainme commented 1 year ago

Out of memory on my single GPU (24GB) when I run main_ssa_engine.py. Would you help me figure out this bug? Thanks a lot.

AvivSham commented 1 year ago

Hi @deepbrainme, I don't think it's a bug, since running these large models in the same pipeline requires a lot of GPU memory, but I'll let @CJQ-CS-FD answer this one. What is the minimum GPU memory required?

vankeer commented 1 year ago

Out of memory on my single GPU (24GB) when I run main_ssa_engine.py. Would you help me figure out this bug? Thanks a lot.

I have the same issue on my RTX 3090, but not with all images. Try resizing your images to 720x720, then run SA, then SSA. Maybe someone more knowledgeable than me can explain why, but for me CUDA runs out of memory on oblong images even when they are smaller overall, like 720x480.
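As a rough sketch of that workaround (the 720x720 target is just the size suggested above, and the helper name is made up; note that a plain resize distorts the aspect ratio, so padding to a square is an alternative):

```python
import os
from PIL import Image

def resize_for_sam(src_dir, dst_dir, size=720):
    """Resize every image in src_dir to size x size before running SAM/SSA.

    Hypothetical pre-processing step based on the workaround above.
    Downscaling too aggressively can lose semantic detail, so keep the
    target size as large as your GPU memory allows.
    """
    os.makedirs(dst_dir, exist_ok=True)
    for name in os.listdir(src_dir):
        if not name.lower().endswith(('.jpg', '.jpeg', '.png')):
            continue
        img = Image.open(os.path.join(src_dir, name)).convert('RGB')
        img = img.resize((size, size))
        img.save(os.path.join(dst_dir, name))
```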

xiaoachen98 commented 1 year ago

Out of memory on my single GPU (24GB) when I run main_ssa_engine.py. Would you help me figure out this bug? Thanks a lot.

I have the same issue on my RTX 3090, but not with all images. Try resizing your images to 720x720, then run SA, then SSA. Maybe someone more knowledgeable than me can explain why, but for me CUDA runs out of memory on oblong images even when they are smaller overall, like 720x480.

+1, I'm running into the same problem.

AvivSham commented 1 year ago

Hi @vankeer @xiaoachen98, I tried running inference on ~512x512 images on a V100 GPU and saw it required ~27GB of memory. Did you try resizing the images dramatically, e.g. to 128x128, just to check whether it passes?

Jiaqi-Chen-00 commented 1 year ago

Hello @AvivSham @vankeer @xiaoachen98, we have updated our code; you can pull the latest version. The small version of the SSA-engine requires only 12GB of memory, while the base version takes 14GB. If you wish to use both the SSA-engine and SAM simultaneously, you will need more GPU memory. For more details, please refer to the Memory Usage section in the README. We do not recommend reducing the size of the images, as this may result in the loss of semantic information.