This situation can occur when the data is not stored in the correct format, so the for loop exits before it even starts. Please set a breakpoint in the code (you can add "import ipdb; ipdb.set_trace()") to check whether the data is being read correctly. If this is indeed the problem, please strictly follow the file format in the Quick Start and run the code again.
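For example, a minimal sketch of where such a breakpoint could go (the directory path and file filter below are illustrative, not the actual SSA code):

import os

data_dir = 'data/my_images'  # stand-in for args.data_dir
file_names = [f for f in os.listdir(data_dir) if f.endswith(('.png', '.jpg'))]

import ipdb; ipdb.set_trace()  # if file_names is empty here, the loop body never runs
for file_name in file_names:
    print(file_name)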
Not sure I follow; let me explain my use case in detail.
I have a folder containing multiple images without gt masks, e.g. img1.png, img2.png, img3.png, etc. I want to run inference on each image using the open-vocab model and get a corresponding segmentation map and the output JSON file described in the README.
Do you support this use case? If you do, what are the steps I need to follow?
In this scenario, you first need to use SAM (https://github.com/facebookresearch/segment-anything) to generate a JSON file per image, then follow the file structure outlined in the Quick Start guide to run the open-vocabulary labeling engine. To simplify your work, I have extracted some reference code from the SSA project; it may be helpful.
# the following code is adapted from the scripts/main_ssa.py file
import os

import mmcv
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

# load the SAM ViT-H checkpoint onto the current device
# (args, rank, file_name, filename and output_path come from the surrounding SSA script)
sam = sam_model_registry["vit_h"](checkpoint=args.ckpt_path).to(rank)
mask_branch_model = SamAutomaticMaskGenerator(
    model=sam,
    points_per_side=64,
    pred_iou_thresh=0.86,
    stability_score_thresh=0.92,
    crop_n_layers=1,
    crop_n_points_downscale_factor=2,
    min_mask_region_area=100,  # requires opencv to run post-processing
    output_mode='coco_rle',    # RLE-encoded masks, so they can be dumped to JSON
)

# modify this to load and pre-process your own data
img = img_load(args.data_dir, file_name, args.dataset)

# this code is from the scripts/pipeline.py file: generate masks and dump them as JSON
anns = {'annotations': mask_branch_model.generate(img)}
mmcv.dump(anns, os.path.join(output_path, filename + '_semantic.json'))
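For reference, a quick way to sanity-check the generated file (a minimal sketch; the filename is hypothetical, and the exact set of annotation keys comes from SamAutomaticMaskGenerator and may vary by version):

import json

with open('img1_semantic.json') as f:  # hypothetical output filename
    anns = json.load(f)['annotations']

print(len(anns), 'masks')
# each annotation should contain at least a COCO-RLE 'segmentation'
# ({'size': [H, W], 'counts': ...}) plus fields such as 'area' and 'bbox'
print(anns[0].keys())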
So the flow for generating semantic maps is: first generate a JSON file per image using SA, and then run this repo?
I followed the SA repo and ran the amg.py file; for each image, 100 PNG images were created (i.e. 0.png, 1.png, ..., 99.png) along with a metadata.csv file. I did not see a JSON file in the output dir.
I think it's worth making the README clearer and creating an e2e pipeline for inference instead of jumping between the two repos. WDYT?
Ok, so to get the JSON file you need to pass the --convert-to-rle flag when running the amg.py file in SA.
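For anyone who has already produced the PNG masks, here is a rough sketch of converting them to RLE-style JSON (this helper is not part of either repo, and SSA may expect more fields than the minimum shown; re-running amg.py with --convert-to-rle is the simpler route):

import json
import os

import cv2
import numpy as np
from pycocotools import mask as mask_utils

def pngs_to_rle_json(mask_dir, out_path):
    anns = []
    for name in sorted(f for f in os.listdir(mask_dir) if f.endswith('.png')):
        m = cv2.imread(os.path.join(mask_dir, name), cv2.IMREAD_GRAYSCALE)
        rle = mask_utils.encode(np.asfortranarray((m > 0).astype(np.uint8)))
        rle['counts'] = rle['counts'].decode('ascii')  # bytes -> str for JSON
        # minimal field set; SSA may also want bbox, predicted_iou, etc.
        anns.append({'segmentation': rle, 'area': int((m > 0).sum())})
    with open(out_path, 'w') as f:
        json.dump({'annotations': anns}, f)

pngs_to_rle_json('out/img1', 'img1_semantic.json')  # hypothetical paths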
I run out of memory on my single GPU (24 GB) when I run main_ssa_engine.py. Would you help me figure out this bug? Thanks a lot.
Hi @deepbrainme, I don't think it's a bug, since running these large models in the same pipeline requires a lot of GPU memory, but I will let @CJQ-CS-FD answer this one. What is the minimum GPU memory required?
I have the same issue with my RTX 3090, but not with all images. Try resizing your images to 720x720, then run SA, then SSA. Maybe someone more knowledgeable than me can explain why, but for me CUDA runs out of memory on oblong images even when they are smaller, like 720x480.
+1, I have the same problem.
Hi @vankeer @xiaoachen98, I tried running inference with ~512x512 images on a V100 GPU and saw it required ~27 GB of memory. Did you try resizing the images dramatically, e.g. to 128x128, just to check whether it passes?
Hello @AvivSham @vankeer @xiaoachen98, we have updated our code; you can pull the latest version. The small version of the SSA-engine requires only 12 GB of memory, while the base version takes 14 GB. If you wish to use the SSA-engine and SAM simultaneously, you will need more GPU memory. For more detailed information, please refer to the Memory Usage section in the README. We do not recommend reducing the size of the images, as this may result in the loss of semantic information.
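For anyone still hitting OOMs, a small sketch for measuring the actual peak GPU memory around the step you suspect (assumes PyTorch with CUDA; the placeholder comment marks where the SAM or SSA-engine call would go):

import torch

torch.cuda.reset_peak_memory_stats()

# ... run the suspected step here, e.g. mask generation or the SSA engine ...

peak_gb = torch.cuda.max_memory_allocated() / 1024 ** 3
print(f'peak GPU memory: {peak_gb:.1f} GB')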
Hi All, Thank you for your amazing work and repo! I'm trying to run inference with the open-vocab model on some random images. I followed the installation instructions and completed them without any errors, then I tried to run inference as explained in the README file (see the attached photo). The inference completed without errors, just warnings, but when I checked the output path I provided when calling main.py, it was empty. What am I doing wrong? Cheers,