Closed: nickponline closed this issue 10 months ago
@NielsRogge are we missing something here?
Unfortunately not - I tried a couple of things, but it never worked :(
I've tried to get this working starting with the same image and mask:
First I tried:
```python
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForUniversalSegmentation

image = Image.open('image.jpg').convert('RGB')
mask = Image.open('mask.png').convert('L')

processor = AutoProcessor.from_pretrained("shi-labs/oneformer_coco_swin_large")
semantic_inputs = processor(images=image, segmentation_maps=mask, task_inputs=["semantic"], return_tensors="pt")
processor.tokenizer.batch_decode(semantic_inputs.task_inputs)

model = AutoModelForUniversalSegmentation.from_pretrained("shi-labs/oneformer_coco_swin_large")
with torch.no_grad():
    outputs = model(**semantic_inputs)

semantic_segmentation = processor.post_process_semantic_segmentation(outputs, target_sizes=[image.size[::-1]])[0]
```
Gives error:

```
texts = ["a semantic photo"] * self.num_text
TypeError: can't multiply sequence by non-int of type 'NoneType'
```
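For what it's worth, that `TypeError` is just Python failing to repeat a list by `None` — i.e. the processor's `num_text` attribute was never set (presumably it is absent from that checkpoint's preprocessor config). A minimal reproduction outside transformers:

```python
num_text = None  # stand-in for the processor's unset num_text attribute

try:
    # this is the same operation the processor performs internally
    texts = ["a semantic photo"] * num_text
except TypeError as e:
    print(e)  # can't multiply sequence by non-int of type 'NoneType'
```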
So I tried:
```python
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForUniversalSegmentation

image = Image.open('image.jpg').convert('RGB')
mask = Image.open('mask.png').convert('L')

processor = AutoProcessor.from_pretrained("shi-labs/oneformer_coco_swin_large", num_text=1)
semantic_inputs = processor(images=image, segmentation_maps=mask, task_inputs=["semantic"], return_tensors="pt")
processor.tokenizer.batch_decode(semantic_inputs.task_inputs)

model = AutoModelForUniversalSegmentation.from_pretrained("shi-labs/oneformer_coco_swin_large")
with torch.no_grad():
    outputs = model(**semantic_inputs)

semantic_segmentation = processor.post_process_semantic_segmentation(outputs, target_sizes=[image.size[::-1]])[0]
```
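(A side note on that last line: the `[::-1]` is there because PIL reports `Image.size` as (width, height), while `target_sizes` expects (height, width). A plain-tuple sketch, with illustrative numbers:)

```python
size = (640, 480)         # PIL's Image.size order: (width, height)
target_size = size[::-1]  # reversed: (height, width), the order target_sizes expects
print(target_size)        # (480, 640)
```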
Error is then:

```
text_queries = nn.functional.normalize(text_queries.flatten(1), dim=-1)
AttributeError: 'NoneType' object has no attribute 'flatten'
```
@NielsRogge @praeclarumjj3 does this help? Perhaps I'm missing something from the docs?
A notebook has now been uploaded! https://github.com/NielsRogge/Transformers-Tutorials/blob/master/OneFormer/Fine_tune_OneFormer_for_semantic_segmentation.ipynb.
Thanks for pinging me on this
@NielsRogge - thanks a lot for providing a tutorial! Will try ASAP!
This seems to work, although it doesn't seem like you can calculate the loss when is_training=False. Is there a way to calculate it, for example as a validation loss?
Actually nevermind you can, validation is still training :) Can close!
The process for fine-tuning OneFormer seems different from MaskFormer and Mask2Former. No matter what I try, I can't get the model to work. Here's an example that I feel should work for semantic segmentation:
This crashes with the following output:
@werner-rammer did you have any success?