huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

OneFormer throws exception when training for instance segmentation #27572

Closed. nickponline closed this issue 8 months ago

nickponline commented 11 months ago

System Info


Who can help?

@amyeroberts @NielsRogge @praeclarumjj3

Information

Tasks

Reproduction

inputs.zip

import numpy as np
from transformers import AutoProcessor, AutoModelForUniversalSegmentation

id2label = {
        0 : "background",
        1 : "triangle",
        2 : "circle",
        3 : "rectangle",
}

# Load the Cityscapes-pretrained OneFormer checkpoint in training mode with a custom label set.
preprocessor = AutoProcessor.from_pretrained("shi-labs/oneformer_cityscapes_swin_large", do_resize=True, do_normalize=True, size=dict(width=500, height=500))
model = AutoModelForUniversalSegmentation.from_pretrained("shi-labs/oneformer_cityscapes_swin_large", is_training=True, id2label=id2label, ignore_mismatched_sizes=True)
# Tell the processor how many text entries to produce for training.
preprocessor.image_processor.num_text = model.config.num_queries - model.config.text_encoder_n_ctx

# Example image and per-pixel instance id map from the attached inputs.zip.
image = np.load("image.npy", allow_pickle=True)
instance_seg = np.load("instance_seg.npy", allow_pickle=True)

# Map each instance id in instance_seg to one of the semantic classes in id2label.
inst2class = {0: 0, 3: 1, 4: 1, 6: 1, 9: 1, 10: 1, 11: 1, 13: 1, 16: 1, 17: 1, 18: 1, 20: 1, 21: 1, 22: 1, 23: 1, 24: 1, 26: 1, 28: 1, 30: 1, 35: 1, 36: 1, 39: 1, 2: 2, 5: 2, 8: 2, 12: 2, 15: 2, 19: 2, 25: 2, 27: 2, 31: 2, 32: 2, 34: 2, 37: 2, 38: 2, 1: 3, 14: 3, 33: 3, 40: 3}

inputs = preprocessor(image, segmentation_maps=[instance_seg], instance_id_to_semantic_id=inst2class, task_inputs=["instance"], return_tensors="pt")

The call above fails with the following (truncated) traceback:

    inputs = preprocessor(image, segmentation_maps=[instance_seg], instance_id_to_semantic_id=inst2class, task_inputs=["instance"], return_tensors="pt")
  File "/opt/anaconda3/envs/dev/lib/python3.10/site-packages/transformers/models/oneformer/processing_oneformer.py", line 119, in __call__
    encoded_inputs = self.image_processor(images, task_inputs, segmentation_maps, **kwargs)
  File "/opt/anaconda3/envs/dev/lib/python3.10/site-packages/transformers/models/oneformer/image_processing_oneformer.py", line 535, in __call__
    return self.preprocess(images, task_inputs=task_inputs, segmentation_maps=segmentation_maps, **kwargs)
  File "/opt/anaconda3/envs/dev/lib/python3.10/site-packages/transformers/models/oneformer/image_processing_oneformer.py", line 738, in preprocess
    encoded_inputs = self.encode_inputs(
  File "/opt/anaconda3/envs/dev/lib/python3.10/site-packages/transformers/models/oneformer/image_processing_oneformer.py", line 1051, in encode_inputs
    masks = np.concatenate(masks, axis=0)
  File "<__array_function__ internals>", line 180, in concatenate

Expected behavior

No exception.
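
For context: had the preprocessing call succeeded, the next step in a typical OneFormer fine-tuning loop would be a training forward pass. A minimal sketch, assuming the processed inputs contain the usual OneFormer training keys (pixel_values, task_inputs, text_inputs, mask_labels, class_labels); if your transformers version emits different keys, adjust accordingly:

# Sketch only: continue from the preprocessing call above once it succeeds.
outputs = model(
    pixel_values=inputs["pixel_values"],
    task_inputs=inputs["task_inputs"],
    text_inputs=inputs["text_inputs"],
    mask_labels=inputs["mask_labels"],
    class_labels=inputs["class_labels"],
)
loss = outputs.loss  # combined mask / class / contrastive training loss
loss.backward()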

nickponline commented 11 months ago

@amyeroberts could the issue be this line: preprocessor.image_processor.num_text = model.config.num_queries - model.config.text_encoder_n_ctx? Does that not work for instance segmentation?

Also reported here: https://github.com/NielsRogge/Transformers-Tutorials/issues/370

nickponline commented 10 months ago

This is still an issue with the forward pass for instance segmentation using OneFormer.

amyeroberts commented 10 months ago

Hi @nickponline, thanks for raising this issue!

In the example provided, the error is occurring because none of the objects in the image correspond to a "thing" as defined in the metadata.

So, when preparing the inputs to the model, all of the masks are filtered out in this check here. The class_ids of the image being passed in don't correspond to the model's mapping.

Although this behaviour is expected, it does highlight a general difficulty of using this model, and it's an issue that's been raised in the past. We should be able to accept alternative (local or repo) metadata paths and load them in. I've opened a PR to address this: #28398
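
To make the failure concrete: for instance segmentation, masks whose class id is not listed as a "thing" in the image processor's metadata are dropped, so here none of the custom ids (0-3) survive, the masks list ends up empty, and np.concatenate raises. A minimal standalone illustration; the thing ids below are hypothetical placeholders, not the checkpoint's actual metadata:

import numpy as np

thing_ids = [24, 25, 26, 28, 31, 32, 33]   # hypothetical "thing" ids from a Cityscapes-style class-info file
class_ids_in_image = [0, 1, 2, 3]          # the custom background/triangle/circle/rectangle labels

# Mirrors the filtering described above: only "thing" classes keep their instance masks.
masks = [np.zeros((1, 500, 500)) for class_id in class_ids_in_image if class_id in thing_ids]
np.concatenate(masks, axis=0)  # ValueError: need at least one array to concatenate

A likely direction once alternative metadata can be loaded is to supply a class-info file that marks triangle, circle and rectangle as thing classes so their masks are kept; the exact file format and loading arguments depend on the image processor version, so treat that as an assumption rather than a confirmed API.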

github-actions[bot] commented 9 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.