huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

OneFormer throws exception when training for instance segmentation #27572

Closed. nickponline closed this issue 8 months ago

nickponline commented 11 months ago

System Info


Who can help?

@amyeroberts @NielsRogge @praeclarumjj3

Information

Tasks

Reproduction

inputs.zip

import numpy as np
from transformers import AutoProcessor, AutoModelForUniversalSegmentation

id2label = {
        0 : "background",
        1 : "triangle",
        2 : "circle",
        3 : "rectangle",
}

# Load the Cityscapes-pretrained OneFormer checkpoint in training mode with a custom label set.
preprocessor = AutoProcessor.from_pretrained("shi-labs/oneformer_cityscapes_swin_large", do_resize=True, do_normalize=True, size=dict(width=500, height=500))
model = AutoModelForUniversalSegmentation.from_pretrained("shi-labs/oneformer_cityscapes_swin_large", is_training=True, id2label=id2label, ignore_mismatched_sizes=True)
# Tell the processor how many text entries to produce for training.
preprocessor.image_processor.num_text = model.config.num_queries - model.config.text_encoder_n_ctx

# Example image and per-pixel instance id map from the attached inputs.zip.
image = np.load("image.npy", allow_pickle=True)
instance_seg = np.load("instance_seg.npy", allow_pickle=True)

# Map each instance id in instance_seg to one of the semantic classes in id2label.
inst2class = {0: 0, 3: 1, 4: 1, 6: 1, 9: 1, 10: 1, 11: 1, 13: 1, 16: 1, 17: 1, 18: 1, 20: 1, 21: 1, 22: 1, 23: 1, 24: 1, 26: 1, 28: 1, 30: 1, 35: 1, 36: 1, 39: 1, 2: 2, 5: 2, 8: 2, 12: 2, 15: 2, 19: 2, 25: 2, 27: 2, 31: 2, 32: 2, 34: 2, 37: 2, 38: 2, 1: 3, 14: 3, 33: 3, 40: 3}

inputs = preprocessor(image, segmentation_maps=[instance_seg], instance_id_to_semantic_id=inst2class, task_inputs=["instance"], return_tensors="pt")

The call above fails with the following (truncated) traceback:

    inputs = preprocessor(image, segmentation_maps=[instance_seg], instance_id_to_semantic_id=inst2class, task_inputs=["instance"], return_tensors="pt")
  File "/opt/anaconda3/envs/dev/lib/python3.10/site-packages/transformers/models/oneformer/processing_oneformer.py", line 119, in __call__
    encoded_inputs = self.image_processor(images, task_inputs, segmentation_maps, **kwargs)
  File "/opt/anaconda3/envs/dev/lib/python3.10/site-packages/transformers/models/oneformer/image_processing_oneformer.py", line 535, in __call__
    return self.preprocess(images, task_inputs=task_inputs, segmentation_maps=segmentation_maps, **kwargs)
  File "/opt/anaconda3/envs/dev/lib/python3.10/site-packages/transformers/models/oneformer/image_processing_oneformer.py", line 738, in preprocess
    encoded_inputs = self.encode_inputs(
  File "/opt/anaconda3/envs/dev/lib/python3.10/site-packages/transformers/models/oneformer/image_processing_oneformer.py", line 1051, in encode_inputs
    masks = np.concatenate(masks, axis=0)
  File "<__array_function__ internals>", line 180, in concatenate

Expected behavior

No exception.
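
For context: had the preprocessing call succeeded, the next step in a typical OneFormer fine-tuning loop would be a training forward pass. A minimal sketch, assuming the processed inputs contain the usual OneFormer training keys (pixel_values, task_inputs, text_inputs, mask_labels, class_labels); if your transformers version emits different keys, adjust accordingly:

# Sketch only: continue from the preprocessing call above once it succeeds.
outputs = model(
    pixel_values=inputs["pixel_values"],
    task_inputs=inputs["task_inputs"],
    text_inputs=inputs["text_inputs"],
    mask_labels=inputs["mask_labels"],
    class_labels=inputs["class_labels"],
)
loss = outputs.loss  # combined mask / class / contrastive training loss
loss.backward()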

nickponline commented 11 months ago

@amyeroberts could the issue be this line: preprocessor.image_processor.num_text = model.config.num_queries - model.config.text_encoder_n_ctx? Does that not work for instance segmentation?

Also reported here: https://github.com/NielsRogge/Transformers-Tutorials/issues/370

nickponline commented 10 months ago

This is still an issue with the forward pass for instance segmentation using OneFormer.

amyeroberts commented 10 months ago

Hi @nickponline, thanks for raising this issue!

In the example provided, the error is occurring because none of the objects in the image correspond to a "thing" as defined in the metadata.

So, when preparing the inputs to the model, all of the masks are filtered out in this check here. The class_ids of the image being passed in don't correspond to the model's mapping.

Although this behaviour is expected, it does highlight a general difficulty of using this model, and it's an issue that's been raised in the past. We should be able to accept alternative (local or repo) metadata paths and load them in. I've opened a PR to address this: #28398
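
To make the failure concrete: for instance segmentation, masks whose class id is not listed as a "thing" in the image processor's metadata are dropped, so here none of the custom ids (0-3) survive, the masks list ends up empty, and np.concatenate raises. A minimal standalone illustration; the thing ids below are hypothetical placeholders, not the checkpoint's actual metadata:

import numpy as np

thing_ids = [24, 25, 26, 28, 31, 32, 33]   # hypothetical "thing" ids from a Cityscapes-style class-info file
class_ids_in_image = [0, 1, 2, 3]          # the custom background/triangle/circle/rectangle labels

# Mirrors the filtering described above: only "thing" classes keep their instance masks.
masks = [np.zeros((1, 500, 500)) for class_id in class_ids_in_image if class_id in thing_ids]
np.concatenate(masks, axis=0)  # ValueError: need at least one array to concatenate

A likely direction once alternative metadata can be loaded is to supply a class-info file that marks triangle, circle and rectangle as thing classes so their masks are kept; the exact file format and loading arguments depend on the image processor version, so treat that as an assumption rather than a confirmed API.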

github-actions[bot] commented 9 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.