ViCCo-Group / thingsvision

Python package for extracting representations from state-of-the-art computer vision models
https://vicco-group.github.io/thingsvision/
MIT License

Shape mismatch error while extracting layers from InceptionResNetV2 #126

Closed nsossounov closed 1 year ago

nsossounov commented 1 year ago

Hi there,

Trying to extract layers from Inception-ResNet-V2 (layer 'block17_20_mixed' in this example):

import torch
from thingsvision import get_extractor
from thingsvision.utils.data import ImageDataset, DataLoader
from thingsvision.utils.storing import save_features

root = './images'
outpath = './outpath'
model_name = 'InceptionResNetV2'
module_name = 'block17_20_mixed'
source = 'keras'  # TensorFlow backend
batch_size = 10
class_names = None  # optional list of class names for class dataset
file_names = 'file_names.txt'  # optional list of file names according to which features should be sorted
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# initialize extractor module
extractor = get_extractor(
  model_name=model_name,
  pretrained=True,
  model_path=None,
  device=device,
  source=source,
)

dataset = ImageDataset(
  root=root,
  out_path=root,
  backend=extractor.get_backend(),
  transforms=extractor.get_transformations(),
)

batches = DataLoader(
  dataset=dataset,
  batch_size=batch_size,
  backend=extractor.backend,
)

features = extractor.extract_features(
  batches=batches,
  module_name=module_name,
  flatten_acts=False,
)

save_features(features, out_path='./block17_20_mixed/', file_format='npy')

Running the script produces a ValueError:


Batch:   0%|          | 0/400 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/XXXXX/things_vision_extraction.py", line 66, in <module>
    features = extractor.extract_features(
  File "/Volumes/Samsung_X5/interpreter/lib/python3.9/site-packages/thingsvision/core/extraction/base.py", line 95, in extract_features
    self._extract_features(
  File "/Volumes/Samsung_X5/interpreter/lib/python3.9/site-packages/thingsvision/core/extraction/mixin.py", line 136, in _extract_features
    activations = activation_model.predict(batch)
  File "/Volumes/Samsung_X5/interpreter/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/var/folders/p0/8zy4v5rd2rx1qkf2mvj0tg3r0000gn/T/__autograph_generated_filepjgbhsp0.py", line 15, in tf__predict_function
    retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
ValueError: in user code:

    File "/Volumes/Samsung_X5/interpreter/lib/python3.9/site-packages/keras/engine/training.py", line 1845, in predict_function  *
        return step_function(self, iterator)
    File "/Volumes/Samsung_X5/interpreter/lib/python3.9/site-packages/keras/engine/training.py", line 1834, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/Volumes/Samsung_X5/interpreter/lib/python3.9/site-packages/keras/engine/training.py", line 1823, in run_step  **
        outputs = model.predict_step(data)
    File "/Volumes/Samsung_X5/interpreter/lib/python3.9/site-packages/keras/engine/training.py", line 1791, in predict_step
        return self(x, training=False)
    File "/Volumes/Samsung_X5/interpreter/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
        raise e.with_traceback(filtered_tb) from None
    File "/Volumes/Samsung_X5/interpreter/lib/python3.9/site-packages/keras/engine/input_spec.py", line 264, in assert_input_compatibility
        raise ValueError(f'Input {input_index} of layer "{layer_name}" is '

    ValueError: Input 0 of layer "model" is incompatible with the layer: expected shape=(None, 299, 299, 3), found shape=(None, 224, 224, 3)

According to print(extractor.show_model()), the output shape of 'block17_20_mixed' is (None, 17, 17, 384).

I only get this problem with Inception-ResNet-V2. I'm also extracting layers from ResNet50, ResNet-152-V2, and VGG-16, and haven't run into this issue with those models. Is this an issue on my end?

Thanks in advance!

nsossounov commented 1 year ago

I am assuming something goes wrong within transforms=extractor.get_transformations(), in dataset = ImageDataset(), and the images are getting resized into the wrong shape. Not sure how I could fix that without creating a custom layer extraction script though.

Alxmrphi commented 1 year ago

@LukasMut - could this be related to the TODO marked in the get_default_transformation method of TensorFlowExtractor? Namely:

    def get_default_transformation(
        self,
        mean: List[float],
        std: List[float],
        resize_dim: int = 256,
        crop_dim: int = 224,
        apply_center_crop: bool = True,
    ) -> Any:
        resize_dim = crop_dim
        composes = [layers.experimental.preprocessing.Resizing(resize_dim, resize_dim)]
        if apply_center_crop:
            pass
            # TODO: fix center crop problem with Keras
            # composes.append(layers.experimental.preprocessing.CenterCrop(crop_dim, crop_dim))

i.e., the centre crop isn't being applied, and that is causing the mismatch?
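For context, a resize followed by a centre crop with the defaults in this signature (resize_dim=256, crop_dim=224) would trim a 16-pixel border from each side. Below is a minimal sketch of that arithmetic only. `center_crop_box` is a hypothetical helper for illustration and does not reproduce the Keras CenterCrop layer.

```python
from typing import Tuple


def center_crop_box(size: int, crop: int) -> Tuple[int, int]:
    """Return (start, stop) indices of a centred window of length `crop`."""
    if crop > size:
        raise ValueError("crop must not exceed size")
    start = (size - crop) // 2
    return start, start + crop


# Defaults from the signature above: resize to 256, then crop to 224.
start, stop = center_crop_box(256, 224)
print(start, stop, stop - start)
```

With or without the crop, the resulting side length under these defaults would be 224, since the method shown sets resize_dim = crop_dim when the crop is skipped.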

@ecktoh - could you perhaps check whether the issue persists after switching the backend to PyTorch? That should help us locate the problem.

Thanks!

LukasMut commented 1 year ago

@Alxmrphi I don't think so. From what it looks like, it has something to do with the initialization of the first layer of the model rather than with the inputs. The model expects inputs of size 299 x 299 x 3. However, the shape of the input is 224 x 224 x 3, which is what it's supposed to be when using the default transformations. @ecktoh, the images are not getting resized into the wrong shape. Thus, it has nothing to do with the extractor or the dataset class. Transformations are applied correctly.
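The mismatch reported in the traceback can be illustrated without TensorFlow. The following is a minimal sketch of the kind of comparison the error message describes, not Keras's actual implementation. `shapes_compatible` is a hypothetical helper, and `None` is assumed to act as a wildcard for the batch dimension.

```python
from typing import Optional, Sequence

Shape = Sequence[Optional[int]]


def shapes_compatible(expected: Shape, found: Shape) -> bool:
    """Return True if `found` matches `expected`, treating None as 'any size'."""
    if len(expected) != len(found):
        return False
    return all(e is None or f is None or e == f for e, f in zip(expected, found))


expected = (None, 299, 299, 3)  # input shape the model reports in the traceback
default = (None, 224, 224, 3)   # shape produced by the default transformations
resized = (None, 299, 299, 3)   # shape if images were resized to 299 instead

print(shapes_compatible(expected, default))
print(shapes_compatible(expected, resized))
```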

I see that @ecktoh passes root to both root and out_path of ImageDataset(...). Could you fix this and see whether the error is still raised?

I've just gone through our source code to see whether, for some weird reason, preprocess could evaluate to true, but this cannot happen if source = keras. Hence, the default transformations are applied. It's weird that it doesn't happen for the other models. @ecktoh, is the source you're using for the other models also keras?

My guess is that the model was pretrained with images of size 299 x 299 x 3 rather than with standard image size of 224 x 224 x 3. If that's the case, then @ecktoh you would have to fork our repo and change both crop and resize dim to 299 in the get_transformations method of the base extractor class until we allow flexible resize_dim and crop_dim arguments.
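For reference, the default input sizes in Keras Applications are consistent with this guess: of the four models mentioned in this thread, InceptionResNetV2 is the only one whose default input is 299 x 299. A small sketch follows. The `DEFAULT_INPUT_SIZE` table and the `needs_custom_dims` helper are hypothetical names for illustration and are not part of thingsvision.

```python
# Default square input side length (in pixels) for the ImageNet-pretrained
# Keras Applications models discussed in this thread.
DEFAULT_INPUT_SIZE = {
    "VGG16": 224,
    "ResNet50": 224,
    "ResNet152V2": 224,
    "InceptionResNetV2": 299,
}


def needs_custom_dims(model_name: str, transform_dim: int = 224) -> bool:
    """Return True if the model's default input size differs from `transform_dim`."""
    return DEFAULT_INPUT_SIZE[model_name] != transform_dim


# Models for which a 224 x 224 default transformation would not fit.
mismatched = [name for name in DEFAULT_INPUT_SIZE if needs_custom_dims(name)]
print(mismatched)
```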

Alxmrphi commented 1 year ago

I see that @ecktoh passes root to both root and out_path of ImageDataset. Could you fix this and see whether the error is still raised?

Ah, good spot! Yes, it looks like we should see what happens after these changes have been applied.

LukasMut commented 1 year ago

At this point, we should probably think about making resize_dim and crop_dim flexible arguments that a user can change if they want to. Default values can stay the same. I have never seen this error before, but flexible dimensions seem to be necessary for some models. @Alxmrphi, could you open this response as a new issue?

nsossounov commented 1 year ago

@LukasMut @Alxmrphi thanks a lot for the quick responses. Passing root to both root and out_path does not seem to be the cause of the problem: I stopped passing root to out_path and the error persists.

My guess is that the model was pretrained with images 299 x 299 x 3 rather than with standard image size of 224 x 224 x 3.

I've just gone through our source code to see whether, for some weird reason, preprocess could evaluate to true, but this cannot happen if source = keras. Hence, the default transformations are applied. It's also weird that it doesn't happen for the other models. @ecktoh, is the source you're using for the other models also keras?

@LukasMut I didn't know that the dimensions of the test images have to match those of the training data. Thanks for the suggestion to fork the repo, I'll look into that. Yes, the source for the other models is also keras, so I'm not sure why this error only happens when model_name = 'InceptionResNetV2'.

@ecktoh - could you perhaps check whether the issue persists after switching the backend to PyTorch?

@Alxmrphi so put backend='PyTorch' in DataLoader()? Or change the backend=extractor.get_backend() statement in ImageDataset()? If it's the latter I'm not quite sure exactly what to change it to. I'm quite new to CNNs so my apologies if my questions come off as a bit ignorant.

Alxmrphi commented 1 year ago

@ecktoh - Your questions are absolutely fine and welcome. We've all been new to this at some point, and that's nothing to apologise for! I suggest just holding off for a moment while we try to figure out an update that solves your problem. We'll post back when we have something for you to try :)

LukasMut commented 1 year ago

@ecktoh, could you try the following,

dataset = ImageDataset(
  root=root, # path/to/input/images
  out_path=out_path,  # path/to/output/features
  backend=extractor.get_backend(),
  transforms=extractor.get_transformations(resize_dim=299, crop_dim=299), # manually specify the input dimensions
  )

and see whether this fixes your problem? Don't change the backend! Keep it as it is.
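Rather than hard-coding 299, the dimension could in principle be read off the model's expected input shape. This is a sketch under stated assumptions: `square_input_dim` is a hypothetical helper, the shape is assumed to be channels-last as in the traceback, and how to obtain that shape from the extractor (for example via an underlying Keras model's input_shape) is not confirmed in this thread.

```python
from typing import Optional, Tuple


def square_input_dim(input_shape: Tuple[Optional[int], ...]) -> int:
    """Extract the side length from a channels-last shape such as (None, 299, 299, 3)."""
    if len(input_shape) != 4:
        raise ValueError(f"expected a 4D shape, got {input_shape}")
    _, height, width, _ = input_shape
    if height is None or width is None:
        raise ValueError("model accepts variable-sized inputs; choose a size explicitly")
    if height != width:
        raise ValueError(f"non-square input: {height} x {width}")
    return height


# Shape taken from the error message; in practice it would come from the model.
dim = square_input_dim((None, 299, 299, 3))
transform_kwargs = {"resize_dim": dim, "crop_dim": dim}
print(transform_kwargs)
```

The resulting dictionary could then be unpacked into the call shown above, i.e. extractor.get_transformations(**transform_kwargs).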

nsossounov commented 1 year ago

@LukasMut ingenious! That fixed it. I guess I should have dug around within the get_transformations method a bit more. Thank you!!

@Alxmrphi thanks for your words of encouragement!

LukasMut commented 1 year ago

@ecktoh Wonderful. I am glad that we could help. Happy further feature extracting!