facebookresearch / hiera

Hiera: A fast, powerful, and simple hierarchical vision transformer.
Apache License 2.0
717 stars, 36 forks

n_output_channels #32

Closed: liushawn618 closed this issue 1 month ago

liushawn618 commented 1 month ago

Hello,

Congrats again on the great work. What are the n_output_channels and downsample_rate of hiera_base_224?

dbolya commented 1 month ago

[screenshot]

B has 768 output channels with a 32x total downsample rate (224 -> 7).
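The numbers above can be checked with a few lines of arithmetic. This is only a sketch: the function name and the default values (patch stride 4, three 2x pooling steps, base width 96 doubling at each pooling step) are assumptions about the Hiera-B configuration, not values read from the repository.

```python
def hiera_output_shape(input_size=224, patch_stride=4, q_stride=2,
                       num_pools=3, embed_dim=96, dim_mul=2):
    """Return (total_stride, spatial_size, channels) for the last stage.

    Every default here is an assumed Hiera-B setting used for illustration.
    """
    # Each pooling step shrinks the feature map by q_stride per side.
    total_stride = patch_stride * q_stride ** num_pools
    spatial_size = input_size // total_stride
    # The channel width is multiplied by dim_mul at each pooling step.
    channels = embed_dim * dim_mul ** num_pools
    return total_stride, spatial_size, channels


print(hiera_output_shape())  # (32, 7, 768)
```

Under these assumptions the last stage is a 7 x 7 map with 768 channels, which matches the 32x total downsample rate quoted above.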

liushawn618 commented 1 month ago

Thank you for your timely reply. After changing n_output_channels to 768, I get RuntimeError: mat1 and mat2 shapes cannot be multiplied (64x1000 and 768x512). I then changed it as shown here: [image]. Now it runs, but is it OK to do it like this?

dbolya commented 1 month ago

Can you share which script you're trying to use this with? Whether that's correct depends on what those variables specifically mean in that script.

liushawn618 commented 1 month ago

I use this script "python scripts_method/train.py --demo --backbone hiera --setup p1 --method arctic_sf --trainsplit train --valsplit minival"

dbolya commented 1 month ago

> I use this script "python scripts_method/train.py --demo --backbone hiera --setup p1 --method arctic_sf --trainsplit train --valsplit minival"

In which repository is this?

liushawn618 commented 1 month ago

I want to use hiera to extract features in this repository: https://github.com/zc-alexfan/arctic

liushawn618 commented 1 month ago

The results don't look very good: [image]

liushawn618 commented 1 month ago

The model is defined here: https://github.com/zc-alexfan/arctic/blob/master/src/models/arctic_sf/model.py

And I changed 3 lines: [image]

dbolya commented 1 month ago

Can you push your changes to a local repository? From what I can tell, 768 is the correct number of final features, so something may be going wrong elsewhere.

liushawn618 commented 1 month ago

Thank you for your kind response. My local repository is https://github.com/liushawn618/hiera_arctic_sf

liushawn618 commented 1 month ago

The yellow line shows Hiera as the backbone. It seems to perform well during training but not as well during validation. [Screenshot 2024-06-05 2:46:37 PM]

[Screenshot 2024-06-05 2:47:11 PM] [Screenshot 2024-06-05 2:48:04 PM]

dbolya commented 1 month ago

Ah, I see what's going on. You're running Hiera in classification mode, not feature extraction. The backbone is returning ImageNet classification logits (batch x n_classes, which is why you see 64x1000).
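To make the shape mismatch concrete, here is a small sketch that reproduces the check a matrix multiply performs, using plain shape tuples instead of real tensors. The helper matmul_shape is hypothetical, and the (768, 512) weight shape is simply read off the error message quoted earlier in the thread.

```python
def matmul_shape(a, b):
    """Return the result shape of a @ b for two 2-D shapes, or raise ValueError."""
    (m, k1), (k2, n) = a, b
    if k1 != k2:
        # Inner dimensions must agree for the product to be defined.
        raise ValueError(
            f"mat1 and mat2 shapes cannot be multiplied ({m}x{k1} and {k2}x{n})"
        )
    return (m, n)


batch = 64
head_weight = (768, 512)  # 768 inputs, 512 outputs, as implied by the error

# Feature-extraction mode: one 768-dimensional vector per image.
print(matmul_shape((batch, 768), head_weight))  # (64, 512)

# Classification mode: 1000 class logits per image, so the head rejects it.
try:
    matmul_shape((batch, 1000), head_weight)
except ValueError as err:
    print(err)
```

Changing the head's input size to 1000 silences the error, but the head is then fed class logits rather than backbone features.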

The correct settings are an n_output_channels of 768 and a downsample_rate of 4 (though the latter seems to be unused). You should also revert the changes to the second argument of head_r, head_l, and head_o.

Then, to get intermediate features from Hiera, replace the backbone evaluation with:

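if self.args.hiera:  # placeholder flag; adapt to how the script selects the backbone
    # With return_intermediates=True the backbone also returns per-stage features
    _, features = self.backbone(images, return_intermediates=True)
    features = features[-1]  # Get features from last stage
else:
    features = self.backbone(images)
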
liushawn618 commented 1 month ago

Thank you so much! I tried again with your help, and now it's on the right track! [Screenshot 2024-06-05 7:58:21 PM]

dbolya commented 1 month ago

Great! Feel free to reopen this issue if you need more help.