facebookresearch / mae

PyTorch implementation of MAE https://arxiv.org/abs/2111.06377

resize to non-squared images #108

Open shuozhou opened 2 years ago

shuozhou commented 2 years ago

Thanks for the excellent work.

I tried to use a non-square input image size, since my data contains only people. However, from patchify() it seems the input is limited to square images. Is that correct?

daisukelab commented 2 years ago

Hi @shuozhou, I'm just a user of this repo, but I think my changes might help you.

I have extended this repo so that it accepts free width, height, and channel size, for example: https://github.com/nttcslab/msm-mae/blob/main/msm_mae/patch_msm_mae.diff#L196

You can find all the changes here: https://github.com/nttcslab/msm-mae/blob/main/msm_mae/patch_msm_mae.diff

Here is my repository that uses files from this MAE repo for our Masked Spectrogram Modeling. https://github.com/nttcslab/msm-mae
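
For readers who want to see the idea in isolation, here is a minimal sketch of a patchify/unpatchify pair that accepts any height, width, and channel count, as long as both sides are divisible by the patch size. It is written with NumPy for brevity and is not the code from either repository; the names `patchify_rect` and `unpatchify_rect` are hypothetical.

```python
import numpy as np


def patchify_rect(imgs, patch_size):
    """Split (N, C, H, W) images into (N, h*w, p*p*C) flattened patches.

    H and W may differ; each must be a multiple of patch_size.
    """
    n, c, height, width = imgs.shape
    p = patch_size
    if height % p != 0 or width % p != 0:
        raise ValueError("height and width must be divisible by patch_size")
    h, w = height // p, width // p  # patch grid, not necessarily square
    x = imgs.reshape(n, c, h, p, w, p)
    x = np.einsum("nchpwq->nhwpqc", x)  # move the patch grid to the front
    return x.reshape(n, h * w, p * p * c)


def unpatchify_rect(x, patch_size, grid_hw, channels):
    """Inverse of patchify_rect; grid_hw = (h, w) must be passed explicitly."""
    h, w = grid_hw
    p = patch_size
    n = x.shape[0]
    x = x.reshape(n, h, w, p, p, channels)
    x = np.einsum("nhwpqc->nchpwq", x)
    return x.reshape(n, channels, h * p, w * p)
```

Because the grid is no longer square, the inverse cannot recover `h` and `w` from the number of patches alone, so the sketch passes the grid shape explicitly. Any other component that assumes a square grid (positional embeddings, for example) would likely need a similar adjustment.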

UdonDa commented 2 years ago

@daisukelab Do you mean that after training with a fixed image size, the model can accept any input size? In addition, can your model accept any mask ratio? That is, if the model is trained with a 75% mask ratio, can it accept a mask ratio other than 75%? Thanks.

daisukelab commented 2 years ago

@UdonDa Why don't you visit https://github.com/nttcslab/msm-mae and check for yourself what is done there? ;) Our problem handles non-square input and a free mask ratio.
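
As a rough illustration of what a "free mask ratio" can mean mechanically, the sketch below implements per-sample random masking by sorting random noise, with the ratio passed as an ordinary argument. It uses NumPy, is not taken from either repository, and the function name and return values are assumptions made for this example. It says nothing about how well a model trained at one ratio performs at another.

```python
import numpy as np


def random_masking(x, mask_ratio, rng):
    """Keep a random subset of tokens per sample.

    x: (N, L, D) token sequence. Returns the kept tokens (N, len_keep, D),
    a binary mask (N, L) with 0 = kept and 1 = masked, and the indices
    that restore the original token order.
    """
    n, length, _ = x.shape
    len_keep = int(length * (1 - mask_ratio))

    noise = rng.random((n, length))            # one random score per token
    ids_shuffle = np.argsort(noise, axis=1)    # lowest scores are kept
    ids_restore = np.argsort(ids_shuffle, axis=1)

    ids_keep = ids_shuffle[:, :len_keep]
    x_kept = np.take_along_axis(x, ids_keep[:, :, None], axis=1)

    mask = np.ones((n, length))
    mask[:, :len_keep] = 0
    mask = np.take_along_axis(mask, ids_restore, axis=1)  # back to token order
    return x_kept, mask, ids_restore
```

Since `len_keep` is computed from `mask_ratio` on every call, nothing in this step ties the function to a particular ratio.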

UdonDa commented 2 years ago

@daisukelab Sorry. I wanted to ask if my understanding is correct. Thanks.