masked-image-modeling Search Results

huggingface/transformers #34280

cross attention mask is always zeros in mllama

### System Info transformers==4.45.2 When preparing the cross-attention mask in the `_prepare_cross_attention_mask` function, we get `cross_attn_mask` with the shape [batch,text_tokens,i…

xgal updated 3 days ago

X-LANCE/SLAM-LLM #167

[Question] How do I run inference on custom data with pretrained …

### System Info PyTorch version: 2.6.0.dev20241101+cu121 Is debug build: False CUDA used to build PyTorch: 12.1 ROCM used to build PyTorch: N/A OS: Ubuntu 22.04.4 LTS (x86_64) GCC version: (Ub…

ylee1123 updated 1 week ago

5g4s/paper #26

SimMIM: a Simple Framework for Masked Image Modeling

https://openaccess.thecvf.com/content/CVPR2022/papers/Xie_SimMIM_A_Simple_Framework_for_Masked_Image_Modeling_CVPR_2022_paper.pdf

5g4s updated 1 year ago

ouusan/some-papers #28

Boosting Efficiency

1. Revitalizing optimization for 3D human pose and shape estimation: A sparse constrained formulation (2021) code: No 2. Body meshes as points (2021) regarded as a two-class classification task (if a grid…

ouusan updated 5 days ago

microsoft/unilm #969

extending VLMO with MIM (Masked Image Modeling) loss

Thank you for sharing the source code of VLMO recently. We took a stab and pretrained a large (1024 hidden dim) multiway transformer with mim loss, mlm loss, and contrastive loss. BEIT3 pret…

jinxixiang updated 1 year ago

PKU-YuanGroup/LanguageBind #67

ValueError: Input image size (112*1036) doesn't match model …

Using the latest transformers 4.47.0.dev0: remove the import of _expand_mask and replace it with a custom definition: def _expand_mask(mask: torch.Tensor, dtype: torch.dtype, tgt_len: Optional[int] = None): """ Expands attention_mask from `[bs…

JeffRody updated 1 week ago

LTH14/mar #57

mask generation for training and inference with MAR

Hi author, thanks for the great work and the general concept about 'masked autoregressive'. When I try to use the model provided, I find that during training, the mask is generated randomly without an…

YOU-k updated 1 month ago

LTH14/mar #30

Difference Between MAR and MAGE

Hi Tianhong, thank you for your inspiring work! While reading the paper, I had some questions regarding the term “MAR.” Aside from the difference mentioned in the paper—where the next set of tokens in…

JeremyCJM updated 2 months ago

ouusan/some-papers #23

Exploiting Temporal Information

1. (HMMR) Learning 3D human dynamics from video (2019) temporal encoder: **1D temporal** convolutional layers, **precompute** the image features on each frame, get current and ±∆t frames prediction. c…

ouusan updated 1 month ago

facebookresearch/NeuralCompression #189

Implement PQ-MIM compression paper

We would like to have an implementation of the following paper: [Image Compression with Product Quantized Masked Image Modeling](https://arxiv.org/abs/2212.07372) Alaaeldin El-Nouby, Matthew J. Mu…

mmuckley updated 7 months ago

517 results for masked-image-modeling
