Sense-GVT / DeCLIP

Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm

About the BPE file #15

Closed kugwzk closed 2 years ago

kugwzk commented 2 years ago

Hi~ @zlccccc @SlotherCui I notice that there isn't a BPE file here. In your token embedding weights, the shape is [49409, 512], but in CLIP it is [49408, 512]. Is your BPE file consistent with CLIP's? If I missed something, please comment~ Thanks a lot!

zlccccc commented 2 years ago

We add an '<[mask]>' token to perform Masked Language Modeling in the language self-supervision, which is why the embedding has one extra row. Please refer to: https://github.com/Sense-GVT/DeCLIP/blob/main/prototype/model/utils/text_utils/simple_tokenizer.py#L73 https://github.com/Sense-GVT/DeCLIP/blob/main/prototype/model/text_encoder/text_transformer.py#L38
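To illustrate the shape difference: CLIP's BPE vocabulary has 49408 entries, and appending one extra mask token yields 49409, matching the [49409, 512] embedding weight. A minimal sketch of this bookkeeping (the vocabulary contents and the mask token's exact spelling are placeholders here; see the linked `simple_tokenizer.py` for the real implementation):

```python
CLIP_VOCAB_SIZE = 49408  # CLIP's BPE vocab, incl. <|startoftext|> / <|endoftext|>
EMBED_DIM = 512

# Placeholder vocab standing in for CLIP's real BPE token-to-id mapping.
vocab = {f"tok_{i}": i for i in range(CLIP_VOCAB_SIZE)}

# DeCLIP appends one extra token for MLM; its id is the next free index.
MASK_TOKEN = "<[mask]>"  # placeholder spelling; check simple_tokenizer.py
vocab[MASK_TOKEN] = len(vocab)  # id 49408

# The token embedding must then have one row per vocab entry.
embedding_shape = (len(vocab), EMBED_DIM)
print(embedding_shape)  # (49409, 512)
```

So the BPE merges file itself is identical to CLIP's; only the single appended mask token changes the vocabulary size, and hence the first dimension of the embedding matrix.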

SlotherCui commented 2 years ago

@kugwzk You can download the BPE file from here: https://github.com/Sense-GVT/DeCLIP/blob/main/docs/dataset_prepare.md