huggingface / pixparse

Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data

Patch size and window size are dropped #36

Closed by molbap 10 months ago

molbap commented 10 months ago

When initializing with the following config:

{
    "image_encoder": {
        "name": "swin_base_patch4_window12_384",
        "image_fmt": "L",
        "image_size": [
            1920,
            1408
        ],
        "patch_size": 8,
        "window_size": 16
    },
    "text_decoder": {
        "name": "facebook/bart-large",
        "num_decoder_layers": 6,
        "max_length": 1024,
        "pad_token_id": null,
        "qk_norm_cross": true
    }
}

on the bart_custom branch, I get:

Dropping extra args {'patch_size': 8, 'window_size': 16}

molbap commented 10 months ago

Solved in #35.
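
For readers landing here later, a "Dropping extra args" message like the one reported above typically appears when a config loader filters a raw dict against the fields its config class declares. The sketch below is only an illustration of that pattern, not pixparse's actual code: `ImageEncoderCfg`, `split_known_args`, and `build_cfg` are hypothetical names, and the dataclass deliberately omits `patch_size` and `window_size` to reproduce the symptom.

```python
import logging
from dataclasses import dataclass, fields
from typing import Any, Dict, Sequence, Tuple

_logger = logging.getLogger(__name__)


@dataclass
class ImageEncoderCfg:
    # Hypothetical config class (assumption): it declares no
    # patch_size / window_size fields, which is what triggers the drop.
    name: str = "swin_base_patch4_window12_384"
    image_fmt: str = "L"
    image_size: Sequence[int] = (1920, 1408)


def split_known_args(cfg_cls: type, raw: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split a raw dict into keys the dataclass declares and keys it does not."""
    known_names = {f.name for f in fields(cfg_cls)}
    known = {k: v for k, v in raw.items() if k in known_names}
    extra = {k: v for k, v in raw.items() if k not in known_names}
    return known, extra


def build_cfg(cfg_cls: type, raw: Dict[str, Any]) -> Any:
    """Build a config instance, logging (and discarding) undeclared keys."""
    known, extra = split_known_args(cfg_cls, raw)
    if extra:
        _logger.warning("Dropping extra args %s", extra)
    return cfg_cls(**known)


# The image_encoder block from the report above.
raw_image_encoder = {
    "name": "swin_base_patch4_window12_384",
    "image_fmt": "L",
    "image_size": [1920, 1408],
    "patch_size": 8,
    "window_size": 16,
}

cfg = build_cfg(ImageEncoderCfg, raw_image_encoder)
```

Under this pattern, a key is kept only if the config class declares a field with the same name, so declaring `patch_size` and `window_size` on the class would be one way to let them through. Whether #35 takes that route is not stated in this thread.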