allenai / vila

Incorporating VIsual LAyout Structures for Scientific Text Classification
Apache License 2.0

AssertionError: These char IDs get dropped in huggingface #37

Open rdmpage opened 1 year ago

rdmpage commented 1 year ago

When trying to train with the command ./train_ivila.sh grotoap2 row BLK microsoft/layoutlm-base-uncased, I get the following error:

AssertionError: These char IDs get dropped in huggingface: {63193}.
Dont forget to add: ['Co'] categories to unicode replacement

I confess that I don't understand what the ['Co'] categories are, nor where I should add them.
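
For context: 'Co' is the Unicode general category for private-use code points, and char ID 63193 is U+F6D9, which falls in the Basic Multilingual Plane's Private Use Area (U+E000 to U+F8FF). The sketch below only shows how to confirm that and one possible way to substitute such characters in the input text before tokenization. The helper name replace_private_use and the "?" placeholder are hypothetical and not part of vila; where the project's own "unicode replacement" list lives is not stated in this thread, so this is an assumption about a workaround rather than the intended fix.

```python
import unicodedata


def replace_private_use(text, placeholder="?"):
    """Replace every character in Unicode category 'Co' (private use).

    Hypothetical helper for illustration; not part of the vila codebase.
    """
    return "".join(
        placeholder if unicodedata.category(ch) == "Co" else ch
        for ch in text
    )


# Inspect the code point reported in the error message.
ch = chr(63193)
print(f"U+{ord(ch):04X}", unicodedata.category(ch))

# Substitute private-use characters in a sample string.
sample = "alpha" + ch + "beta"
print(replace_private_use(sample))
```

Applying a substitution like this to the dataset's tokens before training might avoid the assertion, on the assumption that the check fires only because the tokenizer drops the private-use character.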

msaoudallah commented 1 year ago

I'm getting the same error. Did you manage to solve it, @rdmpage?

rdmpage commented 1 year ago

@msaoudallah No, so I've given up :(