huggingface / pixparse

Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data

A major refactoring to cleanup tech debt, reduce code redundancy #24

Open rwightman opened 11 months ago

rwightman commented 11 months ago

Still a WIP, pretrain 'should' work. Most other things broken.

I'm still fighting the interplay between 'ModelCfg' (the architectural specification for the model) and the params/arguments passed through from the command line, which select the config (by name) and possibly override aspects of it... The tokenizer may need a similar relationship.
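One way to picture the name-plus-overrides relationship is the minimal sketch below. It is not the code in this PR: the fields on `ModelCfg`, the `MODEL_CFGS` registry, and `resolve_model_cfg` are all hypothetical names chosen for illustration. It assumes the command line supplies a config name plus a dict of optional overrides, where `None` means "not passed".

```python
from dataclasses import dataclass, fields, replace
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class ModelCfg:
    # Hypothetical architecture fields, for illustration only.
    image_size: int = 448
    embed_dim: int = 768
    num_layers: int = 12


# Hypothetical registry mapping a config name to its architecture spec.
MODEL_CFGS: Dict[str, ModelCfg] = {
    "small": ModelCfg(embed_dim=384, num_layers=6),
    "base": ModelCfg(),
}


def resolve_model_cfg(name: str, overrides: Optional[Dict[str, Any]] = None) -> ModelCfg:
    """Look up a named config, then apply command-line style overrides on top."""
    cfg = MODEL_CFGS[name]
    # Treat None as "argument not passed" so argparse defaults do not clobber the config.
    active = {k: v for k, v in (overrides or {}).items() if v is not None}
    valid = {f.name for f in fields(ModelCfg)}
    unknown = set(active) - valid
    if unknown:
        raise ValueError(f"Unknown ModelCfg fields: {sorted(unknown)}")
    # replace() returns a new frozen instance, so the registry entry is never mutated.
    return replace(cfg, **active)
```

For example, `resolve_model_cfg("small", {"image_size": 224})` would give the registered "small" architecture with only the image size changed. Keeping the registry entries frozen means an override can never leak back into the named config.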

The other big battle not yet won is setting up the tokens for tokenizers. I'd like task configs to have their tokens in the task-specific config for pretrain + finetune. Need to load the tokens, adjust vocab size in step with model creation and loading of pretrained weights... but don't want it to be a brittle section of cut & paste code. Ho hum.
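A rough sketch of the ordering described here (load the task's tokens, grow the vocab, then resize the embedding table while keeping pretrained rows) might look like the following. Everything in it is an assumption for illustration, not this repo's API: `TaskCfg`, `add_task_tokens`, `resize_embeddings`, and `prepare_for_task` are hypothetical names, the vocab is a plain dict with contiguous ids starting at 0, and the embedding table is a list of rows standing in for a real weight tensor.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TaskCfg:
    # Hypothetical: each task declares the extra tokens it needs.
    name: str
    extra_tokens: Tuple[str, ...] = ()


def add_task_tokens(vocab: Dict[str, int], task: TaskCfg) -> int:
    """Add the task's tokens to vocab in place; return how many were new."""
    added = 0
    for tok in task.extra_tokens:
        if tok not in vocab:
            # Assumes ids are contiguous from 0, so the next free id is len(vocab).
            vocab[tok] = len(vocab)
            added += 1
    return added


def resize_embeddings(weights: List[List[float]], vocab_size: int) -> List[List[float]]:
    """Return a table with vocab_size rows, copying existing rows and zero-filling new ones."""
    if vocab_size < len(weights):
        raise ValueError("shrinking the embedding table is not handled in this sketch")
    dim = len(weights[0])
    new_rows = [[0.0] * dim for _ in range(vocab_size - len(weights))]
    return [list(row) for row in weights] + new_rows


def prepare_for_task(
    vocab: Dict[str, int], weights: List[List[float]], task: TaskCfg
) -> List[List[float]]:
    """Single entry point: tokens first, then resize, so the two cannot drift apart."""
    add_task_tokens(vocab, task)
    return resize_embeddings(weights, len(vocab))
```

Routing pretrain and finetune through one function like `prepare_for_task` is one way to avoid the cut & paste concern, since the vocab size and the embedding rows are always updated together.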