facebookresearch / nougat

Implementation of Nougat Neural Optical Understanding for Academic Documents
https://facebookresearch.github.io/nougat/
MIT License
8.81k stars, 560 forks

Inconsistent model state with released BASE Model (Small Model OK) #158

Open kairan77 opened 11 months ago

kairan77 commented 11 months ago

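    from nougat import NougatModel
    from transformers import MBartForCausalLM

    # Load the released base checkpoint and pull out its decoder.
    nougat_model: NougatModel = NougatModel.from_pretrained("local-dir-path-to-original-base")
    nougat_model.eval()
    nougat_decoder = nougat_model.decoder.model

    # Round trip: save the decoder on its own, then reload it.
    nougat_decoder.save_pretrained("local-dir-path-to-decoder-model")
    nougat_decoder: MBartForCausalLM = MBartForCausalLM.from_pretrained("local-dir-path-to-decoder-model")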

After this save/load round trip, the decoder taken from the released base Nougat model no longer works correctly.

However, the decoder taken from the released small Nougat model still works correctly after the same round trip.

Is this difference expected, or is it due to a misconfiguration during the original base model export?
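One way to narrow this down might be to compare the decoder's parameters before and after the round trip. The following is only a sketch: `diff_state_dicts` and its `tensors_equal` argument are hypothetical names introduced here and are not part of nougat or transformers. It assumes the two inputs are the mappings returned by `state_dict()` on the in-memory decoder and on the reloaded one.

```python
from typing import Any, Callable, Dict, List, Mapping


def diff_state_dicts(
    before: Mapping[str, Any],
    after: Mapping[str, Any],
    tensors_equal: Callable[[Any, Any], bool],
) -> Dict[str, List[str]]:
    """Report parameter names that differ between two state dicts.

    Hypothetical helper for illustration only. `tensors_equal` is the
    comparison to apply to each pair of values, for example `torch.equal`
    when the values are torch tensors on the same device.
    """
    before_keys, after_keys = set(before), set(after)
    shared = sorted(before_keys & after_keys)
    return {
        # Present in the original decoder but absent after reloading.
        "missing_after_reload": sorted(before_keys - after_keys),
        # Present only in the reloaded decoder.
        "unexpected_after_reload": sorted(after_keys - before_keys),
        # Present in both, but the values no longer match.
        "changed_values": [
            name for name in shared
            if not tensors_equal(before[name], after[name])
        ],
    }


# Intended use with the snippet from this issue (assumes torch is installed):
#   import torch
#   original = nougat_model.decoder.model.state_dict()
#   reloaded = MBartForCausalLM.from_pretrained(
#       "local-dir-path-to-decoder-model").state_dict()
#   print(diff_state_dicts(original, reloaded, torch.equal))
```

If all three lists come back empty for the small model but not for the base model, the names in them should point at what the round trip drops or re-initializes. Passing `output_loading_info=True` to `from_pretrained` may also help, since it returns the missing, unexpected, and mismatched keys seen while loading.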