Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
How to approach model distillation for creating a smaller + faster model #148
I am interested in an implementation of knowledge distillation for this specific model. This technique would allow us to transfer the knowledge and performance of a larger, resource-intensive model (the "teacher") to a smaller, more lightweight counterpart (the "student").
Any input from the community would be really helpful. How should I approach this problem?
PS: I got this idea from PP-StructureV2, where they used FGD (Focal and Global Knowledge Distillation for Detectors) for model size reduction; source: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/ppstructure/docs/models_list_en.md
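For context, a common starting point before FGD is classic logit distillation (Hinton-style soft targets): train the student on a blend of its normal task loss and a KL term that pulls its temperature-softened class distribution toward the frozen teacher's. Since TATR is DETR-based, in practice this would be applied per object query alongside DETR's Hungarian-matched losses, and FGD would additionally distill backbone feature maps. The sketch below is illustrative only; all function names, logits, and hyperparameters (`T`, `alpha`) are assumptions, not part of the TATR codebase.

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax: higher T softens the distribution,
    # exposing the teacher's "dark knowledge" about near-miss classes.
    exps = [math.exp(l / T) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kd_loss(student_logits, teacher_logits, T=2.0):
    # Soft-target loss: KL(teacher || student) at temperature T,
    # scaled by T^2 so its gradient magnitude matches the hard-label loss.
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = sum(p * math.log(p / q) for p, q in zip(p_t, p_s))
    return (T ** 2) * kl

def distillation_objective(student_logits, teacher_logits, hard_loss,
                           alpha=0.5, T=2.0):
    # Blend the ordinary task loss (for TATR, DETR's set-prediction loss
    # against ground truth) with the soft-target loss from a frozen teacher.
    return alpha * hard_loss + (1 - alpha) * kd_loss(
        student_logits, teacher_logits, T)

# Illustrative per-query class logits, e.g. ("table", "table rotated", "no object")
teacher = [4.0, 1.0, 0.5]   # full-size teacher's prediction
student = [2.5, 1.2, 0.8]   # smaller student, mid-training
loss = distillation_objective(student, teacher, hard_loss=0.7)
print(f"combined loss: {loss:.4f}")
```

In a real setup you would run the teacher in `torch.no_grad()` mode, compute this per decoder query after matching queries to the same targets, and keep the box-regression losses purely against ground truth; FGD replaces the logit term with focal- and attention-weighted feature-map matching.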