huggingface / pixparse

Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data
13 stars 3 forks source link

Pablo/add donut comparison task #5

Closed molbap closed 1 year ago

molbap commented 1 year ago

What does this PR do?

molbap commented 1 year ago

This adds a breaking change where we now have to specify which task we are launching in the main interface.

molbap commented 1 year ago

I think this is up for merge. @rwightman the main change is that you have to specify which task you want to launch. For training it's

python -m pixparse.app.train \
  --eval.task-name cruller_pretrain \

And for instance to launch an eval

python -m pixparse.app.eval \
  --eval.task-name cruller_eval_ocr \
molbap commented 1 year ago

amending that... refactoring the task_factory method. Not ready to merge yet.

molbap commented 1 year ago

This PR has drifted a bit. Added a common task-name argument and a TaskFactory.

class TaskFactory:
    """
    This class registers existing tasks and propagates corresponding configurations.
    """
    TASK_CONFIG_REGISTRY = {
        'cruller_eval_ocr': TaskCrullerEvalOCRCfg,
        'donut_eval_ocr': TaskDonutEvalOCRCfg,
        'cruller_pretrain': TaskCrullerPretrainCfg
    }

    TASK_CLASS_REGISTRY = {
        'cruller_eval_ocr': TaskCrullerEvalOCR,
        'donut_eval_ocr': TaskDonutEvalOCR,
        'cruller_pretrain': TaskCrullerPretrain
    }

Makes the interfaces

app/train.py
app/eval.py

more generic and less specific to cruller_pretrain.

So that for pretraining cruller one has to now add

python -m pixparse.app.train \
  --task-name cruller_pretrain \

tested on the cluster.

rwightman commented 1 year ago

This PR has drifted a bit. Added a common task-name argument and a TaskFactory.

class TaskFactory:
    """
    This class registers existing tasks and propagates corresponding configurations.
    """
    TASK_CONFIG_REGISTRY = {
        'cruller_eval_ocr': TaskCrullerEvalOCRCfg,
        'donut_eval_ocr': TaskDonutEvalOCRCfg,
        'cruller_pretrain': TaskCrullerPretrainCfg
    }

    TASK_CLASS_REGISTRY = {
        'cruller_eval_ocr': TaskCrullerEvalOCR,
        'donut_eval_ocr': TaskDonutEvalOCR,
        'cruller_pretrain': TaskCrullerPretrain
    }
  • 2 class methods to generate a task configuration, and a task. New tasks (finetuning, warmup, etc) have to be added to the factory, then, specific methods should be defined within the new task specific classes.

Makes the interfaces

app/train.py
app/eval.py

more generic and less specific to cruller_pretrain.

So that for pretraining cruller one has to now add

python -m pixparse.app.train \
  --task-name cruller_pretrain \

tested on the cluster.

Cool, that's looking good so far, looks like we may end up merging the task config & class registry? Or perhaps the task class should have a classmethod to get the config class type?

molbap commented 1 year ago

For now it looks like TaskCfg and Task have always a 1:1 match. The two classes have different roles and their merit in existing independently, but the factory + create_task can instantiate one task and its config.

Now the registry is

    TASK_CLASS_REGISTRY = {
        'cruller_eval_ocr': (TaskCrullerEvalOCR, TaskCrullerEvalOCRCfg),
        'donut_eval_ocr': (TaskDonutEvalOCR, TaskDonutEvalOCRCfg),
        'cruller_pretrain': (TaskCrullerPretrain, TaskCrullerPretrainCfg)
    }

tested on the cluster for cruller_pretrain and cruller_eval_ocr.