Postprocess: task-specific evaluation

onurgu commented 10 months ago

Task- and dataset-specific postprocessing is needed, similar to the existing preprocessing steps. It should be integrated into the current postprocess_text function, available here: https://github.com/boun-llm/t5-tuner/blob/c4d4e9cfc416e871bb74bcbc1b243a0fd70f0a49/src/utils.py#L189

The changes should mirror the structure of the preprocessing approach. The current usage of the postprocess_text function is in finetune.py: https://github.com/boun-llm/t5-tuner/blob/c4d4e9cfc416e871bb74bcbc1b243a0fd70f0a49/src/finetune.py#L97

The main task is to implement a postprocess_dataset_task function for each dataset and task combination that requires postprocessing, so that the model's outputs are processed according to the requirements of each dataset-task pair.
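
To make the proposal concrete, here is a minimal sketch of how per-pair postprocessing could be dispatched. It is not the code in utils.py: the (preds, labels) signature of postprocess_text, the example_nli dataset name, the label set, and the get_postprocessor helper are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple


def postprocess_text(preds: List[str], labels: List[str]) -> Tuple[list, list]:
    """Generic fallback (assumed signature): strip surrounding whitespace."""
    return [p.strip() for p in preds], [l.strip() for l in labels]


def postprocess_example_nli(preds: List[str], labels: List[str]) -> Tuple[list, list]:
    """Hypothetical postprocess_dataset_task function for an NLI dataset.

    Maps generated label strings to integer ids; anything the model
    generates outside the label set becomes -1 so it counts as wrong.
    """
    label_to_id = {"entailment": 0, "neutral": 1, "contradiction": 2}  # assumed labels
    preds, labels = postprocess_text(preds, labels)
    pred_ids = [label_to_id.get(p.lower(), -1) for p in preds]
    label_ids = [label_to_id.get(l.lower(), -1) for l in labels]
    return pred_ids, label_ids


# Registry keyed by (dataset, task); the keys are illustrative only.
POSTPROCESSORS: Dict[Tuple[str, str], Callable] = {
    ("example_nli", "nli"): postprocess_example_nli,
}


def get_postprocessor(dataset: str, task: str) -> Callable:
    """Return the pair-specific function, or the generic one if none is registered."""
    return POSTPROCESSORS.get((dataset, task), postprocess_text)


if __name__ == "__main__":
    fn = get_postprocessor("example_nli", "nli")
    print(fn([" Entailment ", "maybe"], ["entailment", "neutral"]))
```

Falling back to postprocess_text keeps the current behaviour for any dataset-task pair that has no dedicated function yet.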

zeynepyirmibes commented 9 months ago

Added postprocessing and task-specific evaluation for the NLI and STS datasets in PR #15.

gokceuludogan commented 9 months ago

With the new structure, postprocess functions must be implemented in the corresponding dataset classes. Once a function is implemented, the task-specific metrics must be included in load_task_metrics: https://github.com/boun-llm/turkish-lm-tuner/blob/4b2e91df04e928473cd0ec355ac13ca6c656774e/src/metrics.py#L148-L168
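
A rough sketch of that two-step flow, under stated assumptions: the class names, the postprocess_data method name, the fallback score of 0.0, and the metric choices are hypothetical, and load_task_metrics_sketch is a stand-in that does not reproduce the real load_task_metrics in metrics.py.

```python
from typing import Callable, Dict, List


class BaseDataset:
    """Hypothetical base class: default postprocessing only strips whitespace."""

    def postprocess_data(self, examples: List[str]) -> list:
        return [e.strip() for e in examples]


class ExampleSTSDataset(BaseDataset):
    """Hypothetical STS dataset: generated text is parsed into a float score."""

    def postprocess_data(self, examples: List[str]) -> list:
        scores = []
        for text in super().postprocess_data(examples):
            try:
                scores.append(float(text))
            except ValueError:
                scores.append(0.0)  # assumed fallback for unparsable output
        return scores


def accuracy(preds: list, labels: list) -> float:
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)


def mean_absolute_error(preds: List[float], labels: List[float]) -> float:
    return sum(abs(p - l) for p, l in zip(preds, labels)) / len(labels)


def load_task_metrics_sketch(task: str) -> Dict[str, Callable]:
    """Stand-in for a task-to-metrics lookup; the mapping is illustrative."""
    task_metrics = {
        "classification": {"accuracy": accuracy},
        "sts": {"mae": mean_absolute_error},
    }
    if task not in task_metrics:
        raise ValueError(f"No metrics registered for task: {task}")
    return task_metrics[task]


if __name__ == "__main__":
    dataset = ExampleSTSDataset()
    preds = dataset.postprocess_data([" 4.5 ", "n/a"])
    metrics = load_task_metrics_sketch("sts")
    print({name: fn(preds, [4.0, 1.0]) for name, fn in metrics.items()})
```

Keeping postprocessing on the dataset class and metric selection keyed by task means a new dataset only needs its own postprocess method plus one entry in the metrics lookup.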

gokceuludogan commented 8 months ago

Closing this as it has already been completed.

boun-tabi-LMG / turkish-lm-tuner

Postprocess: task-specific evaluation #7