Currently, InstructLab does not publish any metrics per taxonomy leaf node. We would like to explore ways to evaluate the InstructLab model being fine-tuned via the taxonomy approach, and to define metrics for each taxonomy leaf node.
Each leaf node in the taxonomy represents a particular skill or body of knowledge. Here is one example: https://github.com/instructlab/taxonomy/blob/main/compositional_skills/linguistics/complete_common_expressions/qna.yaml. Each leaf's qna.yaml contains question-and-answer pairs. We would like to track, over time, how many questions from these YAML files the model answers correctly (for some definition of correctness).
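
As a rough sketch of what per-leaf scoring could look like, the Python below parses a leaf's qna.yaml and reports how many questions a model answers "correctly" under a simple token-overlap F1 threshold. The `seed_examples` layout with `question`/`answer` fields follows the example file linked above; `query_model` and the 0.5 threshold are placeholder assumptions, not part of InstructLab, and the F1 metric is just one possible definition of correctness.

```python
from collections import Counter
from typing import Callable

import yaml


def load_qna_pairs(qna_path: str) -> list[dict]:
    """Read question/answer pairs from a taxonomy leaf's qna.yaml.

    Assumes pairs live under a top-level `seed_examples` key, each with
    `question` and `answer` fields, as in the linked example file.
    """
    with open(qna_path, encoding="utf-8") as f:
        data = yaml.safe_load(f)
    return [
        {"question": ex["question"], "answer": ex["answer"]}
        for ex in data.get("seed_examples", [])
    ]


def token_f1(prediction: str, reference: str) -> float:
    """SQuAD-style token-overlap F1 -- one simple stand-in for correctness."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def score_leaf_node(
    qna_path: str,
    query_model: Callable[[str], str],  # hypothetical: question -> model answer
    threshold: float = 0.5,  # assumed cutoff for calling an answer "correct"
) -> dict:
    """Return per-leaf metrics: how many questions the model answers correctly."""
    pairs = load_qna_pairs(qna_path)
    scores = [token_f1(query_model(p["question"]), p["answer"]) for p in pairs]
    num_correct = sum(s >= threshold for s in scores)
    return {
        "num_questions": len(pairs),
        "num_correct": num_correct,
        "accuracy": num_correct / len(pairs) if pairs else 0.0,
    }
```

Running `score_leaf_node` against the same leaf after each fine-tuning run (with `query_model` wired to whatever endpoint serves the fine-tuned model) would yield a per-leaf accuracy that can be tracked over time; more sophisticated correctness judges (e.g., an LLM-as-judge) could be swapped in for `token_f1` without changing the surrounding bookkeeping.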