This PR extends evaluation metric support with a column-based version, consolidates multiple actual outputs into a single trace, and updates the README to reflect the new evaluation process. It also removes outdated evaluation code.
Screenshots