Add optional logging of text output to EvalOutputLogging

mosaicml / llm-foundry

LLM training code for Databricks foundation models

https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm

Apache License 2.0

Add optional logging of text output to EvalOutputLogging #1283

Closed sjawhar closed 2 days ago

sjawhar commented 2 weeks ago

Adds the ability to log text output using EvalOutputLogging in non-metrics/ICL use cases. The text appears under a new outputs key in the logged dictionary. Its value is taken from state.outputs and is de-tokenized when the model is a HuggingFaceModel and the dataset has a tokenizer.
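For readers skimming the thread, here is a rough sketch of the behavior described above. It is not the code from this PR: ToyTokenizer, build_logged_columns, and the "input" column are hypothetical stand-ins, and the condition is simplified to "a tokenizer is available" instead of checking for a HuggingFaceModel.

```python
from typing import Any, Optional, Protocol, Sequence


class SupportsBatchDecode(Protocol):
    """Hypothetical minimal tokenizer interface, used only in this sketch."""

    def batch_decode(self, token_ids: Sequence[Sequence[int]]) -> list[str]: ...


class ToyTokenizer:
    """Stand-in tokenizer that maps integer ids to words (illustration only)."""

    def __init__(self, vocab: dict[int, str]) -> None:
        self.vocab = vocab

    def batch_decode(self, token_ids: Sequence[Sequence[int]]) -> list[str]:
        # Join the words for each sequence of ids with single spaces.
        return [" ".join(self.vocab[i] for i in ids) for ids in token_ids]


def build_logged_columns(
    inputs: list[str],
    outputs: Sequence[Any],
    tokenizer: Optional[SupportsBatchDecode] = None,
) -> dict[str, list[Any]]:
    """Return the columns to log, including an `outputs` key.

    Outputs are de-tokenized only when a tokenizer is available
    (assumption: this simplifies the PR's actual condition);
    otherwise they are logged unchanged.
    """
    logged: dict[str, list[Any]] = {"input": list(inputs)}
    if tokenizer is not None:
        logged["outputs"] = tokenizer.batch_decode(outputs)
    else:
        logged["outputs"] = list(outputs)
    return logged


tokenizer = ToyTokenizer({0: "hello", 1: "world", 2: "again"})
decoded = build_logged_columns(["prompt a", "prompt b"], [[0, 1], [0, 2]], tokenizer)
raw = build_logged_columns(["prompt a"], [[0, 1]])
```

With a tokenizer, the outputs column holds readable text; without one, the raw values are logged as-is, which matches the fallback implied by the description.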

mvpatel2000 commented 2 weeks ago

@maxisawesome can you please take a look?

CC: @dakinggg

sjawhar commented 2 weeks ago

YES!! It's finally working! The logging, that is. Not my fine-tuning, that's still garbage. But I can finally start debugging!

maxisawesome commented 2 weeks ago

Looks great other than the one comment I left! Thanks for fixing this for non-generative evals.

mvpatel2000 commented 2 weeks ago

@sjawhar if you can fix lint + unit tests (the CPU ones) that would be awesome! then we can merge

sjawhar commented 1 week ago

Anything else I can do to help get this merged?

dakinggg commented 1 week ago

@sjawhar sorry about that, will take a look this week!

sjawhar commented 3 days ago

> LGTM, please update the PR title and description to reflect the changes in this PR. Thank you!

Done

dakinggg commented 3 days ago

@sjawhar looks like the tests failed with the recent change