yujonglee / eval

Evaluate your LLM apps, RAG pipeline, any generated text, and more!
MIT License

Make sure our built-in evaluator works properly #102

Closed yujonglee closed 1 year ago

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 75.00% and project coverage change: +0.70% :tada:

Comparison is base (e1e3942) 82.75% compared to head (810b7dc) 83.45%. Report is 12 commits behind head on main.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #102      +/-   ##
==========================================
+ Coverage   82.75%   83.45%   +0.70%
==========================================
  Files          30       29       -1
  Lines         690      671      -19
==========================================
- Hits          571      560      -11
+ Misses        119      111       -8
```

| [Files Changed](https://app.codecov.io/gh/fastrepl/fastrepl/pull/102?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=fastrepl) | Coverage Δ | |
|---|---|---|
| [fastrepl/pytest\_plugin.py](https://app.codecov.io/gh/fastrepl/fastrepl/pull/102?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=fastrepl#diff-ZmFzdHJlcGwvcHl0ZXN0X3BsdWdpbi5weQ==) | `80.00% <50.00%> (+3.07%)` | :arrow_up: |
| [fastrepl/llm.py](https://app.codecov.io/gh/fastrepl/fastrepl/pull/102?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=fastrepl#diff-ZmFzdHJlcGwvbGxtLnB5) | `88.88% <100.00%> (+6.19%)` | :arrow_up: |

... and [2 files with indirect coverage changes](https://app.codecov.io/gh/fastrepl/fastrepl/pull/102/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=fastrepl)

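The coverage percentages in the Codecov report follow from the hit and line counts in its diff. A minimal sketch of that arithmetic, assuming coverage is hits divided by lines and truncated (not rounded) to two decimals, which is the mode that reproduces the 83.45% head figure. The function name `coverage_pct` is hypothetical and is not part of Codecov or fastrepl.

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Coverage percentage, truncated to two decimals (assumed rounding mode)."""
    # Integer arithmetic avoids floating-point error before truncation.
    return (hits * 10_000 // lines) / 100


base = coverage_pct(571, 690)  # base (e1e3942): 571 hits over 690 lines
head = coverage_pct(560, 671)  # head (810b7dc): 560 hits over 671 lines
change = round(head - base, 2)

print(f"base={base:.2f}% head={head:.2f}% change={change:+.2f}%")
# prints: base=82.75% head=83.45% change=+0.70%
```

Rounding to nearest instead of truncating would give 83.46% for head, so the truncation assumption matters for matching the report exactly.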

yujonglee commented 1 year ago

Will fix the GitHub App and come back.

fastrepl[bot] commented 1 year ago

| EVAL | MODEL | ACCURACY | MAE | MSE |
|---|---|---|---|---|
| LLMGradingHead | gpt-3.5-turbo | 0.33 | 0.93 | 1.53 |
| LLMGradingHead | togethercomputer/llama-2-70b-chat | 0.63 | 0.37 | 0.37 |

https://app.fastrepl.com/run/03af416e828a4ad89516369b7e5e23b1
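For readers comparing the ACCURACY, MAE, and MSE columns in the bot's results, here is a minimal sketch of how such metrics are commonly defined for integer grades. The thread does not show how fastrepl computes them, so the function `grading_metrics` and the toy grades below are hypothetical and for illustration only.

```python
from typing import Dict, Sequence


def grading_metrics(predicted: Sequence[int], reference: Sequence[int]) -> Dict[str, float]:
    """Exact-match accuracy, mean absolute error, and mean squared error."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - r for p, r in zip(predicted, reference)]
    n = len(errors)
    return {
        "accuracy": sum(e == 0 for e in errors) / n,  # share of exact matches
        "mae": sum(abs(e) for e in errors) / n,       # average size of a miss
        "mse": sum(e * e for e in errors) / n,        # penalizes large misses more
    }


# Hypothetical toy grades, not taken from the run linked above.
reference = [1, 2, 3, 4]
predicted = [1, 2, 4, 2]
metrics = grading_metrics(predicted, reference)
print(metrics)  # {'accuracy': 0.5, 'mae': 0.75, 'mse': 1.25}
```

Under these definitions, MAE equals MSE exactly when every miss is off by one grade, and both then equal 1 minus accuracy. The llama-2-70b-chat row (0.63, 0.37, 0.37) is consistent with that pattern, while the gpt-3.5-turbo row (MSE well above MAE) suggests some larger misses.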

krrishdholakia commented 1 year ago

Hey @yujonglee, how is cost part of the eval?

yujonglee commented 1 year ago

@krrishdholakia I added it to the proxy but haven't added it to the github-app yet, so it's not included in the comment :)

fastrepl[bot] commented 1 year ago

Details

Log: https://app.fastrepl.com/run/b108fcb5249f47e1bb986e37528875d9
Cost: {}

fastrepl[bot] commented 1 year ago

Details

Log: https://app.fastrepl.com/run/e6f33ca12827484f9ebe3ff66503b0c5
Cost: {}