truera / trulens

Evaluation and Tracking for LLM Experiments
https://www.trulens.org/
MIT License

TruLens Docs (Feedback Functions) Graphic Improvement Suggestion #1198

Closed DrewGalbraith closed 1 week ago

DrewGalbraith commented 3 months ago

I'm not sure if this is the right place to suggest enhancements for the TruLens docs, but while this graphic is already helpful in explaining the surrounding text, it could be stronger.

Suggestion

The graphic would be clearer if the 'Meaningful' axis were a finite scale. It should range from the least meaningful evaluation (random binary classification, i.e. hallucination/not hallucination) to the most meaningful possible evaluation (a representative ground truth), with everything else placed approximately to scale in between. Additionally, the upper bound of 'Scalable' would likely be that same random classification; I don't see an obvious lower bound for this axis. The placement of points along the axes wouldn't feel so arbitrary if this finite scaling were worked into the graphic.

Disclaimer/Justification

I understand that the most important part of the graphic is the general relationship of the points to one another, rather than their placement relative to the scales. I'd push back on that in two ways:

  1. Given that the points aren't equidistant, especially along 'Scalable', there does seem to be an effort to communicate the size of each disparity rather than just its general direction.
  2. Having 'Ground Truth Evals' sit 2/3 of the way along the image is unclear: it suggests there are even more informative/meaningful evaluations for a use case that aren't listed on this graphic. There aren't, though, assuming the ground truth is composed correctly.

sfc-gh-jreini commented 3 months ago

Thanks for the feedback, @DrewGalbraith. We'll take this into consideration.