langchain-ai / langsmith-sdk

LangSmith Client SDK Implementations
https://docs.smith.langchain.com/
MIT License

LangSmith Evaluation: Custom LLM Evaluator Not Showing Scores on LangSmith #351

Closed: daniellefranca96 closed this issue 9 months ago

daniellefranca96 commented 9 months ago

Issue you'd like to raise.

When I try to customize the LLM running the evaluation, the test runs without failing, but the scores are not saved in LangSmith the way they normally are when I run with GPT-4. How do I fix this or get access to these scores?

Colab: https://colab.research.google.com/drive/1J_1u5vNQp9hTm2VQLhd4JTQbci9UeE1J?usp=sharing

Suggestion:

No response

hinthornw commented 9 months ago

It looks like you double-nested the function get_eval_config: the outer function defines the inner one but never returns anything, so calling it yields None rather than the intended evaluation config.

def get_eval_config():
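  # BUG: the outer function only defines this inner function and never returns it,
  # so calling get_eval_config() evaluates to None.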
  def get_eval_config() -> RunEvalConfig:
    """Returns the evaluator for the environment."""
    eval_llm = llm

    return RunEvalConfig(
        evaluators=[
            RunEvalConfig.LabeledScoreString(
                criteria=rag_eval_config._ACCURACY_CRITERION, llm=eval_llm, normalize_by=10.0
            ),
            RunEvalConfig.EmbeddingDistance(),
        ],
        custom_evaluators=[rag_eval_config.FaithfulnessEvaluator(llm=eval_llm)],
    )

should be

def get_eval_config() -> RunEvalConfig:
  """Returns the evaluator for the environment."""
  eval_llm = llm

  return RunEvalConfig(
      evaluators=[
          RunEvalConfig.LabeledScoreString(
              criteria=rag_eval_config._ACCURACY_CRITERION, llm=eval_llm, normalize_by=10.0
          ),
          RunEvalConfig.EmbeddingDistance(),
      ],
      custom_evaluators=[rag_eval_config.FaithfulnessEvaluator(llm=eval_llm)],
  )
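
As an aside (not from the original thread, just a sketch): once get_eval_config() actually returns the config, it is typically passed as the evaluation argument to run_on_dataset so that the evaluator feedback is logged against the test project in LangSmith. The dataset name and chain factory below are placeholders, and the exact run_on_dataset signature can vary between langchain versions:

from langsmith import Client
from langchain.smith import run_on_dataset  # lives alongside RunEvalConfig in langchain.smith

client = Client()

def my_chain_factory():
    # Placeholder: construct and return whatever chain/LLM the Colab is evaluating.
    ...

run_on_dataset(
    client=client,
    dataset_name="my-eval-dataset",         # placeholder dataset name
    llm_or_chain_factory=my_chain_factory,
    evaluation=get_eval_config(),           # now returns a RunEvalConfig instead of None
)

With the config returned correctly, the feedback from LabeledScoreString, EmbeddingDistance, and the custom FaithfulnessEvaluator should show up as scores on the test project in the LangSmith UI.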
daniellefranca96 commented 9 months ago

Thanks a lot for this, I would never have seen it, or it would have taken me ages!!

hinthornw commented 9 months ago

Happy to help! Going to close this one as solved.