LangChain Integration for Flow Judge

Summary

This PR introduces an integration between Flow Judge and LangChain, allowing users to leverage Flow Judge's custom metrics within LangChain workflows.

Key Changes

Created the FlowJudgeLangChainEvaluator class in the integrations folder:
- Extends LangChain's StringEvaluator
- Enables use of Flow Judge metrics in LangChain pipelines
Added an example notebook:
- Demonstrates usage of Flow Judge integration within LangChain
- Compares Flow Judge custom metrics with LangChain's built-in evaluators
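To illustrate the adapter pattern described above, here is a minimal sketch, not the code from this PR. It assumes the evaluator's job is to map an (input, prediction) pair onto a judge call and return a dict with a score and reasoning. `StringEvaluatorStandIn`, `JudgeResult`, `toy_judge`, and `JudgeBackedEvaluator` are hypothetical names used only so the example runs without LangChain or Flow Judge installed; in the real integration the base class would be LangChain's StringEvaluator and the judge would be a Flow Judge instance.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


class StringEvaluatorStandIn:
    """Hypothetical stand-in with the same general shape as LangChain's StringEvaluator."""

    def evaluate_strings(
        self,
        *,
        prediction: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs: Any,
    ) -> dict:
        # Public entry point: delegate to the subclass hook.
        return self._evaluate_strings(
            prediction=prediction, reference=reference, input=input, **kwargs
        )

    def _evaluate_strings(
        self,
        *,
        prediction: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs: Any,
    ) -> dict:
        raise NotImplementedError


@dataclass
class JudgeResult:
    """Hypothetical result type: free-text feedback plus a numeric score."""

    feedback: str
    score: int


def toy_judge(query: str, response: str) -> JudgeResult:
    """Placeholder for a real judge call; scores 1 for any non-empty response."""
    ok = bool(response.strip())
    feedback = "response is non-empty" if ok else "response is empty"
    return JudgeResult(feedback=feedback, score=int(ok))


class JudgeBackedEvaluator(StringEvaluatorStandIn):
    """Adapter: exposes a judge callable through the string-evaluator interface."""

    def __init__(self, judge: Callable[[str, str], JudgeResult]) -> None:
        self.judge = judge

    def _evaluate_strings(
        self,
        *,
        prediction: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs: Any,
    ) -> dict:
        # The reference is ignored in this sketch; a real metric may need it.
        result = self.judge(input or "", prediction)
        return {"score": result.score, "reasoning": result.feedback}


evaluator = JudgeBackedEvaluator(toy_judge)
result = evaluator.evaluate_strings(
    prediction="Paris", input="What is the capital of France?"
)
empty_result = evaluator.evaluate_strings(prediction="   ", input="Anything?")
```

Keeping the judge behind a plain callable means the adapter does not need to know which model or metric produced the score, which is one plausible way to support several model backends with a single evaluator class.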

Testing

The integration has been manually tested:

On Ubuntu 22.04.3 LTS (Linux)
Verified functionality of FlowJudgeLangChainEvaluator with the models Flow-Judge-v0.1-AWQ, Flow-Judge-v0.1, Flow-Judge-v0.1_HF, and Flow-Judge-v0.1-AWQ-Async. Note: async functionality is not demonstrated in the notebook, but it is part of the standard StringEvaluator class, so I added the option as well.
Verified functionality of FlowJudgeLangChainEvaluator with a custom metric (in the notebook) and with built-in metrics (metric=RESPONSE_CORRECTNESS_BINARY, model=model)
Tested integration in example notebook, ensuring correct behaviour within LangChain workflows
Manually compared results against those of LangChain's native evaluators to confirm accuracy and consistency
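Since the async path is not shown in the notebook, the following sketch shows one way an async evaluation could be exercised. It is an assumption, not the PR's implementation: `evaluate_sync` and `evaluate_async` are hypothetical helpers, and the blocking call is offloaded with `asyncio.to_thread` (Python 3.9+), which may differ from how the integration or the async model backend actually handles concurrency.

```python
import asyncio
from typing import Optional


def evaluate_sync(prediction: str, input: Optional[str] = None) -> dict:
    """Hypothetical blocking evaluation; stands in for a real judge call."""
    ok = bool(prediction.strip())
    return {"score": int(ok), "reasoning": "non-empty" if ok else "empty"}


async def evaluate_async(prediction: str, input: Optional[str] = None) -> dict:
    """Run the blocking evaluation in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(evaluate_sync, prediction, input)


async def main() -> list:
    # Evaluate several predictions concurrently; gather preserves argument order.
    return await asyncio.gather(
        evaluate_async("Paris", "What is the capital of France?"),
        evaluate_async("", "What is the capital of Spain?"),
    )


results = asyncio.run(main())
```

A check like this only confirms that the async entry point returns the same shape of result as the sync one; it does not measure throughput.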

flowaicom / flow-judge

langchain tutorial #19
