issues
citadel-ai/langcheck
Simple, Pythonic building blocks to evaluate LLM applications.
https://langcheck.readthedocs.io/en/latest/index.html
MIT License
186 stars
17 forks
[nit] Remove yapf
#167
liwii
closed
18 hours ago
0
Async call of OpenAISimilarityScorer is slower than sync version
#166
Vela-zz
opened
2 days ago
0
Bump version to 0.8.0
#165
liwii
closed
2 weeks ago
0
Fix issues in docs
#164
liwii
closed
2 weeks ago
0
Drop Python 3.8 support
#163
liwii
closed
2 weeks ago
2
Use `EvalClient` in `langcheck.augment.rephrase`
#162
taniokay
closed
3 weeks ago
3
Add [python] to format setting
#161
taniokay
closed
1 month ago
0
Async call of OpenAISimilarityScorer is slower than sync version
#160
taniokay
opened
1 month ago
3
Support Async API for embedding based metrics
#159
taniokay
closed
1 month ago
8
Fix typo
#158
taniokay
closed
1 month ago
0
Update the interface of `langcheck.augment.rephrase`
#157
liwii
closed
3 weeks ago
0
Support async OpenAI clients for embedding-based metrics
#156
liwii
closed
1 month ago
0
Add no-local-llm to avoid vllm installation
#155
taniokay
closed
1 month ago
1
Add nltk download in langcheck.augment.synonym
#154
taniokay
closed
1 month ago
2
langcheck.augment.synonym doesn't work because of some missing nltk package
#153
liwii
closed
1 month ago
1
Support other types of parameters
#152
liwii
closed
1 month ago
0
Review of "Refactor metric inputs"
#151
yosukehigashi
closed
2 months ago
0
Fix a typo in the augmentation template
#150
liwii
closed
2 months ago
0
Let users access the prompt & score_map objects of the built-in eval client metrics
#149
liwii
opened
2 months ago
0
Implement Simulated Annotators for estimating confidence scores for pairwise comparison
#148
conan1024hao
closed
2 months ago
9
Refactor metric inputs
#147
liwii
closed
2 months ago
2
Bump version to 0.8.0.dev6
#146
yosukehigashi
closed
2 months ago
0
Update docs to reflect new metric structure
#145
yosukehigashi
closed
1 month ago
6
Versioning of eval prompts
#144
yosukehigashi
closed
2 months ago
6
Update nltk to 3.9
#143
yosukehigashi
closed
2 months ago
0
Upgrade ruff to v0.6
#142
yosukehigashi
closed
2 months ago
0
Remove benchmarking dir
#141
yosukehigashi
closed
2 months ago
0
Upgrade `ruff` to 0.6
#140
yosukehigashi
closed
2 months ago
0
Fix typo in documentation
#139
kennysong
closed
3 months ago
0
Add safety related built-in metrics
#138
liwii
closed
2 months ago
2
More Augmentations
#137
liwii
closed
3 months ago
7
Update the toxicity metric [en, ja]
#136
conan1024hao
closed
3 months ago
3
SwallowEvalClient → LlamaEvalClient
#135
conan1024hao
closed
3 months ago
0
Improve Swallow system prompt
#134
conan1024hao
closed
3 months ago
3
Fix the bug that langcheck doesn't work when [local-llm] is not installed
#133
conan1024hao
closed
3 months ago
1
Template based jailbreak augmentation
#132
liwii
closed
3 months ago
1
Fix text encoding
#131
conan1024hao
closed
4 months ago
0
Answer correctness metric
#130
yosukehigashi
closed
4 months ago
1
Custom pairwise evaluator metric
#129
yosukehigashi
closed
4 months ago
0
Implement the Swallow Evaluation Client
#128
conan1024hao
closed
4 months ago
0
Improve the stability of metrics by repeated queries
#127
liwii
opened
4 months ago
0
[WIP] Add custom prompts for the Prometheus model
#126
conan1024hao
closed
4 months ago
0
Introduce Ruff as the formatter
#125
liwii
closed
4 months ago
3
Return `None` if the function calling step returns an invalid assessment
#124
yosukehigashi
closed
5 months ago
0
Handle `None` sources in the pairwise comparison metric
#123
yosukehigashi
closed
5 months ago
0
Add Prometheus Eval Client
#122
conan1024hao
closed
4 months ago
4
Bump version to 0.8.0.dev2
#121
yosukehigashi
closed
5 months ago
2
Custom evaluator metric
#120
yosukehigashi
closed
5 months ago
0
Bump version to 0.8.0.dev1
#119
yosukehigashi
closed
6 months ago
2
Update prompts to output the chain-of-thought reasoning first
#118
yosukehigashi
closed
6 months ago
0