opea-project / GenAIEval

Evaluation, benchmark, and scorecard, targeting performance (throughput and latency), accuracy on popular evaluation harnesses, safety, and hallucination
Apache License 2.0

Restructure genAI Eval to address evaluation of multiple categories of metrics #75

Open Padmaapparao opened 3 months ago

Padmaapparao commented 3 months ago

Restructure genAI_Eval to categorize the various evaluation criteria. We need a separate repo for each category so it is easier for users to find what they are looking for.

Example: all code relating to performance, along with BKM scripts, goes into the perf section.

Proposed categories:

- Performance
- Trustworthiness
- Scalability
- Safety
- Security
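As a rough illustration of the proposed split, the categories above could map onto a top-level directory (or per-repo) skeleton like the following. All names here are hypothetical sketches for discussion, not an agreed layout:

```shell
# Hypothetical skeleton for the proposed restructure
# (directory names are illustrative only, not confirmed by this issue).
mkdir -p evals/performance/scripts   # perf benchmarks and BKM scripts
mkdir -p evals/trustworthiness       # hallucination / accuracy harnesses
mkdir -p evals/scalability           # scale-out benchmark configs
mkdir -p evals/safety                # safety evaluations
mkdir -p evals/security              # security checks

ls evals
```

If separate repos are preferred over directories, the same category names could become repo names under the opea-project org instead.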

Padmaapparao commented 3 months ago

@hshen14 Please prioritize this request