-
Isotope Version 2.9.0
For sales statistics
![image](https://github.com/user-attachments/assets/a1b980a3-94cb-415a-a657-d8be67018436)
but at Sales summary
![image](https://github.com/user-attac…
-
Could you please share the evaluation scripts and prompts that were used to generate the reported results in the paper?
Various parameters are involved in generating outputs, and it is crucial to …
-
Trying to reproduce the evaluation numbers, but not able to.
Example: for gemma-2-9b, the technical report mentions 68.2 on BBH 3-shot CoT, while the Open LLM [leaderboard](https://huggingface.co/spaces/ope…
-
Hi, have you evaluated the model using only GPT3.5/Claude without HBR? This is important for the research community to compare against your work.
-
Hello! Thanks so much for fixing the bugs in the Llava-uhd repo. I was wondering if anyone was able to reproduce the evaluation results that Llava-uhd reported in their paper. I also noticed that my pretraining c…
-
Here's an example of a single site evaluation of rainfall-driven runoff events: https://github.com/jarq6c/little_hope/blob/main/teehr-events/single_site.ipynb
The goal of this evaluation was to iso…
-
Armory provides little information to the console while executing evaluations. Adding `INFO`-level logging at major steps, such as model and dataset loading as well as chain evaluation, would make it ea…
-
To generate a PheWAS plot, we need to run https://github.com/EngreitzLab/gene_network_evaluation/blob/main/src/plotting/plot_gwas_enrichment.py#L10.
To avoid extra computation in the dashboard, we …
-
This ADR will encompass all of the lessons learned and decisions made on how we will handle RAG-focused LLM evaluations.
These issues are encompassed in the [RAG Evaluations MVP Epic](https://githu…