-
I was wondering how the trained models are intended to be evaluated. I don't believe that the paper states how many samples were used to compute the metrics. The code appears to give some indication b…
-
### Describe the feature
Dear OpenCompass Team,
I've encountered a challenge with OpenCompass when trying to evaluate a custom model that I developed. Currently, it seems that any action I want to…
-
Excellent work! I'm writing to inquire about the possibility of adding support for multi-GPU evaluation to your evaluation framework. Currently, it seems that the existing evaluations are only designed…
-
Hi! We tried evaluating the base models using the starting kit evaluation pipeline. Here are some points/issues:
1. For phi2 and llama models, we are getting 'prediction not found' error.
2. Could …
-
There appears to be an issue with the `state-spaces/transformerpp-2.7b` model (in the `mamba` family of models) which causes a problem when generating (`Running generate_until requests`). This doesn't…
-
# Training, evaluating, and interpreting topic models | Julia Silge
At the beginning of this year, I wrote a blog post about how to get started with the stm and tidytext packages for topic modeling. …
-
There are now so many models on HF that it would be useful to understand how they perform on specific tasks or languages.
Lately I have been trying to use https://github.com/EleutherAI/lm-evaluation…
-
Hi! I've started working on my own QG algorithm for my Master's thesis, and I'm trying to learn how to evaluate a model.
Since you've posted your metrics, I've been trying to replicate them, but I a…
-
**Summary**
We used ChemSampler for a first round of generation of dCA analogues, which were evaluated by docking scores only. We still want greater compound diversity, so we have been working …
-
I have trained a model using supervised contrastive learning. I saved the model with:
`l2v.save('/llm2vec_models/final_merged_model', merge_before_save=True, save_config=True)`
Now when I try to run m…