-
I was wondering if you still have the dataset you used to create the deep learning graphs in your paper. I think these datasets can be a very interesting benchmark. The deep offline RL space is curren…
-
Hi,
I was wondering whether anyone has retrained OpenFold on a small dataset, e.g. CATH (4.2), to see how well or badly it does at small scale. I am curious for benchmarking -- to compare a new folding me…
-
**Describe the problem**
Currently, adding new query or task types to the code requires manual updates in multiple places. For example, the following code snippet illustrates that the `query_type` fi…
-
Hi,
Thank you for sharing your work! I have a few questions regarding the grid search in the VTAB-1k benchmark and would greatly appreciate it if you could provide more details:
1. Did you use a…
-
In a recent study (https://dl.acm.org/doi/abs/10.1145/3597312) I noticed that the differences between the top-N (N = 15 or more) algorithms on most datasets are insignificant. They only differ on a …
-
Hi,
I need to download datasets such as "YFCC-10M + CLIP" to compare different vector databases.
Where can I find the download links?
-
I am trying to evaluate GritLM-7B on MTEB datasets using the provided script.
```
#!/bin/bash
python /home/e/e1347696/unified_encoder_decoder/src/eval/MTEB/eval_mteb.py \
--model_name_or_pat…
-
MMA-Diffusion-NSFW-adv-prompts-benchmark: I want to use this dataset for some research, especially to see if it can be used to generate ADV attachments, but my request was rejected on Hugging Face.
-
Hi!
Thanks for your excellent work! Can I ask if you benchmarked on the Mip-360 dataset? If yes, could you please provide your benchmarking results? Thanks in advance!
-
It's unclear how to test datasets other than the random dataset. It would also be nice to have a flag for generating more uniform/biased data in the case of speculative decoding models where we would l…