Closed — dhuynh95, closed 5 days ago
Datasets now uploading in classic parquet format:
- Raw dataset: https://huggingface.co/datasets/BigAction/the-meta-wave-raw
- Pre-processed dataset for retriever evaluation: https://huggingface.co/datasets/BigAction/the-meta-wave-rewritten
- Retrieved dataset for LLM evaluation: https://huggingface.co/datasets/BigAction/the-meta-wave-retrieved
@HiImMadness: The dataset we use for eval, The Wave 250, is broken:
Please always ensure datasets are working.
Also, to make things more reproducible, it would be ideal to upload a separate dataset containing the nodes returned by our best retriever, along with metadata about that retriever. That way we can evaluate an LLM directly without rerunning the retriever. Naturally, each example must also include the ground-truth elements so we can verify the LLM found the right solution.
Something like this would be ideal:
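For illustration, here is a minimal sketch of what one record could look like. All field names and values here are hypothetical, not the actual dataset schema:

```python
# Hypothetical record layout for one evaluation example: the retrieved
# nodes, the retriever metadata needed to reproduce the run, and the
# ground-truth element. Every name below is illustrative.
example = {
    "query": "Find the login button",
    "retriever": {
        # Metadata for reproducibility (assumed fields)
        "name": "best-retriever-v1",
        "embedding_model": "some-embedding-model",
        "top_k": 5,
    },
    "retrieved_nodes": [
        # Nodes returned by the retriever, with scores
        {"html": "<button id='login'>Log in</button>", "score": 0.91},
    ],
    "ground_truth": {
        # Canonical selector for the target element
        "full_xpath": "/html[1]/body[1]/div[2]/button[1]",
    },
}

print(sorted(example))  # ['ground_truth', 'query', 'retrieved_nodes', 'retriever']
```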
Also, optionally, it might be better to provide the Full XPath as the consistent way to select the ground-truth element. I sometimes see different selector types, which can work but may be ambiguous or inconsistent in some scenarios.
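To show why a full XPath is unambiguous, here is a small stdlib-only sketch that computes an indexed absolute XPath for an element (a simplified version of what browser devtools produce; the HTML snippet and function name are just for illustration):

```python
import xml.etree.ElementTree as ET

# Two <a> elements with identical tags: a bare "//a" selector is
# ambiguous, but an indexed full XPath pins down exactly one node.
HTML = "<html><body><div><a>one</a><a>two</a></div></body></html>"

def full_xpath(root, target):
    # Build a child -> parent map, then walk up from the target,
    # recording each tag with its 1-based index among same-tag siblings.
    parent = {child: p for p in root.iter() for child in p}
    parts = []
    node = target
    while node is not None:
        p = parent.get(node)
        if p is None:
            index = 1  # root element
        else:
            siblings = [c for c in p if c.tag == node.tag]
            index = siblings.index(node) + 1
        parts.append(f"{node.tag}[{index}]")
        node = p
    return "/" + "/".join(reversed(parts))

root = ET.fromstring(HTML)
second_link = root.findall(".//a")[1]
print(full_xpath(root, second_link))  # /html[1]/body[1]/div[1]/a[2]
```

The `[n]` index at every step is what removes the ambiguity: two sibling `<a>` tags get distinct paths, whereas class- or text-based selectors can match several nodes at once.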
Todo: