You can specify a needle-in-a-haystack (NIH) dataset (gittables) along with other eval datasets.
An error is raised if only the needle-in-a-haystack dataset is included and no other datasets.
Input examples:
You can do TARGET(("Table Retrieval Task", ["fetaqa", "gittables"]))
but CANNOT do TARGET(("Table Retrieval Task", ["gittables"]))
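The rule above can be sketched as a small validation helper. This is only an illustration, not the code in this PR: the `NIH_DATASETS` set, the `validate_dataset_names` function, and the error messages are all hypothetical names chosen for the example.

```python
# Hypothetical sketch: not the actual target_benchmark implementation.
NIH_DATASETS = {"gittables"}  # assumed set of needle-in-a-haystack dataset names


def validate_dataset_names(dataset_names):
    """Reject a selection made up only of needle-in-a-haystack datasets."""
    names = list(dataset_names)
    if not names:
        raise ValueError("at least one dataset is required")
    # NIH datasets only add distractor tables, so at least one regular
    # eval dataset has to be present alongside them.
    if all(name in NIH_DATASETS for name in names):
        raise ValueError(
            "needle-in-a-haystack datasets must be combined with at least "
            "one other eval dataset, got: " + ", ".join(names)
        )
    return names
```

With this sketch, `["fetaqa", "gittables"]` passes through unchanged, while `["gittables"]` alone raises a `ValueError`.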
You can also specify how many tables you need from the NIH dataset by setting the num_tables field. For example:
from target_benchmark.dataset_loaders.TargetDatasetConfig import (
    DEFAULT_FETAQA_DATASET_CONFIG,
    DEFAULT_GITTABLES_DATASET_CONFIG,
)
from target_benchmark.evaluators import TARGET
from target_benchmark.tasks import TableRetrievalTask
- I've added a new issue [here](https://github.com/target-benchmark/target/issues/33#issue-2639650516) to track some problems I currently have with NIH, but I want to get this PR done first and improve it in the future.
Create Needle in Haystack
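Example of limiting the number of GitTables tables with the num_tables field:
# Copy the default config so the shared default is left unchanged.
gittables_config = DEFAULT_GITTABLES_DATASET_CONFIG.model_copy()
# num_tables is the number of tables you want; define it before this line.
gittables_config.num_tables = num_tables
TARGET(
    TableRetrievalTask(
        {
            "fetaqa": DEFAULT_FETAQA_DATASET_CONFIG,
            "gittables": gittables_config,
        }
    )
)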
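The example copies the default config before setting num_tables. `model_copy()` appears to be Pydantic's copy method, so the point is presumably to avoid mutating the shared default. Below is a stdlib-only sketch of the same idea, using a hypothetical `DatasetConfig` dataclass as a stand-in for the real config model. The field names other than `num_tables`, and the meaning of `None`, are assumptions.

```python
import copy
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetConfig:  # hypothetical stand-in for the real config model
    dataset_name: str
    num_tables: Optional[int] = None  # None is assumed to mean "use all tables"


DEFAULT_CONFIG = DatasetConfig(dataset_name="gittables")

# Copy first, then override: the module-level default stays untouched,
# so other tasks that reuse DEFAULT_CONFIG are not affected.
limited = copy.copy(DEFAULT_CONFIG)
limited.num_tables = 500  # arbitrary example value
```

Assigning `num_tables` directly on the default object would instead change it for every later user of that default in the same process.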