Open lhoestq opened 9 months ago
Hi Quentin, that's a good idea! We are on it, and will let you know once we've done this.
Cool ! Let me know if you have questions or if I can help
Hi Quentin, I uploaded our dataset here and modified the yaml to display the different configs as described here. I was trying to show three different configs for the main data, the lfqa_random data and the lfqa_domain data. But the dataset viewer seems to not show these configs and their corresponding splits this way. Any chance you know what I could be missing? Thanks a lot!
I just opened a PR to fix a small issue with the YAML :) https://huggingface.co/datasets/cmalaviya/expertqa/discussions/1
Thanks, looks good now!! It would be nice if the main subset could also be previewed; I currently see an Error code: UnexpectedError. Let me know if I need to fix something.
I'm getting this error somehow:
pyarrow.lib.ArrowInvalid: JSON parse error: Column(/answers/post_hoc_gs_gpt4/claims/[]/revised_evidence) changed from string to array in row 0
It looks like a field is sometimes a string and sometimes an array in the JSON data. However, the dataset viewer only supports a fixed type per field. Is this an error in the data file, or is it expected?
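For anyone hitting a similar error, here is a minimal sketch for finding fields whose JSON type changes between records before uploading. It uses only the standard library and does not reproduce what the dataset viewer or pyarrow does internally; the helper names (`json_type`, `collect_types`, `find_mixed_type_paths`) and the two sample records are made up for illustration.

```python
import json
from collections import defaultdict


def json_type(value):
    """Return a coarse JSON type name for a decoded value."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def collect_types(value, path, seen):
    """Record the type observed at `path`, then recurse into containers."""
    seen[path].add(json_type(value))
    if isinstance(value, dict):
        for key, child in value.items():
            collect_types(child, f"{path}/{key}", seen)
    elif isinstance(value, list):
        for child in value:
            collect_types(child, f"{path}/[]", seen)


def find_mixed_type_paths(records):
    """Return {path: sorted type names} for paths seen with more than one non-null type."""
    seen = defaultdict(set)
    for record in records:
        collect_types(record, "", seen)
    return {
        path: sorted(types - {"null"})
        for path, types in seen.items()
        if len(types - {"null"}) > 1
    }


# Two made-up records that mimic the mismatch described above.
sample_lines = [
    '{"answers": {"post_hoc_gs_gpt4": {"claims": [{"revised_evidence": "some text"}]}}}',
    '{"answers": {"post_hoc_gs_gpt4": {"claims": [{"revised_evidence": []}]}}}',
]
records = [json.loads(line) for line in sample_lines]
mixed = find_mixed_type_paths(records)
print(mixed)
```

Null values are ignored when comparing types, on the assumption that a nullable field is acceptable; drop that exclusion if nulls should be flagged too.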
Ah, that's because when the revised_evidence field is empty, it was stored as an empty list, whereas it is otherwise always a string.
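A rough sketch of one way to make the field consistent, assuming the nesting implied by the error path (`answers` maps a key to an object holding a `claims` list) and assuming an empty list should become an empty string. `normalize_revised_evidence` is a hypothetical helper, not the fix that was actually applied to the data file.

```python
def normalize_revised_evidence(record):
    """Make every `revised_evidence` value a string, editing the record in place."""
    answers = record.get("answers", {})
    for answer in answers.values():
        if not isinstance(answer, dict):
            continue  # skip anything that does not look like an answer object
        for claim in answer.get("claims", []):
            evidence = claim.get("revised_evidence")
            if isinstance(evidence, list):
                # Assumption: an empty list means "no revised evidence",
                # so it becomes ""; non-empty lists are joined line by line.
                claim["revised_evidence"] = "\n".join(str(item) for item in evidence)
    return record


# Made-up record: one empty-list value and one value that is already a string.
record = {
    "answers": {
        "post_hoc_gs_gpt4": {
            "claims": [
                {"revised_evidence": []},
                {"revised_evidence": "kept as is"},
            ]
        }
    }
}
fixed = normalize_revised_evidence(record)
claims = fixed["answers"]["post_hoc_gs_gpt4"]["claims"]
print([claim["revised_evidence"] for claim in claims])
```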
I fixed this in an updated file, but there is still an Unexpected error. Let me know if the error is something different. Also I wonder if I can test with the parquet converter myself. Thanks in any case!
It seems that some examples have the gpt4 field but others don't
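To see which examples are affected, a small check along these lines could help. It assumes the field in question is a key under `answers`, which the thread does not confirm; `missing_answer_keys` and the sample keys (`gpt4`, `other_model`) are hypothetical.

```python
def missing_answer_keys(records):
    """Map each record index to the sorted `answers` keys it lacks, relative to the union of keys."""
    all_keys = set()
    for record in records:
        all_keys.update(record.get("answers", {}))

    missing = {}
    for index, record in enumerate(records):
        absent = all_keys - set(record.get("answers", {}))
        if absent:
            missing[index] = sorted(absent)
    return missing


# Made-up records: the second one has no "gpt4" entry.
sample_records = [
    {"answers": {"gpt4": {"claims": []}, "other_model": {"claims": []}}},
    {"answers": {"other_model": {"claims": []}}},
]
report = missing_answer_keys(sample_records)
print(report)
```

An empty result means every example carries the same set of keys.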
Hi ! I’m Quentin from HF :)
Thanks for sharing the dataset, I believe it will be used a lot to evaluate LLMs! Especially since factual correctness and attributions are imo at the heart of many challenges nowadays.
I was wondering if you planned to share the dataset on Hugging Face ? This way researchers can load it in one line of python, and there is also a nice dataset viewer on the website to visualize the data.