hyungkwonko / chart-llm

Vega-Lite Chart Dataset and NL Generation Framework using LLMs
https://hyungkwonko.info/chart-llm/
MIT License
102 stars, 8 forks

Where is the dataset with the generated NL prompts to charts? #2

Closed by cpcdoy 1 year ago

cpcdoy commented 1 year ago

Hi,

Awesome work on this. I looked through the repo quite a bit but couldn't find the generated NL prompts, questions, etc. for the 1981 charts that you mention in the paper. I only found the description field, which is sometimes filled in within the charts, and some example runs.

Could you direct me to the final dataset?

Thanks!

hyungkwonko commented 1 year ago

Hi @cpcdoy, thanks for your interest in our work.

We do not provide the generated NL datasets; we only describe how to generate them from Vega-Lite specs.

However, you can find sample results for 48 charts selected via stratified sampling: https://github.com/hyungkwonko/chart-llm/blob/main/exp/gold/result/gold.csv

You can generate the L1/L2 captions, utterances, and questions yourself using the Vega-Lite specs and our prompting technique: https://github.com/hyungkwonko/chart-llm#example-nl-dataset-generation
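As a rough illustration of the idea (not the repository's actual prompts or code), the sketch below shows how a Vega-Lite spec could be embedded in a text prompt for each kind of NL output. The `TASKS` wording, the `build_prompt` helper, and the example spec are all hypothetical; see the README link above for the real procedure. The resulting string would then be sent to whichever LLM you use.

```python
import json

# Hypothetical task instructions; the repository's real prompt wording may differ.
TASKS = {
    "l1_caption": "Write a caption describing the chart type, encodings, and axes.",
    "l2_caption": "Write a caption describing statistics and trends in the data.",
    "utterance": "Write an instruction a user might give to create this chart.",
    "question": "Write a question that can be answered by reading this chart.",
}


def build_prompt(spec: dict, task: str) -> str:
    """Combine a task instruction with a serialized Vega-Lite spec."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    spec_json = json.dumps(spec, indent=2)
    return f"{TASKS[task]}\n\nVega-Lite specification:\n{spec_json}\n"


# Toy spec for illustration only; not taken from the dataset.
spec = {
    "mark": "bar",
    "encoding": {
        "x": {"field": "category", "type": "nominal"},
        "y": {"field": "amount", "type": "quantitative"},
    },
}

prompt = build_prompt(spec, "l1_caption")
print(prompt)
```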

The main reason we only tested it on 48 samples is that it costs a lot.

Let me know if you have any further questions.