Open veronica320 opened 1 year ago
Hi, the v1.3 generation tools are provided in the develop branch. We currently do not provide the generated data itself, only the tools to generate it. If you require a fixed dataset for reproducibility, though, I could perhaps re-use our Hugging Face dataset location and create a new v1.3 dataset in the org. Since CLUTRR supports many configurations, it would be good if you could let me know in this issue which ones you would prefer first, and I'd be happy to generate them.
Thanks for the quick reply! Would it be possible for you to share the cleaned version of the following configuration from your Google Drive?
data_089907f8
- Train: k=2,3, Test: k=2,3,4,5,6,7,8,9,10
I'm a bit busy with work, but I'll try to share the new data through Hugging Face next week!
If you need the data immediately, generating it by following the instructions in the develop branch should be a good starting point.
Hi authors, thanks for creating this great dataset! Would it be possible to share the "GPT3 cleaned data: CLUTRR v1.3" as mentioned in this blog post? This would save users the time of generating the data themselves and enable fair comparison of different methods on the same data. Thanks!