-
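Dear authors,
Congratulations on your work being accepted to Findings of ACL 2024!
The dataset preparation part of this work inspired me a lot. While synthesizing my own instruction ft datasets, I found that some steps are not fully specified:
1. In group (1) **Wikipedia + Wikidata5M** of Section 2.1 Graph Caption Generation, I loaded wiki5…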
-
**Scenario**
I would like to be able to search for datasets by bucket name. When I find code that uses the same data I want to use, I don't have any references to data.all registration, but I…
-
## 📝 Scenario
As a **FinOps practitioner**, I need to **ingest data into a queryable data store** in order to **report on data at scale beyond $5M/mo**
## 💎 Solution
Support large datasets (e…
-
I found the code at the link you provided, but not the datasets. Could you share the processed datasets? Thank you!
-
I’m trying to download the ava_6_5000_all and kinetics_6_5000_all reduced datasets. When I use wget, I get a 401 Unauthorized error, and it’s asking for a username and password that weren’t provided.
…
-
Start with the following generation of an example dataset. Expand on it with more variables that include
- open responses (short and long answers, containing special symbols, html, etc. Stuff that c…
-
The package fails _**hard**_ when trying to work with large datasets (~20 million or so elements in my case). Trying to compute the entire distance matrix isn't possible without massive amounts of mem…
-
Hi,
Thanks for sharing the work. Would you mind also releasing the `EgoNaoDataset` class? The `data_preproceessing/datasets/` folder is missing. And it would be helpful if you could also release the …
-
Thanks for sharing your excellent work!
I am a newcomer to the field of multi-label text classification.
I don’t know where to download the train_data.json and test_data.json of Reuters-21578, also …
-
Can you provide the data preprocessing code? Thank you very much.