-
SubhanAllah Brother,
MasaAllah.
First of all, great job.
I wanted to know whether it would be possible to automatically check the authenticity of a hadith.
Also, where is the link to the Sanadset?
…
-
### Feature request
How can we take advantage of https://osu-nlp-group.github.io/Mind2Web/ (dataset at https://huggingface.co/datasets/osunlp/Mind2Web)?
### Motivation
_No response_
-
Code to remove unnecessary punctuation, etc.
Resource:
https://towardsdatascience.com/nlp-in-python-data-cleaning-6313a404a470
- decide which datasets to run methods on
- run code on combined d…
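
A minimal sketch of the kind of cleaning step described above, assuming "unnecessary punctuation" means ASCII punctuation and that lowercasing and whitespace collapsing are also wanted. The function name `clean_text` is hypothetical and is not taken from the linked article or from this repository.

```python
import re
import string

# Translation table that deletes every ASCII punctuation character.
# Assumption: non-ASCII punctuation is left untouched.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_text(text: str) -> str:
    """Lowercase the text, strip ASCII punctuation, and collapse whitespace."""
    text = text.lower().translate(_PUNCT_TABLE)
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(clean_text("Hello, World!  It's   2024..."))
```

Note that deleting the apostrophe turns "It's" into "its"; whether that is acceptable depends on the datasets eventually chosen.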
-
Dataloader name: `okapi_m_truthfulqa/okapi_m_truthfulqa.py`
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?okapi_m_truthfulqa
| Dataset| okapi_m_truthfulqa |
|-------------…
-
We are missing a few datasets for text classification, which is an important field.
Namely, it would be really nice to add:
- [x] TREC-6 dataset (see here for instance: https://pytorchnlp.readthedo…
-
## Adding a Dataset
- **Name:** Add pre-processed data to:
- *wikimedia/wikipedia*: https://huggingface.co/datasets/wikimedia/wikipedia
- *wikimedia/wikisource*: https://huggingface.co/datasets…
-
### Describe the bug
The link provided for the dataset is broken:
data_files =
https://the-eye.eu/public/AI/pile_preliminary_components/PUBMED_title_abstracts_2019_baseline.jsonl.zst
The…
-
1. Performed the first two NLP tasks with Stanford's CoreNLP library.
2. Top-N word count with stemming, case normalization, etc. has been completed.
3. Finding datasets works, though the heuristic is not very good.
4. S…
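
For illustration, the top-N word count with stemming and case normalization could be sketched as below. This is not the CoreNLP-based implementation mentioned above: `toy_stem` is a hypothetical, deliberately crude suffix stripper standing in for a real stemmer, and the tokenizer keeps only ASCII letters.

```python
import re
from collections import Counter
from typing import List, Tuple


def toy_stem(word: str) -> str:
    """Crude suffix stripper; a stand-in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are not mangled.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def top_n_words(text: str, n: int) -> List[Tuple[str, int]]:
    """Return the n most frequent stems after lowercasing the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(toy_stem(token) for token in tokens)
    return counts.most_common(n)


if __name__ == "__main__":
    print(top_n_words("Jump jumps jumped JUMPING cat cats", 2))
```

Lowercasing before counting is what makes "JUMPING" and "jumping" land in the same bucket; swapping `toy_stem` for a proper stemmer would only change the normalization step, not the counting.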
-
## Introduction
I am writing to request a feature implementation for Haystack in Node.js. As a developer working extensively with JavaScript and Node.js environments, I find Haystack's capabilities…
-
Hello, I can't find the file: data_path=/wjn/nlp_task_datasets/kg-pre-trained-corpus/total_pretrain_kgicl_gpt. I'm a bit unclear on this; could you point me in the right direction? Thanks!