AI4Bharat
/
indicnlp_catalog
A collaborative catalog of NLP resources for Indic languages
https://ai4bharat.github.io/indicnlp_catalog
552
stars
79
forks
issues
Punctuation Dataset
#253
haridassaiprakash
opened
2 weeks ago
0
Indian Name Dataset
#252
ramSeraph
opened
1 month ago
1
Update README.md
#251
KaifAhmad1
opened
6 months ago
0
Update README.md
#250
KaifAhmad1
opened
6 months ago
2
SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 14 Languages
#249
anoopkunchukuttan
opened
7 months ago
0
L3Cube-IndicNews: News-based Short Text and Long Document Classification Datasets in Indic Languages
#248
anoopkunchukuttan
opened
8 months ago
0
NoMIRACL: Knowing When You Don't Know for Robust Multilingual Retrieval-Augmented Generation
#247
anoopkunchukuttan
opened
9 months ago
1
L3Cube-MahaSocialNER: A Social Media based Marathi NER Dataset and BERT models
#246
anoopkunchukuttan
opened
9 months ago
0
MassiveSum Summarization dataset
#245
anoopkunchukuttan
closed
9 months ago
0
Chandamama Kathalu
#244
anoopkunchukuttan
opened
9 months ago
0
Large scale Document level corpora
#243
anoopkunchukuttan
opened
9 months ago
0
Multilingual Bias Detection and Mitigation for Indian Languages
#242
anoopkunchukuttan
opened
9 months ago
0
WMT23 QE datasets
#241
anoopkunchukuttan
closed
9 months ago
0
Mukhyansh: A Headline Generation Dataset for Indic Languages
#240
anoopkunchukuttan
opened
9 months ago
0
CoPara: The First Dravidian Paragraph-level n-way Aligned Corpus
#239
anoopkunchukuttan
closed
9 months ago
0
Languages from Northeast India - Sino-Tibetan
#238
maharajbrahma
opened
9 months ago
0
KHASI north east
#237
dame-cell
opened
9 months ago
2
Update README.md
#236
ritwikmishra
closed
9 months ago
0
updated some URLs
#235
stranak
opened
1 year ago
0
REDFM: Relation Extraction
#234
anoopkunchukuttan
opened
1 year ago
0
X-RiSAWOZ: Dialogue dataset
#233
anoopkunchukuttan
opened
1 year ago
0
Vacaspati: A Diverse Corpus of Bangla Literature
#232
anoopkunchukuttan
opened
1 year ago
1
FIX: Readme Update for Catalog Index
#231
suyash-srivastava-dev
opened
1 year ago
0
Hindi-telugu parallel corpus
#230
anoopkunchukuttan
opened
1 year ago
0
HindiMD: Hindi Sentiment Analysis
#229
anoopkunchukuttan
opened
1 year ago
0
Indian Healthcare Query Intent
#228
anoopkunchukuttan
opened
1 year ago
0
Hindi political bias dataset
#227
anoopkunchukuttan
opened
1 year ago
0
Telugu datasets
#226
anoopkunchukuttan
opened
1 year ago
0
TeQuad
#225
anoopkunchukuttan
opened
1 year ago
0
TeSum: Telugu Abstractive Summarization
#224
anoopkunchukuttan
closed
9 months ago
5
Bilingual Tabular Inference: A Case Study on Indic Languages
#223
anoopkunchukuttan
opened
1 year ago
0
Hindi Arithmetic Problems
#222
anoopkunchukuttan
opened
1 year ago
0
UGIF: UI Grounded Instruction Following
#221
anoopkunchukuttan
opened
1 year ago
0
Bangla & summarization Resources from BUET
#220
anoopkunchukuttan
opened
1 year ago
0
L3Cube-IndicSBERT
#219
anoopkunchukuttan
opened
1 year ago
0
Glot500: Scaling multilingual corpora and language models to 500 languages.
#218
anoopkunchukuttan
opened
1 year ago
0
Taxi1500: A Multilingual Dataset for Text Classification in 1500 Languages
#217
anoopkunchukuttan
opened
1 year ago
0
Resource updates from AI4Bharat: May 2023
#216
anoopkunchukuttan
opened
1 year ago
0
PMIndiaSum: Multilingual and Cross-lingual Headline Summarization for Languages in India
#215
anoopkunchukuttan
opened
1 year ago
1
English-Marathi post-edited MT dataset
#214
GokulNC
opened
1 year ago
0
HinGE: A Dataset for Generation and Evaluation of Code-Mixed Hinglish Text
#213
GokulNC
opened
1 year ago
0
Redundant KMI Linguistics Bodo
#212
maharajbrahma
closed
1 year ago
1
Anuvaad Indic-OCR models
#211
GokulNC
opened
1 year ago
0
NTREX -- News Test References for MT Evaluation
#210
GokulNC
opened
1 year ago
0
fix typo in README
#209
vipranarayan14
closed
1 year ago
0
BMM Corpus
#208
niyatibafna
opened
1 year ago
0
GSM8K dataset in Indian languages
#207
anoopkunchukuttan
opened
1 year ago
0
Annotated Speech Corpus for Low Resource Indo-Aryan Languages
#206
GokulNC
opened
2 years ago
0
Could not download IndicFT models
#205
Surya291
opened
2 years ago
0
InLegalBERT/InCaseLawBERT: Pre-trained LMs for Indian law corpus
#204
anoopkunchukuttan
opened
2 years ago
0