AI4Bharat / Indic-BERT-v1

Indic-BERT-v1: BERT-based Multilingual Model for 11 Indic Languages and Indian-English. For the latest Indic-BERT v2, see: https://github.com/AI4Bharat/IndicBERT
https://indicnlp.ai4bharat.org
MIT License

Error during downloading the en-indic dataset #21

Closed by Subhashree-Tripathy 3 years ago

Subhashree-Tripathy commented 3 years ago

[image] I'm getting the above error while trying to download the en-indic dataset.

pranavraikote commented 3 years ago

I'm getting the same error too. Requesting the authors to kindly resolve this. Very excited to try indic-bert. @divkakwani

gowtham1997 commented 3 years ago

Hey guys,

Sorry for this.

We have an issue with GCP bucket links and it'll most likely be resolved next week.

@Subhashree-Tripathy Which dataset were you trying to download? Can you paste the link here? I'll check whether we have a backup.

@pranavraikote To use the indicbert model, can you try loading it from Hugging Face:

from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('ai4bharat/indic-bert')
model = AutoModel.from_pretrained('ai4bharat/indic-bert')

Subhashree-Tripathy commented 3 years ago

I was having an issue with this link, but I checked again and it's working now: https://indicnlp.ai4bharat.org/samanantar/#downloads
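
For anyone hitting the same problem, one way to tell whether a download link is reachable again is to probe it with a HEAD request before starting a large download. This is a minimal sketch using only the Python standard library; `is_reachable` and its `opener` parameter are hypothetical names for illustration and are not part of any AI4Bharat tooling. Some servers reject HEAD requests, so a `False` result is a hint rather than proof that the file is gone.

```python
import urllib.request


def is_reachable(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return True if a HEAD request to `url` ends in a 2xx/3xx response.

    `opener` is injectable (hypothetical parameter) so the check can be
    exercised without touching the network.
    """
    try:
        # HEAD asks for headers only, so nothing large is downloaded.
        request = urllib.request.Request(url, method="HEAD")
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError):
        # URLError, HTTPError (4xx/5xx) and socket timeouts are all
        # OSError subclasses; ValueError covers malformed URLs.
        return False


if __name__ == "__main__":
    link = "https://indicnlp.ai4bharat.org/samanantar/#downloads"
    print(link, "reachable:", is_reachable(link))
```

If the check keeps failing, pasting the exact link in the issue (as requested above) makes it easier for the maintainers to look for a backup.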