keras-team / keras-nlp

Modular Natural Language Processing workflows with Keras
Apache License 2.0

Data-Parallel Training with KerasNLP and tf.distribute example dataset problem #1504

Open sitamgithub-MSIT opened 5 months ago

sitamgithub-MSIT commented 5 months ago

Describe the bug The "Data-Parallel Training with KerasNLP and tf.distribute" example downloads a dataset from a link that now returns 403: Forbidden, with the message "Access Denied."

To Reproduce Provide a link to a Colab Notebook, which reproduces the bug.

Expected behavior The dataset should download properly so that the example can be run, and an error like "Access Denied" should not appear.

Additional context

Would you like to help us fix it?

Yes, if I can help in any way, I will be glad to do so!

SuryanarayanaY commented 5 months ago

Hi @sitamgithub-MSIT ,

I think the link https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-v1.zip is not working, right? Please feel free to fix the issue if you want to contribute. Thanks!
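For anyone who wants to confirm the failure outside the notebook, here is a minimal sketch using only the Python standard library. The helper name `http_status` and its `opener` parameter are hypothetical (introduced here for illustration, not part of KerasNLP or the example); the sketch just sends a HEAD request and reports the status code.

```python
import urllib.error
import urllib.request

# Dataset link from the example, as quoted in this thread.
WIKITEXT_URL = "https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-v1.zip"


def http_status(url, opener=urllib.request.urlopen):
    """Return the HTTP status code of a HEAD request to `url`.

    `opener` is a hypothetical hook (defaults to urllib's urlopen) so the
    function can be exercised without network access.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the code is on the exception.
        return err.code
```

If the link is still broken as described above, `http_status(WIKITEXT_URL)` should come back as 403 rather than 200.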

sitamgithub-MSIT commented 5 months ago

Yeah, the link is not working; it says "Access Denied." Since this is a dataset link issue, I don't think I can do much. It would be better to tag the author; surely we can get a solution then.

sitamgithub-MSIT commented 5 months ago

@shivance Can you please help with this?

SamanehSaadat commented 5 months ago

Hi @sitamgithub-MSIT

Thanks for reporting this issue. As discussed here, it seems that the dataset link doesn't work anymore. I think this dataset is available on Hugging Face too (https://huggingface.co/datasets/wikitext). I'll update the Colab with the Hugging Face dataset. The Hugging Face wikitext-2-v1 dataset seems to be in a different format though, so I'll need to spend some time figuring this out.
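A rough sketch of what the switch could look like, assuming the Hugging Face `datasets` package is installed and that the `wikitext` dataset exposes a `wikitext-2-v1` config with a single `text` column. The helper names `clean_wikitext_lines` and `load_wikitext2_split`, the `min_chars` parameter, and the choice to drop heading lines are all assumptions for illustration, not what the Colab does today.

```python
def clean_wikitext_lines(lines, min_chars=1):
    """Strip whitespace and drop blank lines and ' = Heading = ' style lines.

    Hypothetical helper: the filtering rules are an assumption about what the
    example needs, not the tutorial's actual preprocessing.
    """
    cleaned = []
    for line in lines:
        text = line.strip()
        if len(text) < min_chars:
            continue  # skip empty or too-short rows
        if text.startswith("=") and text.endswith("="):
            continue  # skip section headings such as "= Title ="
        cleaned.append(text)
    return cleaned


def load_wikitext2_split(split="train"):
    """Load one split of wikitext-2-v1 from the Hugging Face Hub as clean lines."""
    # Imported lazily so the helper above works without the `datasets` package.
    from datasets import load_dataset

    dataset = load_dataset("wikitext", "wikitext-2-v1", split=split)
    return clean_wikitext_lines(dataset["text"])
```

The resulting list of strings could then be wrapped with `tf.data.Dataset.from_tensor_slices(...)` before batching, if the rest of the example keeps its current `tf.data` pipeline.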