Closed by aflah02 2 years ago
Nvm it turns out I only needed to replace it like this -
from datasets import load_dataset
TG_data = load_dataset("skg/toxigen-data", name="train", use_auth_token=Actual_Token) # 250k training examples
TG_annotations = load_dataset("skg/toxigen-data", name="annotated", use_auth_token=Actual_Token) # Human study
@wzhings It's a Hugging Face Auth token. You can find how to get one here - https://huggingface.co/docs/hub/security-tokens
@aflah02 Thank you for your reply. I created the auth token, but I still got the following error:
HTTPError: 403 Client Error: Forbidden for url: https://huggingface.co/api/datasets/skg/toxigen-data
I think I need to obtain permission by filling out the form before I can access the data.
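One way to read the error above: HTTP 401 generally means the request carried no valid credentials, whereas 403 means the server understood the request but refuses access. For a gated dataset, a 403 usually suggests that access has not been granted to the account behind the token. A minimal sketch of that distinction follows. `explain_hub_status` is a hypothetical helper, not part of any Hugging Face library, and the messages are an interpretation of the status codes rather than official diagnostics.

```python
def explain_hub_status(status_code: int) -> str:
    """Map an HTTP status code to a likely cause (hypothetical helper)."""
    if status_code == 401:
        return "token missing or invalid: pass a valid token"
    if status_code == 403:
        return "token accepted but access not granted: fill out the access form with the same account"
    if status_code == 404:
        return "repository not found: check the dataset name"
    return "unexpected status: see the full error message"


if __name__ == "__main__":
    # The error in this thread was a 403.
    print(explain_hub_status(403))
```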
Hi @wzhings, are you plugging in the security token when loading the data? Up above, it seems you can do it this way:
Actual_Token = "<YOUR_TOKEN_GOES_HERE>"
TG_data = load_dataset("skg/toxigen-data", name="train", use_auth_token=Actual_Token) # 250k training examples
TG_annotations = load_dataset("skg/toxigen-data", name="annotated", use_auth_token=Actual_Token)
where Actual_Token is the token you got from the security-tokens page.
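To avoid pasting the token into a shared notebook, it could instead be read from an environment variable. This is a minimal sketch, assuming the token has been exported as `HF_TOKEN`. `read_token` is a hypothetical helper name, and the commented-out usage simply repeats the `load_dataset` call above.

```python
import os


def read_token(var_name: str = "HF_TOKEN") -> str:
    """Return the token stored in the given environment variable (hypothetical helper)."""
    token = os.environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set {var_name} to your Hugging Face token first")
    return token


# Usage, mirroring the call above (needs network access and dataset permission):
# from datasets import load_dataset
# TG_data = load_dataset("skg/toxigen-data", name="train", use_auth_token=read_token())
```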
I personally didn't use this method, though; I used huggingface-cli. According to this page, I think you can try:
pip install huggingface_hub
from the command line, then:
from huggingface_hub import notebook_login
notebook_login()
within Python
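Once the login has stored a token, the dataset can then be loaded with `use_auth_token=True`, which asks `datasets` to reuse the saved token instead of taking one as a string. A minimal sketch under that assumption follows. `load_toxigen` and its `loader` parameter are hypothetical names introduced here so the call can be exercised without network access, and newer `datasets` releases may expect `token=` in place of `use_auth_token=`.

```python
def load_toxigen(config_name, loader=None):
    """Load a ToxiGen configuration using the token saved by a prior login.

    `loader` is a hypothetical injection point; it defaults to datasets.load_dataset.
    """
    if loader is None:
        # Imported lazily so this sketch can be imported without datasets installed.
        from datasets import load_dataset
        loader = load_dataset
    # use_auth_token=True reuses the locally saved token instead of a literal string.
    return loader("skg/toxigen-data", name=config_name, use_auth_token=True)
```

For example, `TG_data = load_toxigen("train")` would correspond to the first call earlier in the thread.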
Hi @Thartvigsen, thank you for the information. I used the first method (i.e., Actual_Token) and got the above error. Now I will try the second method you used. Thank you :)
@wzhings Do you still get the error after filling the form? That's strange because this worked for me lol
Hi @aflah02, yes, I still get the error after filling out the form. I did not get any response after submitting it. I am not sure whether I need to wait for their permission.
Hey @wzhings, I'm not sure whether you need to wait for permission, but this is quite strange 🤔. I had also filled out the form and then generated the token, and it worked. Maybe it's the order, i.e. generating the token after filling out the form? Or maybe it has to do with permissions only. Anyway, if what @Thartvigsen suggested works, you could just ignore all this; that seems to be the better way.
@wzhings you won't get a response after filling out the form, no need to wait on that to get access!
Hello @Thartvigsen, I can finally access the dataset with both of the methods above after filling out the forms with different email accounts. Thank you.
@wzhings I am glad to hear that, thanks for letting me know!
Hey! Awesome paper and codebase, it's very well documented!! I've been facing some issues trying to load the dataset. I tried to load it on Colab using the following lines -
I got the following error -
I suspect it's because of some authorization issue, but I've filled out the form and I'm not quite sure what else I should do.