allenai / scispacy

A full spaCy pipeline and models for scientific/biomedical documents.
https://allenai.github.io/scispacy/
Apache License 2.0

Problems Following aws s3 downloading #382 #386

CharlesQ9 closed 3 years ago

CharlesQ9 commented 3 years ago

_Do you have this issue for all files (e.g. try just this one s3://ai2-s2-scispacy/data/umls_semantic_typetree.tsv)? Unfortunately I cannot make the ontonotes file public due to licensing, but the others should be publicly readable. Let me know if they are not and I'll investigate which AWS permissions I may have set incorrectly. Thanks!_

This one works for me. I tried again, and these two commands still don't work:

```
aws s3 cp s3://ai2-s2-scispacy/data/ud_ontonotes.tar.gz assets/ud_ontonotes.tar.gz
tar -xzvf assets/ud_ontonotes.tar.gz -C assets/
rm assets/ud_ontonotes.tar.gz
#############################################################
aws s3 cp s3://ai2-s2-scispacy/data/ner/ assets --recursive --exclude '*' --include '*.tsv'
```

The errors are as follows:

```
$ aws s3 cp s3://ai2-s2-scispacy/data/ud_ontonotes.tar.gz assets/ud_ontonotes.tar.gz
fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden
$ aws s3 cp s3://ai2-s2-scispacy/data/ner/ assets --recursive --exclude '*' --include '*.tsv'
fatal error: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied
```

What kind of license do I need? Is it the UMLS Terminology Services (UTS) license (link: [https://uts.nlm.nih.gov/uts/])? I already have one. If some of the files cannot be made public, could you explain in detail how to prepare them? For example: the data format (one example line), where to download the raw files (I can download them myself if I can apply for the license), and how to convert the data into ud_ontonotes.tar.gz. Thank you so much!
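The two error messages quoted above name different S3 operations: the single-file copy failed on `HeadObject`, while the `--recursive` copy failed on `ListObjectsV2`. As a rough sketch for telling such failures apart, the shell function below extracts the error code and operation from an AWS CLI "fatal error" line. The name `classify_s3_error` is hypothetical and is not part of scispacy or the AWS CLI. The message format is assumed to match the lines quoted in this issue.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of scispacy or the AWS CLI): extract the
# error code and the failing S3 operation from an AWS CLI "fatal error" line.
# Assumes the message format shown in this issue.
classify_s3_error() {
  local line="$1"
  local pattern='s/.*An error occurred (\([^)]*\)) when calling the \([A-Za-z0-9]*\) operation.*/'
  local code op
  # First capture group: the code in parentheses (e.g. 403, AccessDenied).
  code=$(printf '%s\n' "$line" | sed -n "${pattern}\\1/p")
  # Second capture group: the operation name (e.g. HeadObject).
  op=$(printf '%s\n' "$line" | sed -n "${pattern}\\2/p")
  case "$op" in
    HeadObject)    echo "$code on HeadObject: the request for a single object was refused" ;;
    ListObjectsV2) echo "$code on ListObjectsV2: listing the prefix was refused" ;;
    "")            echo "unrecognized error line" ;;
    *)             echo "$code on $op" ;;
  esac
}

# Usage with the first error line from this issue:
classify_s3_error "fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden"
```

Neither message says which license or permission is missing. The sketch only separates an object-level refusal from a listing-level refusal, which may help when reporting the problem.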

dakinggg commented 3 years ago

Closing because you have another issue with the same thing open.