titipata / detecting-scientific-claim

Extracting scientific claims from biomedical abstracts (powered by AllenNLP)

404 error - "https://s3-us-west-2.amazonaws.com/pubmed-rct/model_crf.tar.gz" #25

Open itsmemala opened 4 years ago

itsmemala commented 4 years ago

The model link results in a 404 error. Has it been moved to a different bucket, or is it no longer hosted?

titipata commented 4 years ago

Hi @itsmemala, yes, it seems the host where I stored the data removed it. I will try to put it back up soon, early next month!

itsmemala commented 4 years ago

Thanks!

laviniaflorentina commented 4 years ago

I'm also looking for this to be solved

laviniaflorentina commented 4 years ago

Links that need fixing:

https://s3-us-west-2.amazonaws.com/pubmed-rct/train.json
https://s3-us-west-2.amazonaws.com/pubmed-rct/dev.json
https://s3-us-west-2.amazonaws.com/pubmed-rct/test.txt
https://s3-us-west-2.amazonaws.com/pubmed-rct/model.tar.gz

Thank you 🙏
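
For anyone who wants to re-check these links later, here is a minimal sketch using only the Python standard library. The helper names `http_status` and `broken_links` are hypothetical, not part of this repository, and treating every status other than 200 as broken is an assumption (S3 can also answer 403 for objects that no longer exist).

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, Optional

# URLs copied from the list above.
LINKS = [
    "https://s3-us-west-2.amazonaws.com/pubmed-rct/train.json",
    "https://s3-us-west-2.amazonaws.com/pubmed-rct/dev.json",
    "https://s3-us-west-2.amazonaws.com/pubmed-rct/test.txt",
    "https://s3-us-west-2.amazonaws.com/pubmed-rct/model.tar.gz",
]


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Send a HEAD request and return the status code, or None if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; the code is what we want.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def broken_links(
    urls: Iterable[str],
    get_status: Callable[[str], Optional[int]] = http_status,
) -> Dict[str, Optional[int]]:
    """Map each URL that did not return 200 to the status that was observed."""
    statuses = {url: get_status(url) for url in urls}
    return {url: status for url, status in statuses.items() if status != 200}
```

Calling `broken_links(LINKS)` performs the real requests. The `get_status` parameter exists only so the check can be exercised without network access.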

titipata commented 4 years ago

@laviniaflorentina thanks so much for the notice! @daniel-acuna Could I ping you here to ask whether you have a copy of the deleted S3 bucket stored somewhere?

titipata commented 4 years ago

@laviniaflorentina @itsmemala I put temporary model paths here: https://github.com/titipata/detecting-scientific-claim/blob/master/main.py#L37-L38. You can now run it. I will update train.json, dev.json and test.txt later. In short, they are post-processed files from the dataset folder.

Shiyun-W commented 1 year ago

Hi, I am facing this problem as well. Is there a way to solve it? I would be very grateful if somebody could help me with this!

titipata commented 1 year ago

Hi @Shiyun-W, unfortunately the model checkpoint was deleted. I will have to check whether a copy is somewhere on my computer. In addition, the code is outdated and relies on an old version of AllenNLP.

vibhor98 commented 1 year ago

Hi @titipata, I am also unable to access the model and the annotated dataset through the provided S3 bucket links. Could you please share the annotated labels (claims and non-claims) for the PubMedRCT dataset? I assume sharing this small dataset is easier than sharing the model weights. Thank you!

titipata commented 1 year ago

@vibhor98 yes, the dataset is available here: https://github.com/titipata/detecting-scientific-claim/tree/master/dataset. I couldn't find the trained model since the bucket was deleted. I hope the provided notebook is sufficient for training the model. This codebase is kinda outdated.