caufieldjh closed 4 years ago
SHARPn NLP Seed Corpus - paper - clinical notes from pulmonary arterial disease and breast cancer patients.
That paper may not be quite right.
SHARPn Stratified Corpus - this and the previous set seem like they should be at http://informatics.mayo.edu/sharp/index.php/Tools but I haven't found download links for the corpora.
Would like to add more corpora but don't have stable links for them yet. The first one is MiPACQ: paper - the Multi-source Integrated Platform for Answering Clinical Questions corpus, containing 13,091 sentences from clinical narratives, all annotated for syntactic structure and named entities.
Looks like it should be at http://clear.colorado.edu/compsem/index.php?page=endendsystems&sub=mipacq but that link doesn't appear live at the moment.
The Colorado CLEAR page is accessible but doesn't have an obvious link to the MiPACQ corpus. Looks like usage may still require coordination through Mayo Clinic?
Gave up looking for these - inaccessible data sets are Not Awesome.
Hi, @caufieldjh. It's a shame that it's not easier to access these. Did you ever get access? I'm also trying to get access to MiPACQ and SHARP (and THYME, too).
Hi @drussellmrichie - not sure about the others, but THYME colon cancer splits are here: https://github.com/stylerw/thymedata
Not sure if you're still interested in this, but for you or anyone else who comes across this, according to an email that my PI just received from Guergana Savova, who co-leads hNLP:
"the MiPACQ and SHARP corpora are not available for distribution at this point."
😦😦😦😦😦😦😦😦
I'll post here if I hear anything else....
Oh well! Thanks for forwarding the official word, even if it's disappointing.