On May 14, Justin asked me to study the feasibility of implementing a proof of concept (POC) by the end of Phase 3 as a new deliverable. Phase 3 ends at the end of June, which is 6 weeks from now, minus one week when I'll be on vacation. The POC would be used to showcase the following features:
Evaluate the MCW and Stanford University programs on the 2014 i2b2 de-id challenge dataset.
New! Evaluate the quality of a dataset annotation using one NLP de-id service.
Observations
Item 1
Today we do not have the infrastructure to evaluate services, only programs. Implementing an infrastructure to evaluate services is out of scope for a POC with only 5 weeks available.
The Stanford method is not available as a service (TODO: verify). According to Justin, at least, the Stanford program is already compatible with the data format used in the 2014 i2b2 challenge.
Item 2
We need an NLP de-id method that runs as a service (MCW)
Use modified 2014 i2b2 annotations as submission files for testing
Proposed solution
Item 1
Use Sage/DREAM technology (Synapse, workflow) to set up a "challenge" that uses the 2014 i2b2 challenge dataset.
Submission format: Dockerized MCW and Stanford NLP de-id programs
Return the score used during the 2014 i2b2 challenge
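To make the scoring step concrete, here is a minimal sketch of a strict entity-level precision/recall/F1 computation. It assumes the 2014 i2b2 score is of this kind; the official challenge evaluation script remains the reference and should be used in the actual workflow. The `Annotation` structure and the `score` function are hypothetical names introduced for illustration only.

```python
from typing import Dict, NamedTuple, Set


class Annotation(NamedTuple):
    """One PHI annotation: character offsets plus PHI type (hypothetical structure)."""
    start: int
    end: int
    phi_type: str


def score(gold: Set[Annotation], predicted: Set[Annotation]) -> Dict[str, float]:
    """Strict matching: a prediction counts only if offsets and type both match."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

To score a whole dataset rather than a single note, the true positive, predicted, and gold counts would be summed over all notes before computing the ratios (micro-averaging).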
Item 2
Run MCW as a service on a server
Submission format:
Raw clinical note
Annotation of clinical notes using 2014 i2b2 gold standard format (file)
The submission is sent to the MCW service, which returns the MCW prediction
Compare the submitted annotation (treated as the prediction) to the MCW output (treated as the gold standard)
Score: same as for Item 1, but this time the output of the MCW service is used as the gold standard
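The Item 2 flow could look roughly like the sketch below. It assumes the submitted annotation file follows the 2014 i2b2 XML layout (a `TAGS` element whose children carry `start`, `end`, and `TYPE` attributes) and that the MCW service can be wrapped in a function returning spans in the same form. The `predict` callable, `parse_annotations`, and `evaluate_submission` are hypothetical names; the real MCW service interface is not known at this point.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Set, Tuple

# (start offset, end offset, PHI type)
Span = Tuple[int, int, str]


def parse_annotations(xml_text: str) -> Set[Span]:
    """Read PHI spans from an annotation file (assumed i2b2-style layout)."""
    tags = ET.fromstring(xml_text).find("TAGS")
    if tags is None:
        return set()
    return {
        (int(tag.attrib["start"]), int(tag.attrib["end"]), tag.attrib.get("TYPE", tag.tag))
        for tag in tags
    }


def evaluate_submission(
    note: str,
    submitted_xml: str,
    predict: Callable[[str], Set[Span]],
) -> Dict[str, float]:
    """Score the submitted annotation against the service output.

    `predict` stands in for the call to the MCW service (assumption).
    Its output plays the role of the gold standard.
    """
    submitted = parse_annotations(submitted_xml)
    reference = predict(note)
    true_positives = len(submitted & reference)
    precision = true_positives / len(submitted) if submitted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

Passing the service as a callable keeps the comparison logic testable without a running server: a stub that returns fixed spans can replace the service during development.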