Medical centres and clinics collect abundant patient records that can be applied to health and medical research challenges. This is certainly the case with CFS Discovery in Melbourne, who have collected detailed data on ME/CFS patients for almost 20 years. ME/CFS is a poorly understood and controversial disease on which Australian scientists are making excellent progress through cross-disciplinary research.
Patient data is stored as individual records, so it is very time-consuming to manually extract the data into the spreadsheets and databases that machine learning and other analyses require.
The project aims to automate the extraction of valuable ME/CFS patient data into aggregated form (spreadsheets are fine), accurately and efficiently. Successful completion of this challenge will accelerate our research efforts towards understanding ME/CFS by contributing to the construction of anonymous patient databases for pattern-recognition interrogation, as well as supporting future biobanks.
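The aggregation step described above can be illustrated with a minimal sketch. It assumes that each record has already been converted to raw text made of `Label: value` lines. The field names in `FIELDS`, the sample records, and the helper names `parse_record` and `aggregate` are all hypothetical and made up for illustration; the real field list would come from the approved DB fields, and this is not the project's actual code.

```python
import csv
import io
import re

# Hypothetical field names, for illustration only.
FIELDS = ["record_id", "age", "sex", "fatigue_score"]

# Matches lines of the form "Label: value".
LINE_PATTERN = re.compile(r"^\s*([A-Za-z_ ]+?)\s*:\s*(.+?)\s*$")


def parse_record(raw_text):
    """Turn the 'Label: value' lines of one converted record into a dict."""
    row = {}
    for line in raw_text.splitlines():
        match = LINE_PATTERN.match(line)
        if not match:
            continue  # skip lines that are not 'Label: value'
        key = match.group(1).strip().lower().replace(" ", "_")
        if key in FIELDS:
            row[key] = match.group(2)
    return row


def aggregate(raw_records):
    """Return CSV text with one row per record; missing fields stay blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, restval="", lineterminator="\n"
    )
    writer.writeheader()
    for raw_text in raw_records:
        writer.writerow(parse_record(raw_text))
    return buffer.getvalue()


# Made-up sample input, not real patient data.
sample_records = [
    "Record ID: 001\nAge: 42\nSex: F\nFatigue score: 7",
    "Record ID: 002\nAge: 35\nnoise line without a label\nSex: M",
]
csv_text = aggregate(sample_records)
print(csv_text)
```

The resulting CSV text can be opened directly in a spreadsheet. Leaving missing fields blank, rather than dropping the record, keeps every record visible for later manual review.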
Release | Date | Iterations | User Stories | Deliverable |
---|---|---|---|---|
1 | 10/09/2019 | 1,2 | 1-4,8-11 | Conversion, Upload and draft GUI |
2 | 08/10/2019 | 3,4 | 5,6,7,12,13 | Search, Excel download and final GUI |
Week 2-3: Requirement gathering and technology assessment/exploration.
Week 4-5: Work on printed text conversion to raw text for multiple record types.
Week 6-Teaching Break: WebApp development and porting of components.
Teaching Break-Week 7: WebApp development and porting of components.
Week 8-9: Code retrofitting and refactoring.
Week 10-11: Full round of user acceptance testing and consolidation.
Week 12: Reflection and improvements.
• Modular well-structured code to carry out the following:
o Implement process to convert and store printed patient records from image documents.
o Build a usable web application for multiple users.
o Provide an easy to use and intuitive user interface.
o Allow sufficient space for scalability and further analytic needs.
• Documentation will be covered in the following:
o Well commented and modular code.
o High level documentation on code process in each iteration.
o A user guide to be handed to the stakeholders at the end of the project.
Work is being tracked through 2 levels:
Member | UID | Role | Backup | Email |
---|---|---|---|---|
You Li | u6430173 | Developer | Quality Assurance | you.li@anu.edu.au |
Nigel Tee | u6530834 | Developer | Quality Assurance | nigel.tee@anu.edu.au |
Chin Hun Young (Spokesperson) | u6530822 | Quality Assurance | Project Manager | chin.young@anu.edu.au |
Rufus Raja | u6275198 | Project Manager | Developer | rufus.raja@anu.edu.au |
Supervisor and product owner: Brett Lidbury
The team meets every Monday and Saturday for co-working sessions that support pair programming and project issue resolution.
Decision Making Document
Feedback Log
*GitHub is open to the public for now; it will be set to private once we start working on client-sensitive materials.
Name | Role |
---|---|
Dr. Brett Lidbury | Primary Client |
Dr. Alice Richardson | Secondary Client |
Huy Pham | Tutor |
All communication is via email/Skype.
Signed Statement of Work
Approved DB Fields
Iteration Tracking
WebApp Project wireframe
Webapp Code Repository
Issue Tracking
Testing Summaries
• Off the shelf tools
○ Optical Character Recognition software
○ PyCharm
○ AWS Database
• Implementation from scratch
○ Neural Network
○ Code in Tesseract/TensorFlow
• Constraints:
○ Documents containing handwritten and printed text.
○ Sensitivity around disclosing patients' data.
Likelihood: Unlikely
Consequence: Catastrophic
Priority: High
Solutions: Obtain main stakeholder’s agreement on the time required for conversion so that expectations match
Likelihood: Possible
Consequence: Moderate
Priority: High
Solutions: Obtain main stakeholder’s agreement on the possible removal of feature
Likelihood: Certain
Consequence: Major
Priority: High
Solutions: Clear scope of work agreed upon by the stakeholder, continuous analysis
Likelihood: Possible
Consequence: Catastrophic
Priority: Extreme
Solutions: Access authorisation, multilevel security model in databases & encryption
Likelihood: Possible
Consequence: Catastrophic
Priority: Extreme
Solutions: Signed Ethical form, Downgrading results, stakeholder’s agreement on storage method
Likelihood: Possible
Consequence: Minor
Priority: Medium
Solutions: Shadowing teammates, Daily stand-up to update on progress
Likelihood: Rare
Consequence: Moderate
Priority: Low
Solutions: Shadowing teammates, Daily stand-up to update on progress