Problem Statement
The purpose of this research project is to predict diseases using several sources of data, such as laboratory tests and ECGs. The model should be flexible and modular, so that its parts can be reused even when not all data sources are available.
The experiments conducted to reach this goal will rely on the MIPHA framework. Its main features are as follows:
[A] Flexible framework allowing for the study of any disease
[B] Ability to include data from various sources
[C] Modular architecture designed for reusability
The framework allows for easy implementation and integration. By providing a structure and conventions, it reduces the amount of code that needs to be rewritten from scratch. It opens up a transdisciplinary avenue of research for collaborative, open-source predictive medicine (similar to what already exists for image prediction).
In the future, the framework's modular approach could even allow for improved explainability.
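To make the reuse idea concrete, here is a minimal sketch of per-source feature extraction that keeps working when a data source is missing. It does not use the MIPHA API: the function names, the record fields (`age`, `sex`, `creatinine`) and the toy features are all hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical per-source feature extractors (NOT the MIPHA API).
# Each one only knows about its own data source, so it can be reused as-is.
def extract_demographics(record):
    # Toy features: scaled age and a binary sex indicator.
    return [record["age"] / 100.0, 1.0 if record["sex"] == "F" else 0.0]

def extract_labs(record):
    # Toy features: mean value and overall change of one lab series.
    values = record["creatinine"]
    return [mean(values), values[-1] - values[0]]

EXTRACTORS = {
    "demographics": extract_demographics,
    "labs": extract_labs,
}

def build_features(patient, extractors=EXTRACTORS):
    """Concatenate the features of every source present for this patient.

    Returns the list of sources actually used and the feature vector.
    Sources absent from `patient` are skipped instead of raising an error.
    """
    used, features = [], []
    for name, extractor in extractors.items():
        if name in patient:
            used.append(name)
            features.extend(extractor(patient[name]))
    return used, features

# Illustrative patients: one with both sources, one with demographics only.
full = {"demographics": {"age": 60, "sex": "F"},
        "labs": {"creatinine": [1.0, 1.2, 1.4]}}
partial = {"demographics": {"age": 45, "sex": "M"}}

print(build_features(full))
print(build_features(partial))
```

In this sketch the demographics extractor is shared unchanged between both patients; only the downstream predictor would need to adapt to the shorter feature vector.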
Desired Outcome
Empirically demonstrate the value of the MIPHA framework for disease prediction.
We want to answer the following questions:
[A] Regarding disease prediction
[ ] #10
[ ] #11
[ ] #29
[B] Regarding the use of several sources for disease prediction
[ ] #12
[ ] #13
[ ] #14
[C] Regarding the modularity of MIPHA
[ ] #15
[ ] #17
[ ] #19
Current State
Currently, our models can predict stage 4/5 chronic kidney disease up to a year in advance, using one year of biological history. We have identified the following possible improvements:
Generalize the model to other diseases
Facilitate the retraining of the model for other use cases (e.g. when the available data is different)
Diseases studied
Chronic Kidney Disease (going from stage 2/3 to stage 4/5, or going to stage 4/5 for patients with type II diabetes)
Onset of type II diabetes (usually for people over 40 years old)
Myocardial infarction
Success Criteria
[ ] The model is able to predict diseases with satisfactory accuracy (this needs to be defined more precisely, but targeting over 85% recall with over 75% precision is a decent rule of thumb)
[ ] Retraining the modular model is faster than retraining a full machine learning model, with minimal loss of performance
[ ] The addition of data sources increases performance. In the context of this issue, the three data sources we will consider are: demographics (age/sex), laboratory results, and ECGs. If ECGs do not provide good results, we can consider using certain diagnoses instead.
Impact
The model should allow for faster iterations in research, and increase the overall performance of disease prediction models.
The model's high transferability and flexibility would also pave the way for open-source, collaborative machine learning models for healthcare.
Metrics
Accuracy
Precision
Recall
F1-Score
Matthews correlation coefficient
Research should also be conducted on measuring the transferability of models.
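As a reference for how the five metrics listed in this section relate to one another, the sketch below computes them from binary labels in plain Python. The function names and the example labels are illustrative assumptions, and the threshold check simply encodes the rule of thumb from the success criteria (over 85% recall with over 75% precision); in practice a library such as scikit-learn would likely be used instead.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1 and Matthews correlation coefficient.

    Undefined ratios (zero denominator) are reported as 0.0 by convention.
    """
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}

def meets_rule_of_thumb(metrics, min_recall=0.85, min_precision=0.75):
    """Check the tentative target: recall above 85% and precision above 75%."""
    return metrics["recall"] > min_recall and metrics["precision"] > min_precision

# Illustrative labels only (not project results).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics, meets_rule_of_thumb(metrics))
```

Reporting the Matthews correlation coefficient alongside accuracy is useful here because disease cohorts are often imbalanced, and accuracy alone can look high for a model that rarely predicts the positive class.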
Constraints
[None]
Solution Architecture
Solution Architecture Description
Data Requirements
Datasets
1
2
Understanding and Exploration
Approaches and Experiments
Future improvements
See Also
Background Info
Related Projects