LiamSingh64 commented 1 year ago

Firestore is NoSQL, so it uses a document-collection model. Personally, I only have experience with SQL, so I made a quick ERD of the database design just to get the ball rolling on how we are going to plan this out. This obviously WON'T be the final design; as I said, it's just a quick start for developing the design properly. Any ideas on how to improve this?

D4ni3l8 commented 1 year ago

In NoSQL, I found out that primary keys and foreign keys do not exist the way they do in SQL. I think the relationship between users and heart data is one-to-one, because a user can have only one set of data.

This might still need improvements.

rwx-yxu commented 1 year ago

The users should be able to post more than one heart data row. Instead of one-to-one, it makes more sense for one user to have a heart data collection containing one or more rows. I'm not sure about putting posts in their own DB with the proposed columns, since what exactly is the content? The post data should be associated with the heart data.

If the user does not have a heart data row, then just leave it unmodifiable.

AymanReh commented 1 year ago

Did some research. NoSQL databases use one large table instead of a bunch of smaller ones, and data is identified using unique keys and key-value pairs. But I'm not 100% sure how that all works. Haven't done it before.

rwx-yxu commented 1 year ago

Useful to watch this: https://www.youtube.com/watch?v=jm66TSlVtcc . Seems to me we just need to put the user id in the heart data table and whenever we need just the heart data posts, we query for that user id that we will have for the user session. Different from Relational DBs where we would start with the user table and do a join.
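A minimal sketch of that lookup, using plain Python dicts in place of Firestore documents. The names `heart_data`, `user_id`, and `heart_data_for_user` are made up for illustration, and the values are invented; in Firestore itself the filter would be a query on the collection rather than a list comprehension.

```python
from typing import Any

# Stand-in for a "heart_data" collection: each dict is one document.
# Field names here are assumptions, not the final schema.
heart_data: list[dict[str, Any]] = [
    {"user_id": "u1", "ledv": 150.2, "lesv": 60.1},
    {"user_id": "u2", "ledv": 132.0, "lesv": 55.4},
    {"user_id": "u1", "ledv": 148.7, "lesv": 58.9},
]


def heart_data_for_user(docs: list[dict[str, Any]], user_id: str) -> list[dict[str, Any]]:
    """Return every heart data document that belongs to one user."""
    return [doc for doc in docs if doc.get("user_id") == user_id]


# One user can own several rows, so the result is a list (one-to-many).
rows = heart_data_for_user(heart_data, "u1")
```

The point is that the user id lives on each heart data document, so no join against a users table is needed.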

AymanReh commented 1 year ago

[Image: Untitled Diagram (draw.io)]

Very rough ERD I made for No-SQL.

advweb-grp1 commented 1 year ago

What data do we currently have?

We will list out all of the fields in the CSV to gain an understanding of what we have.

High Priority Fields

Name: ledv (left ventricular end diastolic volume)
- Datatype: decimal or float
- Description: The volume of blood in the left ventricle at the end of diastole
Name: lesv (left ventricular end systolic volume)
- Datatype: decimal or float
- Description: The volume of blood in the left ventricle at the end of systole
Name: lsv (left systolic volume)
- Datatype: decimal or float
- Description: The systolic volume of the left ventricle
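As a rough sketch of how these three columns could be read as floats, assuming the CSV header uses exactly `ledv`, `lesv`, and `lsv`. The `Volumes` class and `parse_volumes` helper are hypothetical names, and the sample row is invented, not taken from our data.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Volumes:
    ledv: float
    lesv: float
    lsv: float


def parse_volumes(row: dict[str, str]) -> Volumes:
    """Convert the three volume columns of one CSV row to floats."""
    return Volumes(
        ledv=float(row["ledv"]),
        lesv=float(row["lesv"]),
        lsv=float(row["lsv"]),
    )


# Invented sample data, only to show the shape of the input.
sample_csv = "ledv,lesv,lsv\n150.5,60.25,90.25\n"
records = [parse_volumes(row) for row in csv.DictReader(io.StringIO(sample_csv))]
```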

D4ni3l8 commented 1 year ago

Other High Priority Fields

Name: Gender
- Datatype: String
- Description: Describes patient's gender/sex
Name: Age at MRI (Magnetic resonance imaging)
- Datatype: Integer
- Description: Patient's age when the MRI was conducted?
Name: Apical HCM (Apical hypertrophic cardiomyopathy)
- Datatype: Integer/Boolean (unsure whether this could be a boolean, as the values in the CSV table are 0s and 1s)
- Description: Rare form of hypertrophic cardiomyopathy (HCM) which usually involves the apex of the left ventricle and rarely involves the right ventricular apex or both
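On the Integer/Boolean question: if the column only ever holds 0 or 1, one option is to convert it to a boolean on import and reject anything else. A small sketch; `parse_flag` is a hypothetical helper name, and treating an empty cell as missing (`None`) is an assumption about how we want blanks handled.

```python
from typing import Optional


def parse_flag(value: str) -> Optional[bool]:
    """Map a CSV cell of "0"/"1" to False/True; empty cells become None."""
    cleaned = value.strip()
    if cleaned == "":
        return None  # missing value, kept distinct from False
    if cleaned == "1":
        return True
    if cleaned == "0":
        return False
    raise ValueError(f"expected 0 or 1, got {value!r}")


apical_hcm = parse_flag("1")
```

Raising on unexpected values means a stray 2 or "yes" in the CSV shows up at import time instead of silently becoming False.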

API Research

My API was more of a catalog of articles and journals, but I found out that the diseases were categorised by Gene/Locus, Phenotype, and Inheritance.

AymanReh commented 1 year ago

High Priority Fields

Name: lvmass/rvmass (left/right ventricular mass)
- Datatype: decimal/float
- Description: lvmass is the mass of the left ventricle of the heart; rvmass is the mass of the right ventricle

Name: LSV/RSV (left/right systolic volume)
- Datatype: decimal/float
- Description: The volume of blood in the left/right ventricle at the end of systolic ejection

Name: fibrosis/scarring (scar)
- Datatype: Integer
- Description: Indicates scars present on the heart caused by a myocardial infarction

Mutations

Name: ACTC
- Datatype: Integer
- Description: Mutation involved in the development of hypertrophic cardiomyopathy

Name: TPM1
- Datatype: Integer
- Description: Having this mutation means there is an increased risk of dilated cardiomyopathy

Name: TNNCI
- Datatype: Integer
- Description: This mutation interferes with the part of the heart which is in charge of protein encoding

API RESEARCH

HPO stands for Human Phenotype Ontology. This API provides a repository of abnormalities found in human diseases, including those of the heart. It could be useful for our cardiomyopathy website because it accepts a variety of parameters and the data returned will be relevant to cardiomyopathy. The hpoId parameter takes a string, and in turn details of that specific string are sent back to the user. E.g. if the string HP:0001166 is sent to hpoId, the expected return would be "A valid human phenotype ontology identifier". Another useful parameter is hpoSearch: when sent a string containing the name of a disease or gene, it returns details on that disease/gene. For example, if the user sends "q=arach", the returned outcome would be "Could be one of identifier (HPO, OMIM, ORPHA), an entrez id, or text."

ERD

[Image: erd]

For mutations I wasn't sure whether to make them integer or boolean. I chose boolean, since either they have the mutation or they don't.
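To make the boolean choice concrete, here is one way a single heart data document could look with the mutations grouped into a nested map. The field names and values are placeholders, and `present_mutations` is a made-up helper, so treat this as a sketch and not the agreed schema.

```python
from typing import Any

# Placeholder document; values are invented for illustration only.
heart_doc: dict[str, Any] = {
    "user_id": "u1",
    "lvmass": 120.5,
    "rvmass": 40.2,
    "mutations": {"ACTC": False, "TPM1": True, "TNNCI": False},
}


def present_mutations(doc: dict[str, Any]) -> list[str]:
    """List the names of the mutations flagged True, in alphabetical order."""
    return sorted(name for name, has_it in doc.get("mutations", {}).items() if has_it)


found = present_mutations(heart_doc)
```

With booleans the check is a plain truth test, and a document with no `mutations` map simply yields an empty list.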

LiamSingh64 commented 1 year ago

High Priority Fields

Name: Sudden Cardiac Arrest
- Data Type: Boolean or char(1/0), NULLABLE
- Description: True if a patient suddenly died due to cardiac arrest
Name: Hypertension
- Data Type: Boolean or char(1/0)
- Description: High blood pressure
Name: Diabetes
- Data Type: Boolean or char(1/0)
- Description: Blood sugar levels are too high
Name: Myectomy
- Data Type: Boolean or char(1/0)
- Description: Surgery for cardiomyopathy which involves removing the thickened heart wall

Gene Mutations

Name: TNNCI
- Data Type: Boolean or char(1/0)
- Description: Gene that makes a type of cardiac muscle protein, cardiac troponin 1. Cardiac troponin 1 is one of 3 proteins that form the full troponin protein complex.
Name: MYL2
- Data Type: Boolean or char(1/0)
- Description: Helps determine heart muscle structure/function, and regulates how much myosin moves/bends and its function.
Name: TTN
- Data Type: Boolean or char(1/0)
- Description: Helps make a protein called titin, which is found in skeletal and cardiac muscle.

API Research (NCBI SNP)

Honestly, all these APIs look pretty crap. This API can take names as input, but it returns its data in a very strange format, doesn't really return the data the spec says we need (it just returns IDs, NO USEFUL DATA that the spec asks for), and the response is XML. I had a quick look at the APIs he suggested and I think we should probably just look for an entirely different one.

D4ni3l8 commented 1 year ago

advweb-grp1 / advanced-web-final-year-project

Database Design #2
