rfordatascience / tidytuesday

Official repo for the #tidytuesday project
Creative Commons Zero v1.0 Universal

New Data: 100 patient histories and their 1-year predicted risks of multiple outcomes #629

Closed jreps closed 10 months ago

jreps commented 10 months ago

Data dictionary:

variable class description
personId integer A unique identifier for the simulated patient
age.group...10....14 integer A binary column where 1 means the patient is aged between 10-14 (inclusive) and 0 means the patient is not in that age group
age.group...15....19 integer A binary column where 1 means the patient is aged between 15-19 (inclusive) and 0 means the patient is not in that age group
age.group...20....24 integer A binary column where 1 means the patient is aged between 20-24 (inclusive) and 0 means the patient is not in that age group
age.group...65....69 integer A binary column where 1 means the patient is aged between 65-69 (inclusive) and 0 means the patient is not in that age group
age.group...40....44 integer A binary column where 1 means the patient is aged between 40-44 (inclusive) and 0 means the patient is not in that age group
age.group...45....49 integer A binary column where 1 means the patient is aged between 45-49 (inclusive) and 0 means the patient is not in that age group
age.group...55....59 integer A binary column where 1 means the patient is aged between 55-59 (inclusive) and 0 means the patient is not in that age group
age.group...85....89 integer A binary column where 1 means the patient is aged between 85-89 (inclusive) and 0 means the patient is not in that age group
age.group...75....79 integer A binary column where 1 means the patient is aged between 75-79 (inclusive) and 0 means the patient is not in that age group
age.group....5.....9 integer A binary column where 1 means the patient is aged between 5-9 (inclusive) and 0 means the patient is not in that age group
age.group...25....29 integer A binary column where 1 means the patient is aged between 25-29 (inclusive) and 0 means the patient is not in that age group
age.group....0.....4 integer A binary column where 1 means the patient is aged between 0-4 (inclusive) and 0 means the patient is not in that age group
age.group...70....74 integer A binary column where 1 means the patient is aged between 70-74 (inclusive) and 0 means the patient is not in that age group
age.group...50....54 integer A binary column where 1 means the patient is aged between 50-54 (inclusive) and 0 means the patient is not in that age group
age.group...60....64 integer A binary column where 1 means the patient is aged between 60-64 (inclusive) and 0 means the patient is not in that age group
age.group...35....39 integer A binary column where 1 means the patient is aged between 35-39 (inclusive) and 0 means the patient is not in that age group
age.group...30....34 integer A binary column where 1 means the patient is aged between 30-34 (inclusive) and 0 means the patient is not in that age group
age.group...80....84 integer A binary column where 1 means the patient is aged between 80-84 (inclusive) and 0 means the patient is not in that age group
age.group...90....94 integer A binary column where 1 means the patient is aged between 90-94 (inclusive) and 0 means the patient is not in that age group
sex...FEMALE integer A binary column where 1 means the patient's sex is female and 0 means it is not
sex...MALE integer A binary column where 1 means the patient's sex is male and 0 means it is not
Acetaminophen.exposures.in.prior.year integer A binary column where 1 means the patient had a record for acetaminophen in the prior year and 0 means they did not
Occurrence.of.Alcoholism.in.prior.year integer A binary column where 1 means the patient had a record for alcoholism in the prior year and 0 means they did not
Anemia.in.prior.year integer A binary column where 1 means the patient had a record for anemia in the prior year and 0 means they did not
Angina.events.in.prior.year integer A binary column where 1 means the patient had a record for angina in the prior year and 0 means they did not
ANTIEPILEPTICS.in.prior.year integer A binary column where 1 means the patient had a record for a drug in the category antiepileptics in the prior year and 0 means they did not
Occurrence.of.Anxiety.in.prior.year integer A binary column where 1 means the patient had a record for anxiety in the prior year and 0 means they did not
Osteoarthritis.in.prior.year integer A binary column where 1 means the patient had a record for osteoarthritis in the prior year and 0 means they did not
Aspirin.exposures.in.prior.year integer A binary column where 1 means the patient had a record for aspirin in the prior year and 0 means they did not
Occurrence.of.Asthma.in.prior.year integer A binary column where 1 means the patient had a record for asthma in the prior year and 0 means they did not
Atrial.Fibrillation..incident.in.prior.year integer A binary column where 1 means the patient had a record for atrial fibrillation in the prior year and 0 means they did not
HORMONAL.CONTRACEPTIVES.in.prior.year integer A binary column where 1 means the patient had a record for a drug in the category hormonal contraceptives in the prior year and 0 means they did not
Any.cancer..excl..prostate.cancer.and.benign.cancer..in.prior.year integer A binary column where 1 means the patient had a record for cancer excluding prostate and benign cancers in the prior year and 0 means they did not
Acute.Kidney.Injury..AKI..in.prior.year integer A binary column where 1 means the patient had a record for acute kidney injury in the prior year and 0 means they did not
Chronic.kidney.disease.or.end.stage.renal.disease.in.prior.year integer A binary column where 1 means the patient had a record for chronic kidney disease or end-stage renal disease in the prior year and 0 means they did not
Heart.failure.in.prior.year integer A binary column where 1 means the patient had a record for heart failure in the prior year and 0 means they did not
Chronic.obstructive.pulmonary.disease..COPD..in.prior.year integer A binary column where 1 means the patient had a record for chronic obstructive pulmonary disease in the prior year and 0 means they did not
Coronary.artery.disease..CAD..in.prior.year integer A binary column where 1 means the patient had a record for coronary artery disease in the prior year and 0 means they did not
Major.depressive.disorder..with.NO.occurrence.of.certain.psychiatric.disorder.in.prior.year integer A binary column where 1 means the patient had a record for major depressive disorder, with no occurrence of certain psychiatric disorders, in the prior year and 0 means they did not
Type.1.diabetes.and.no.prior.specific.non.T1DM.diabetes.in.prior.year integer A binary column where 1 means the patient had a record for type 1 diabetes in the prior year and 0 means they did not
Type.2.Diabetes.Mellitus..DM...with.no.type.1.or.secondary.DM.in.prior.year integer A binary column where 1 means the patient had a record for type 2 diabetes in the prior year and 0 means they did not
Deep.Vein.Thrombosis..DVT..in.prior.year integer A binary column where 1 means the patient had a record for deep vein thrombosis in the prior year and 0 means they did not
Dyspnea.in.prior.year integer A binary column where 1 means the patient had a record for dyspnea in the prior year and 0 means they did not
Edema.in.prior.year integer A binary column where 1 means the patient had a record for edema in the prior year and 0 means they did not
Gastroesophageal.reflux.disease.in.prior.year integer A binary column where 1 means the patient had a record for gastroesophageal reflux in the prior year and 0 means they did not
Acute.gastrointestinal..GI..bleeding.in.prior.year integer A binary column where 1 means the patient had a record for acute gastrointestinal bleeding in the prior year and 0 means they did not
Heart.valve.disorder.in.prior.year integer A binary column where 1 means the patient had a record for heart valve disorder in the prior year and 0 means they did not
Chronic.hepatitis.in.prior.year integer A binary column where 1 means the patient had a record for chronic hepatitis in the prior year and 0 means they did not
Hyperlipidemia.in.prior.year integer A binary column where 1 means the patient had a record for hyperlipidemia in the prior year and 0 means they did not
Hypertension.in.prior.year integer A binary column where 1 means the patient had a record for hypertension in the prior year and 0 means they did not
Hypothyroidism.in.prior.year integer A binary column where 1 means the patient had a record for hypothyroidism in the prior year and 0 means they did not
Inflammatory.Bowel.Disease.in.prior.year integer A binary column where 1 means the patient had a record for inflammatory bowel disease in the prior year and 0 means they did not
Low.back.pain.in.prior.year integer A binary column where 1 means the patient had a record for low back pain in the prior year and 0 means they did not
Occurrence.of.neuropathy.in.prior.year integer A binary column where 1 means the patient had a record for neuropathy in the prior year and 0 means they did not
Obesity.in.prior.year integer A binary column where 1 means the patient had a record for obesity in the prior year and 0 means they did not
Opioids.in.prior.year integer A binary column where 1 means the patient had a record for an opioid in the prior year and 0 means they did not
Osteoporosis.in.prior.year integer A binary column where 1 means the patient had a record for osteoporosis in the prior year and 0 means they did not
Peripheral.vascular.disease.in.prior.year integer A binary column where 1 means the patient had a record for peripheral vascular disease in the prior year and 0 means they did not
Pneumonia.in.prior.year integer A binary column where 1 means the patient had a record for pneumonia in the prior year and 0 means they did not
Psychotic.disorder.in.prior.year integer A binary column where 1 means the patient had a record for a psychotic disorder in the prior year and 0 means they did not
Acute.Respiratory.failure.in.prior.year integer A binary column where 1 means the patient had a record for acute respiratory failure in the prior year and 0 means they did not
Rheumatoid.Arthritis.in.prior.year integer A binary column where 1 means the patient had a record for rheumatoid arthritis in the prior year and 0 means they did not
Seizure.in.prior.year integer A binary column where 1 means the patient had a record for a seizure in the prior year and 0 means they did not
Sepsis.in.prior.year integer A binary column where 1 means the patient had a record for sepsis in the prior year and 0 means they did not
Skin.ulcer.in.prior.year integer A binary column where 1 means the patient had a record for a skin ulcer in the prior year and 0 means they did not
Sleep.apnea.in.prior.year integer A binary column where 1 means the patient had a record for sleep apnea in the prior year and 0 means they did not
Smoking.in.prior.year integer A binary column where 1 means the patient had a record for smoking in the prior year and 0 means they did not
STEROIDS.in.prior.year integer A binary column where 1 means the patient had a record for any steroid in the prior year and 0 means they did not
Hemorrhagic.stroke.in.an.inpatient.setting.in.prior.year integer A binary column where 1 means the patient had a record for hemorrhagic stroke in the prior year and 0 means they did not
Non.hemorrhagic.Stroke.in.an.inpatient.setting.in.prior.year integer A binary column where 1 means the patient had a record for non-hemorrhagic stroke in the prior year and 0 means they did not
Urinary.tract.infectious.disease.in.prior.year integer A binary column where 1 means the patient had a record for urinary tract infection in the prior year and 0 means they did not
Antibiotics.Carbapenems.in.prior.year integer A binary column where 1 means the patient had a record for an antibiotic in the class carbapenems in the prior year and 0 means they did not
Antibiotics.Aminoglycosides.in.prior.year integer A binary column where 1 means the patient had a record for an antibiotic in the class aminoglycosides in the prior year and 0 means they did not
Antibiotics.Cephalosporins.in.prior.year integer A binary column where 1 means the patient had a record for an antibiotic in the class cephalosporins in the prior year and 0 means they did not
Antibiotics.Fluoroquinolones.in.prior.year integer A binary column where 1 means the patient had a record for an antibiotic in the class fluoroquinolones in the prior year and 0 means they did not
Antibiotics.Glycopeptides.and.lipoglycopeptides.in.prior.year integer A binary column where 1 means the patient had a record for an antibiotic in the class glycopeptides and lipoglycopeptides in the prior year and 0 means they did not
Antibiotics.Macrolides.in.prior.year integer A binary column where 1 means the patient had a record for an antibiotic in the class macrolides in the prior year and 0 means they did not
Antibiotics.Monobactams.in.prior.year integer A binary column where 1 means the patient had a record for an antibiotic in the class monobactams in the prior year and 0 means they did not
Antibiotics.Oxazolidinones.in.prior.year integer A binary column where 1 means the patient had a record for an antibiotic in the class oxazolidinones in the prior year and 0 means they did not
Antibiotics.Penicillins.in.prior.year integer A binary column where 1 means the patient had a record for an antibiotic in the class penicillins in the prior year and 0 means they did not
Antibiotics.Polypeptides.in.prior.year integer A binary column where 1 means the patient had a record for an antibiotic in the class polypeptides in the prior year and 0 means they did not
Antibiotics.Rifamycins.in.prior.year integer A binary column where 1 means the patient had a record for an antibiotic in the class rifamycins in the prior year and 0 means they did not
Antibiotics.Sulfonamides.in.prior.year integer A binary column where 1 means the patient had a record for an antibiotic in the class sulfonamides in the prior year and 0 means they did not
Antibiotics.Streptogramins.in.prior.year integer A binary column where 1 means the patient had a record for an antibiotic in the class streptogramins in the prior year and 0 means they did not
Antibiotics.Tetracyclines.in.prior.year integer A binary column where 1 means the patient had a record for an antibiotic in the class tetracyclines in the prior year and 0 means they did not
predicted.risk.of.Pulmonary.Embolism numeric A value between 0 and 1 corresponding to the patient's predicted 1-year risk of having pulmonary embolism given their features (0 = 0% and 1 = 100%)
predicted.risk.of.Sudden.Hearing.Loss..No.congenital.anomaly.or.middle.or.inner.ear.conditions numeric A value between 0 and 1 corresponding to the patient's predicted 1-year risk of having sudden hearing loss given their features (0 = 0% and 1 = 100%)
predicted.risk.of.Restless.Leg.Syndrome numeric A value between 0 and 1 corresponding to the patient's predicted 1-year risk of having restless leg syndrome given their features (0 = 0% and 1 = 100%)
predicted.risk.of.Sudden.Vision.Loss..with.no.eye.pathology.causes numeric A value between 0 and 1 corresponding to the patient's predicted 1-year risk of having sudden vision loss given their features (0 = 0% and 1 = 100%)
predicted.risk.of.Muscle.weakness.or.injury numeric A value between 0 and 1 corresponding to the patient's predicted 1-year risk of having muscle weakness or injury given their features (0 = 0% and 1 = 100%)
predicted.risk.of.Ankylosing.Spondylitis numeric A value between 0 and 1 corresponding to the patient's predicted 1-year risk of having ankylosing spondylitis given their features (0 = 0% and 1 = 100%)
predicted.risk.of.Autoimmune.hepatitis numeric A value between 0 and 1 corresponding to the patient's predicted 1-year risk of having autoimmune hepatitis given their features (0 = 0% and 1 = 100%)
predicted.risk.of.Multiple.Sclerosis numeric A value between 0 and 1 corresponding to the patient's predicted 1-year risk of having multiple sclerosis given their features (0 = 0% and 1 = 100%)
predicted.risk.of.Acute.pancreatitis..with.No.chronic.or.hereditary.or.common.causes.of.pancreatitis numeric A value between 0 and 1 corresponding to the patient's predicted 1-year risk of having acute pancreatitis given their features (0 = 0% and 1 = 100%)
predicted.risk.of.Ulcerative.colitis numeric A value between 0 and 1 corresponding to the patient's predicted 1-year risk of having ulcerative colitis given their features (0 = 0% and 1 = 100%)
predicted.risk.of.Migraine numeric A value between 0 and 1 corresponding to the patient's predicted 1-year risk of having a migraine given their features (0 = 0% and 1 = 100%)
predicted.risk.of.Dementia numeric A value between 0 and 1 corresponding to the patient's predicted 1-year risk of having dementia given their features (0 = 0% and 1 = 100%)
predicted.risk.of..Treatment.resistant.depression..TRD. numeric A value between 0 and 1 corresponding to the patient's predicted 1-year risk of having treatment resistant depression given their features (0 = 0% and 1 = 100%)
predicted.risk.of.Parkinson.s.disease..inpatient.or.with.2nd.diagnosis numeric A value between 0 and 1 corresponding to the patient's predicted 1-year risk of having Parkinson's disease given their features (0 = 0% and 1 = 100%)
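
Because the age groups are spread over twenty binary columns with irregular dot padding in their names, it may help to collapse them into one label per patient. The following is a minimal sketch in Python, not part of the original submission. The helper names `parse_age_column` and `age_group_label` are hypothetical, and the example row is invented purely to illustrate the column layout described in the dictionary above.

```python
import re

# Matches names such as "age.group...10....14" or "age.group....5.....9".
# The number of dots varies between columns, so runs of dots are matched loosely.
AGE_COLUMN = re.compile(r"^age\.group\.+(\d+)\.+(\d+)$")


def parse_age_column(name):
    """Return (low, high) for an age-group column name, or None for other columns."""
    match = AGE_COLUMN.match(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def age_group_label(row):
    """Collapse the binary age-group columns of one row into a label like "10-14".

    `row` maps column names to values (for example, one row from csv.DictReader).
    Returns None unless exactly one age-group column is set to 1.
    """
    active = []
    for name, value in row.items():
        bounds = parse_age_column(name)
        if bounds is not None and int(value) == 1:
            active.append(bounds)
    if len(active) != 1:
        return None
    low, high = active[0]
    return f"{low}-{high}"


# Invented row for illustration only; real rows contain all columns listed above.
example_row = {
    "personId": 1,
    "age.group...10....14": 0,
    "age.group....5.....9": 1,
    "sex...FEMALE": 1,
}
example_label = age_group_label(example_row)  # "5-9"
```

Returning None when zero or several age-group columns are set is a design choice of this sketch, so that unexpected rows are surfaced to the caller instead of silently receiving a label.
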
jmcastagnetto commented 10 months ago

@jreps Is there a published article (or URL) with more information on this dataset? I am asking because I would like to consider using it for teaching, but I will need a reference so it can be correctly attributed. Thanks in advance.

jonthegeek commented 10 months ago

Thanks! Closed via #634.