egenn / rtemis

Advanced Machine Learning and Visualization
https://rtemis.org
GNU General Public License v3.0

MissRanger Error #36

Closed: ElenaCasiraghi closed 2 years ago

ElenaCasiraghi commented 2 years ago

Hello, I want to impute a data frame that contains only binary (0/1), integer, or double values. The names of the columns to impute are listed below. When I run missRanger I get the error:

Error in `[.data.frame`(data, , relevantVars[[1]], drop = FALSE): undefined columns selected

traceback:
eval(parse(text = code), envir = envir)
train_impute_missRanger(train_vars_cleaned = train_vars_cleaned)
missRanger(only_data, pmm.k = num.k, num.trees = ntree, max.depth = max.depth, splitrule = splitrule, sample.fraction = sample.fraction)
vapply(data[, relevantVars[[1]], drop = FALSE], FUN.VALUE = TRUE, function(z) anyNA(z) && !all(is.na(z)))
data[, relevantVars[[1]], drop = FALSE]
`[.data.frame`(data, , relevantVars[[1]], drop = FALSE)
stop("undefined columns selected")

Is this because the column names contain numbers?

[1] "age"
[2] "hp_0000020-urinary_incontinence"
[3] "hp_0000458-anosmia"
[4] "hp_0000572-visual_loss"
[5] "hp_0000716-depressivity"
[6] "hp_0000739-anxiety"
[7] "hp_0000988-skin_rash"
[8] "hp_0001289-confusion"
[9] "hp_0001324-muscle_weakness"
[10] "hp_0001596-alopecia"
[11] "hp_0001742-nasal_obstruction"
[12] "hp_0001888-lymphopenia"
[13] "hp_0001945-fever"
[14] "hp_0001962-palpitations"
[15] "hp_0002013-vomiting"
[16] "hp_0002014-diarrhea"
[17] "hp_0002015-dysphagia"
[18] "hp_0002018-nausea"
[19] "hp_0002027-abdominal_pain"
[20] "hp_0002039-anorexia"
[21] "hp_0002091-restrictive_ventilatory_defect"
[22] "hp_0002094-dyspnea"
[23] "hp_0002110-bronchiectasis"
[24] "hp_0002315-headache"
[25] "hp_0002321-vertigo"
[26] "hp_0002354-memory_impairment"
[27] "hp_0002355-difficulty_walking"
[28] "hp_0002360-sleep_disturbance"
[29] "hp_0002607-bowel_incontinence"
[30] "hp_0002829-arthralgia"
[31] "hp_0003326-myalgia"
[32] "hp_0003546-exercise_intolerance"
[33] "hp_0004396-poor_appetite"
[34] "hp_0006530-abnormal_pulmonary_interstitial_morphology"
[35] "hp_0009710-chilblains"
[36] "hp_0011134-low-grade_fever"
[37] "hp_0011227-elevated_c-reactive_protein_level"
[38] "hp_0012378-fatigue"
[39] "hp_0012384-rhinitis"
[40] "hp_0012531-pain"
[41] "hp_0012735-cough"
[42] "hp_0025095-sneeze"
[43] "hp_0025179-ground-glass_opacification_on_pulmonary_hrct"
[44] "hp_0025337-red_eye"
[45] "hp_0025390-reticular_pattern_on_pulmonary_hrct"
[46] "hp_0025435-increased_lactate_dehydrogenase_level"
[47] "hp_0030766-ear_pain"
[48] "hp_0030879-interlobular_septal_thickening_on_pulmonary_hrct"
[49] "hp_0031245-productive_cough"
[50] "hp_0031246-nonproductive_cough"
[51] "hp_0031249-parageusia"
[52] "hp_0031284-flushing"
[53] "hp_0031352-chest_tightness"
[54] "hp_0031417-rhinorrhea"
[55] "hp_0031987-diminished_ability_to_concentrate"
[56] "hp_0032177-parenchymal_consolidation"
[57] "hp_0033047-body_ache"
[58] "hp_0033050-pharyngalgia"
[59] "hp_0041051-ageusia"
[60] "hp_0100749-chest_pain"
[61] "hp_0100785-insomnia"
[62] "hp_bc_0003401_paresthesia"
[63] "hpo_0003401-paresthesia"
[64] "hpo_0025143-chills"
[65] "no_hpo"
[66] "cancer_mass"
[67] "asthma"
[68] "epilepsy"
[69] "asperger"
[70] "autism"
[71] "behavioural_disorder"
[72] "attention_language_disorder"
[73] "obesity"
[74] "leukemia"
[75] "transplanted"
[76] "respiratory_lung_problem"
[77] "renal_problem"
[78] "acute_syndrome"
[79] "cardiovascular"
[80] "no_conditions"
[81] "drug_529118"
[82] "drug_705944"
[83] "drug_753626"
[84] "drug_922802"
[85] "drug_951511"
[86] "drug_967823"
[87] "drug_975125"
[88] "drug_989878"
[89] "drug_1000560"
[90] "drug_1107882"
[91] "drug_1125315"
[92] "drug_1127433"
[93] "drug_1146773"
[94] "drug_1146774"
[95] "drug_1146775"
[96] "drug_1146788"
[97] "drug_1146789"
[98] "drug_1154029"
[99] "drug_1154195"
[100] "drug_1154343"
[101] "drug_1154615"
[102] "drug_1154619"
[103] "drug_1177480"
[104] "drug_1511246"
[105] "drug_1518254"
[106] "drug_1518292"
[107] "drug_1518606"
[108] "drug_1549786"
[109] "drug_1551170"
[110] "drug_1560524"
[111] "drug_1593185"
[112] "drug_1593349"
[113] "drug_1705674"
[114] "drug_1713332"
[115] "drug_1713370"
[116] "drug_1713479"
[117] "drug_1734108"
[118] "drug_1759842"
[119] "drug_1760056"
[120] "drug_1796475"
[121] "drug_2718651"
[122] "drug_19005965"
[123] "drug_19005968"
[124] "drug_19008723"
[125] "drug_19019050"
[126] "drug_19019072"
[127] "drug_19019073"
[128] "drug_19020053"
[129] "drug_19023564"
[130] "drug_19070310"
[131] "drug_19070869"
[132] "drug_19072159"
[133] "drug_19072176"
[134] "drug_19073186"
[135] "drug_19073187"
[136] "drug_19073189"
[137] "drug_19073777"
[138] "drug_19075033"
[139] "drug_19075034"
[140] "drug_19076953"
[141] "drug_19077463"
[142] "drug_19078461"
[143] "drug_19079160"
[144] "drug_19079524"
[145] "drug_19112656"
[146] "drug_19115197"
[147] "drug_19123359"
[148] "drug_19123989"
[149] "drug_19128020"
[150] "drug_19131109"
[151] "drug_19135374"
[152] "drug_35603428"
[153] "drug_35605480"
[154] "drug_35605482"
[155] "drug_36249701"
[156] "drug_36250141"
[157] "drug_40167259"
[158] "drug_40168116"
[159] "drug_40169217"
[160] "drug_40213146"
[161] "drug_40213178"
[162] "drug_40213198"
[163] "drug_40213217"
[164] "drug_40213230"
[165] "drug_40213251"
[166] "drug_40213286"
[167] "drug_40213288"
[168] "drug_40213299"
[169] "drug_40213304"
[170] "drug_40213322"
[171] "drug_40220357"
[172] "drug_40221381"
[173] "drug_40227012"
[174] "drug_40228087"
[175] "drug_40228203"
[176] "drug_40228214"
[177] "drug_40232435"
[178] "drug_40232756"
[179] "drug_40233964"
[180] "drug_40241046"
[181] "drug_40241504"
[182] "drug_42707627"
[183] "drug_42901928"
[184] "drug_46287338"
[185] "no_drugs"
[186] "count_missing"
[187] "gender_male"
[188] "ethnicity_Hispanic_or_latino"
[189] "race_white"
[190] "race_black"
[191] "race_asian"
[192] "race_islander"
[193] "wt"

egenn commented 2 years ago

Dashes ("-") in column names cause this problem, so you have to remove them. Are you using preprocess() or calling missRanger() directly?
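
The renaming step could look something like the sketch below. It is only an illustration in base R: the data frame `dat` and its values are made up, with a few column names borrowed from the list above, and replacing dashes with underscores is just one possible choice of replacement.

```r
# Toy data frame (hypothetical values); column names borrowed from the list above.
dat <- data.frame(
  age = c(34, NA, 51),
  `hp_0001945-fever` = c(1L, 0L, NA),
  `hp_0011134-low-grade_fever` = c(0L, NA, 1L),
  check.names = FALSE  # keep the dashes so the problematic names are preserved
)

# Flag names that are not syntactically valid in R (here, the ones with dashes).
bad <- names(dat)[names(dat) != make.names(names(dat))]

# Replace every dash with an underscore and make sure no two names collide.
clean <- gsub("-", "_", names(dat), fixed = TRUE)
stopifnot(!anyDuplicated(clean))
names(dat) <- clean
```

After this, `dat` can be passed to the imputation call as before. If the original names are needed later, they can be saved in a vector before renaming and assigned back to the imputed data frame.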

ElenaCasiraghi commented 2 years ago

You are right! I realized it soon after, when I used readr to open the data frame I had saved to a file :) Thanks!