Closed vinayduggi closed 3 years ago
I also tried to fix this by joining the feature column with the view column beforehand, so that the suffixes would not need to be added again when the trained model is loaded into the R environment. This should have resolved the issue, but I got the same error again. Below is the dataframe I used for model training in this attempt.
0 C3L-00104 COAD A1BG_CNV -0.628254 CNV
1 C3L-00365 COAD A1BG_CNV 0.304944 CNV
2 C3L-00674 COAD A1BG_CNV 0.247958 CNV
3 C3L-00677 COAD A1BG_CNV 0.271067 CNV
4 C3L-01040 COAD A1BG_CNV 0.351638 CNV
... ... ... ... ... ...
252583 C3L-03728 COAD yR211F11.2_ENSG00000213076.3_transcriptomics 1884.303750 transcriptomics
252584 C3L-03744 COAD yR211F11.2_ENSG00000213076.3_transcriptomics 23137.436372 transcriptomics
252585 C3L-03748 COAD yR211F11.2_ENSG00000213076.3_transcriptomics 6814.774431 transcriptomics
252586 C3L-03968 COAD yR211F11.2_ENSG00000213076.3_transcriptomics 17126.451805 transcriptomics
252587 C3L-04084 COAD yR211F11.2_ENSG00000213076.3_transcriptomics 4295.846346 transcriptomics
The MOFA+ model training also went smoothly.
Warning: some view(s) have less than 15 features, MOFA won't be able to learn meaningful factors for these view(s)...
######################################
## Training the model with seed 1 ##
######################################
ELBO before training: -59708145.59
Iteration 1: time=2.61, ELBO=-7324248.07, deltaELBO=52383897.525 (87.73325148%), Factors=9
Iteration 2: time=2.28, ELBO=1092829.34, deltaELBO=8417077.407 (14.09703370%), Factors=9
Iteration 3: time=2.27, ELBO=2880597.57, deltaELBO=1787768.226 (2.99417811%), Factors=9
Iteration 4: time=2.28, ELBO=3031276.82, deltaELBO=150679.254 (0.25235963%), Factors=9
Iteration 5: time=2.28, ELBO=3119178.35, deltaELBO=87901.526 (0.14721865%), Factors=9
Iteration 6: time=2.20, ELBO=3180074.42, deltaELBO=60896.068 (0.10198955%), Factors=9
Iteration 7: time=2.19, ELBO=3225162.05, deltaELBO=45087.635 (0.07551337%), Factors=9
Iteration 8: time=2.22, ELBO=3258604.68, deltaELBO=33442.630 (0.05601016%), Factors=9
Iteration 9: time=2.22, ELBO=3283035.41, deltaELBO=24430.732 (0.04091692%), Factors=9
Iteration 10: time=2.19, ELBO=3300812.79, deltaELBO=17777.381 (0.02977380%), Factors=9
Iteration 11: time=2.19, ELBO=3314128.72, deltaELBO=13315.922 (0.02230168%), Factors=9
Iteration 12: time=2.18, ELBO=3324688.34, deltaELBO=10559.620 (0.01768539%), Factors=9
Iteration 13: time=2.20, ELBO=3333347.98, deltaELBO=8659.644 (0.01450329%), Factors=9
Iteration 14: time=2.21, ELBO=3340384.39, deltaELBO=7036.409 (0.01178467%), Factors=9
Iteration 15: time=2.25, ELBO=3346081.83, deltaELBO=5697.444 (0.00954215%), Factors=9
Iteration 16: time=2.18, ELBO=3350732.47, deltaELBO=4650.632 (0.00778894%), Factors=9
Iteration 17: time=2.22, ELBO=3354544.42, deltaELBO=3811.957 (0.00638432%), Factors=9
Iteration 18: time=2.27, ELBO=3357705.70, deltaELBO=3161.276 (0.00529455%), Factors=9
Iteration 19: time=2.27, ELBO=3360392.59, deltaELBO=2686.893 (0.00450004%), Factors=9
Iteration 20: time=2.28, ELBO=3362742.06, deltaELBO=2349.464 (0.00393491%), Factors=9
Iteration 21: time=2.22, ELBO=3364842.99, deltaELBO=2100.930 (0.00351867%), Factors=9
Iteration 22: time=2.24, ELBO=3366751.02, deltaELBO=1908.030 (0.00319559%), Factors=9
Iteration 23: time=2.28, ELBO=3368504.12, deltaELBO=1753.101 (0.00293612%), Factors=9
Iteration 24: time=2.24, ELBO=3370130.35, deltaELBO=1626.237 (0.00272364%), Factors=9
Iteration 25: time=2.25, ELBO=3371651.73, deltaELBO=1521.375 (0.00254802%), Factors=9
Iteration 26: time=2.29, ELBO=3373086.02, deltaELBO=1434.287 (0.00240216%), Factors=9
Iteration 27: time=2.29, ELBO=3374447.79, deltaELBO=1361.775 (0.00228072%), Factors=9
Iteration 28: time=2.29, ELBO=3375748.94, deltaELBO=1301.151 (0.00217919%), Factors=9
Iteration 29: time=2.24, ELBO=3376999.55, deltaELBO=1250.606 (0.00209453%), Factors=9
Iteration 30: time=2.31, ELBO=3378208.64, deltaELBO=1209.094 (0.00202501%), Factors=9
Iteration 31: time=2.28, ELBO=3379383.94, deltaELBO=1175.297 (0.00196840%), Factors=9
Iteration 32: time=2.27, ELBO=3380531.46, deltaELBO=1147.525 (0.00192189%), Factors=9
Iteration 33: time=2.27, ELBO=3381656.51, deltaELBO=1125.044 (0.00188424%), Factors=9
Iteration 34: time=2.28, ELBO=3382764.09, deltaELBO=1107.585 (0.00185500%), Factors=9
Iteration 35: time=2.27, ELBO=3383860.43, deltaELBO=1096.335 (0.00183616%), Factors=9
Iteration 36: time=2.26, ELBO=3384951.91, deltaELBO=1091.483 (0.00182803%), Factors=9
Iteration 37: time=2.24, ELBO=3386028.14, deltaELBO=1076.232 (0.00180249%), Factors=9
Iteration 38: time=2.26, ELBO=3387072.26, deltaELBO=1044.120 (0.00174871%), Factors=9
Iteration 39: time=2.28, ELBO=3388070.44, deltaELBO=998.176 (0.00167176%), Factors=9
Iteration 40: time=2.29, ELBO=3389028.11, deltaELBO=957.672 (0.00160392%), Factors=9
Iteration 41: time=2.27, ELBO=3389965.85, deltaELBO=937.741 (0.00157054%), Factors=9
Iteration 42: time=2.28, ELBO=3390898.08, deltaELBO=932.225 (0.00156130%), Factors=9
Iteration 43: time=2.28, ELBO=3391830.44, deltaELBO=932.359 (0.00156153%), Factors=9
Iteration 44: time=2.28, ELBO=3392764.80, deltaELBO=934.364 (0.00156488%), Factors=9
Iteration 45: time=2.28, ELBO=3393701.93, deltaELBO=937.128 (0.00156951%), Factors=9
Iteration 46: time=2.29, ELBO=3394642.38, deltaELBO=940.452 (0.00157508%), Factors=9
Iteration 47: time=2.22, ELBO=3395586.92, deltaELBO=944.539 (0.00158193%), Factors=9
Iteration 48: time=2.24, ELBO=3396536.38, deltaELBO=949.464 (0.00159017%), Factors=9
Iteration 49: time=2.22, ELBO=3397491.70, deltaELBO=955.319 (0.00159998%), Factors=9
Iteration 50: time=2.18, ELBO=4301925.13, deltaELBO=904433.425 (1.51475718%), Factors=9
Iteration 51: time=2.26, ELBO=4357187.19, deltaELBO=55262.063 (0.09255364%), Factors=9
Iteration 52: time=2.21, ELBO=4365975.64, deltaELBO=8788.445 (0.01471901%), Factors=9
Iteration 53: time=2.22, ELBO=4369320.62, deltaELBO=3344.985 (0.00560222%), Factors=9
Iteration 54: time=2.12, ELBO=4371111.54, deltaELBO=1790.919 (0.00299945%), Factors=9
Iteration 55: time=2.21, ELBO=4372245.67, deltaELBO=1134.126 (0.00189945%), Factors=9
Iteration 56: time=2.25, ELBO=4373049.05, deltaELBO=803.380 (0.00134551%), Factors=9
Iteration 57: time=2.26, ELBO=4373672.39, deltaELBO=623.345 (0.00104399%), Factors=9
Iteration 58: time=2.26, ELBO=4374193.65, deltaELBO=521.257 (0.00087301%), Factors=9
Iteration 59: time=2.25, ELBO=4374654.35, deltaELBO=460.706 (0.00077160%), Factors=9
Iteration 60: time=2.26, ELBO=4375078.78, deltaELBO=424.422 (0.00071083%), Factors=9
Iteration 61: time=2.21, ELBO=4375480.15, deltaELBO=401.375 (0.00067223%), Factors=9
Iteration 62: time=2.20, ELBO=4375867.77, deltaELBO=387.616 (0.00064918%), Factors=9
Iteration 63: time=2.19, ELBO=4376243.15, deltaELBO=375.388 (0.00062870%), Factors=9
Iteration 64: time=2.20, ELBO=4376612.10, deltaELBO=368.948 (0.00061792%), Factors=9
Iteration 65: time=2.24, ELBO=4376978.71, deltaELBO=366.611 (0.00061400%), Factors=9
Iteration 66: time=2.21, ELBO=4377345.44, deltaELBO=366.724 (0.00061419%), Factors=9
Iteration 67: time=2.17, ELBO=4377716.19, deltaELBO=370.752 (0.00062094%), Factors=9
Iteration 68: time=2.20, ELBO=4378096.97, deltaELBO=380.779 (0.00063773%), Factors=9
Iteration 69: time=2.25, ELBO=4378495.32, deltaELBO=398.352 (0.00066716%), Factors=9
Iteration 70: time=2.25, ELBO=4378920.04, deltaELBO=424.718 (0.00071132%), Factors=9
Iteration 71: time=2.22, ELBO=4379383.57, deltaELBO=463.529 (0.00077632%), Factors=9
Iteration 72: time=2.23, ELBO=4379910.23, deltaELBO=526.661 (0.00088206%), Factors=9
Iteration 73: time=2.26, ELBO=4380538.12, deltaELBO=627.892 (0.00105160%), Factors=9
Iteration 74: time=2.25, ELBO=4381328.49, deltaELBO=790.369 (0.00132372%), Factors=9
Iteration 75: time=2.31, ELBO=4382392.45, deltaELBO=1063.966 (0.00178194%), Factors=9
Iteration 76: time=2.27, ELBO=4383918.79, deltaELBO=1526.338 (0.00255633%), Factors=9
Iteration 77: time=2.24, ELBO=4386179.86, deltaELBO=2261.066 (0.00378686%), Factors=9
Iteration 78: time=2.24, ELBO=4389309.52, deltaELBO=3129.657 (0.00524159%), Factors=9
Iteration 79: time=2.21, ELBO=4392565.87, deltaELBO=3256.351 (0.00545378%), Factors=9
Iteration 80: time=2.16, ELBO=4394563.82, deltaELBO=1997.958 (0.00334621%), Factors=9
Iteration 81: time=2.21, ELBO=4395454.18, deltaELBO=890.357 (0.00149118%), Factors=9
Iteration 82: time=2.20, ELBO=4395904.98, deltaELBO=450.801 (0.00075501%), Factors=9
Iteration 83: time=2.20, ELBO=4396234.21, deltaELBO=329.230 (0.00055140%), Factors=9
Iteration 84: time=2.16, ELBO=4396536.22, deltaELBO=302.009 (0.00050581%), Factors=9
Iteration 85: time=2.19, ELBO=4396818.96, deltaELBO=282.738 (0.00047353%), Factors=9
Iteration 86: time=2.24, ELBO=4397085.03, deltaELBO=266.074 (0.00044562%), Factors=9
Converged!
#######################
## Training finished ##
#######################
Saving model in /*****/****/projects/mofapy2/mofapy2/models/COAD_cptac_gbm.hdf5...
Hey @vinayduggi, thanks for opening the issue. I'm not sure I can reproduce it from the available info. Are there, by any chance, duplicated feature names within a view?
library(rhdf5)  # provides h5read
feature_names <- h5read('COAD_cptac_gbm.hdf5', 'features')
# For each view, list the feature names that occur more than once
lapply(feature_names, function(e) e[duplicated(e)])
Hey @gtca, thanks for the prompt response. Here's what I got from the above command. Should I remove the entries below from the respective views?
$CNV
$SNV
$acetylproteomics
'CREBBP_K1583K1586K1587K1588_EESTAASETTEGSQGDSK#NAK'
'GAPDH_K271_QASEGPLK#GILGYTEHQVVSSDFN$SDTHSSTFDAGAG'
'SMARCC2_K373_DSESAPVK#GGTMTDLDEQEDESM*ETTGKDEDENST'
'SMARCC2_K373_DSESAPVK#GGTMTDLDEQEDESMETTGKDEDEN$ST'
$cirRNA
$lipidomics
$metabolomics
$miRNA
$phosphoproteomics
'ACIN1_S169S181_EAAELEEASAES*EDEMIHPEGVAS*LLPPDFQSS''AGAP2_S94S105_QDALWISTSSAGTGGAEPPALS*PAPASPARPVS*P''AKAP12_S154_SAVVHDITDDGQEETPEIIEQIPSSES*NLEELTQPTE''AKAP12_S742S743_ETGTDGILAGSQEHDPGQGSS*S*PEQAGSPTEG''AKAP12_S742S743T751_ETGTDGILAGSQEHDPGQGSS*S*PEQAGS''AKAP12_S743_ETGTDGILAGSQEHDPGQGSSS*PEQAGSPTEGEGVST''AKAP12_S743T751_ETGTDGILAGSQEHDPGQGSSS*PEQAGSPT*EG''AKAP12_S749_ETGTDGILAGSQEHDPGQGSSSPEQAGS*PTEGEGVST''AKAP12_S749T751_ETGTDGILAGSQEHDPGQGSSSPEQAGS*PT*EG''AKAP12_T135_SAVVHDIT*DDGQEETPEIIEQIPSSESNLEELTQPTE''AKAP12_T160_SAVVHDITDDGQEETPEIIEQIPSSESNLEELT*QPTE''AKAP12_T751_ETGTDGILAGSQEHDPGQGSSSPEQAGSPT*EGEGVST''ANK2_S3817S3818S3823_LYLQTPTS*S*ERGGS*PIIQEPEEPSEH''ANK2_T3814T3816S3823_LYLQT*PT*SSERGGS*PIIQEPEEPSEH''ANK2_T3816S3818S3823_LYLQTPT*SS*ERGGS*PIIQEPEEPSEH''AP3D1_S759S764S788_HSS*LPTES*DEDIAPAQQVDIVTEEMPENA''AP3D1_S764S788_HSSLPTES*DEDIAPAQQVDIVTEEMPENALPS*D''AP3D1_T903_KAEDLDFWLSTTPPPAPAPAPAPVPSTGELSVNTVTT*P''ARAP1_S207_EEESLLPSLSSPPQPQSEEPLSTLPQGPPQPPS*PPPCP''ARAP1_T197_EEESLLPSLSSPPQPQSEEPLST*LPQGPPQPPSPPPCP''ARHGAP12_S338_GDFQNPGDQELLSSEENYYSTSYSQSDSQCGS*PPR''ARHGAP17_S625_NNSQIASGQNQPQAAAGSHQLSMGQPHNAAGPS*PH''ARHGAP39_S488_HSQPPTPLPQAQEDAMSWSSQQDTLSSTGYS*PGTR''ARHGAP44_S604_GS*PGSSQGTACAGTQPGAQPGAQPGASPSPSQPPA''ARHGEF2_S941S947S960_LQDSS*DPDTGS*EEEGSSRLSPPHS*PR''ARHGEF2_S941S952S960_LQDSS*DPDTGSEEEGS*SRLSPPHS*PR''ARHGEF2_S941T945S947_LQDSS*DPDT*GS*EEEGSSRLSPPHSPR''ARHGEF26_S172_TPNAPAPCTPEEDLTGLTASPVPS*PTANGLAANND''ATCAY_S39_EEWQDEDLPRPLPEETGVELLGS*PVEDTSSPPNTLNFNG''ATCAY_S39S46_EEWQDEDLPRPLPEETGVELLGS*PVEDTSS*PPNTL''ATCAY_S45S46_EEWQDEDLPRPLPEETGVELLGSPVEDTS*S*PPNTL''ATCAY_S46_EEWQDEDLPRPLPEETGVELLGSPVEDTSS*PPNTLNFNG''ATCAY_T32S39_EEWQDEDLPRPLPEET*GVELLGS*PVEDTSSPPNTL''ATCAY_T50_EEWQDEDLPRPLPEETGVELLGSPVEDTSSPPNT*LNFNG''ATXN2L_S32_RPPGGTS*PPNGGLPGPLATSAAPPGPPAAASPCLGPVA''BCLAF1_S272S274S276_S*GS*GS*VGNGSSRYSPSQNSPIHHIPSR''BCLAF1_S281S282Y284_SGSGSVGNGS*S*RY*SPSQNSPIHHIPSR''BCLAF1_S282Y284S285_SGSGSVGNGSS*RY*S*PSQNSPIHHIPSR''BCLAF1_Y284S285S287_SGS
GSVGNGSSRY*S*PS*QNSPIHHIPSR''CEP170_S928S930S933_TDEGPDTPSYNRDNS*IS*PES*DVDTAST''CEP170_T920S922Y923_TDEGPDT*PS*Y*NRDNSISPESDVDTAST''CEP170_Y923S928_TDEGPDTPSY*NRDNS*ISPESDVDTASTISLVT''CEP170_Y923S928S933_TDEGPDTPSY*NRDNS*ISPES*DVDTAST''CHD4_S103S105S108_QLGDSSGEGPEFVEEEEEVALRS*DS*EGS*D''CHD4_S105S108Y110_QLGDSSGEGPEFVEEEEEVALRSDS*EGS*DY''DENND4C_S1608S1610_GSASFFLKPSTSGDSLQS*GS*IPLANESLE''DLGAP4_Y761_NLSY*GDNSDPALEASSLPPPDPWLETSSSSPAEPAQP''DNAJC6_S510S513_SFCEEDHAALVNQES*EQS*DDELLTLSSPHGNA''DNAJC6_S510S522_SFCEEDHAALVNQES*EQSDDELLTLSS*PHGNA''DNAJC6_S510T519_SFCEEDHAALVNQES*EQSDDELLT*LSSPHGNA''DNAJC6_S513S522_SFCEEDHAALVNQESEQS*DDELLTLSS*PHGNA''DNAJC6_S513T519_SFCEEDHAALVNQESEQS*DDELLT*LSSPHGNA''DNAJC6_S521S522_SFCEEDHAALVNQESEQSDDELLTLS*S*PHGNA''DNM2_S764_EALNIIGDISTSTVSTPVPPPVDDTWLQSASSHS*PTPQR''EDC4_S555_FQPQLNPDVVAPLPTHTAHEDFTFGESRPELGS*EGLGSA''EDC4_T537_FQPQLNPDVVAPLPT*HTAHEDFTFGESRPELGSEGLGSA''EED_S34_LSSDENSNPDLS*GDENDDAVSIESGTNTERPDTPTNTPNAP''EED_S34T55_LSSDENSNPDLS*GDENDDAVSIESGTNTERPDT*PTNT''EED_T50T55_LSSDENSNPDLSGDENDDAVSIESGTNT*ERPDT*PTNT''EEF1B2_S106_YGPADVEDTTGSGATDSKDDDDIDLFGS*DDEEESEEA''EIF4EBP1_S44T45_VVLGDGVQLPPGDYSTTPGGTLFS*T*TPGGTRI''EIF4EBP1_T36T37_VVLGDGVQLPPGDYST*T*PGGTLFSTTPGGTRI''EIF4EBP1_T36T45_RVVLGDGVQLPPGDYST*TPGGTLFST*TPGGTR''EIF4EBP1_T37T41_VVLGDGVQLPPGDYSTT*PGGT*LFSTTPGGTRI''EIF4EBP1_T37T46_VVLGDGVQLPPGDYSTT*PGGTLFSTT*PGGTRI''EIF4EBP1_T45T46_VVLGDGVQLPPGDYSTTPGGTLFST*T*PGGTRI''EIF4EBP1_T46Y54_VVLGDGVQLPPGDYSTTPGGTLFSTT*PGGTRII''EIF4EBP2_T36T41_TVAISDAAQLPHDYCT*TPGGT*LFSTTPGGTRI''EIF4EBP2_T36T45_TVAISDAAQLPHDYCT*TPGGTLFST*TPGGTRI''EIF4EBP2_T36T46_TVAISDAAQLPHDYCT*TPGGTLFSTT*PGGTRI''EIF4EBP2_T45T46_TVAISDAAQLPHDYCTTPGGTLFST*T*PGGTRI''EIF5_T178_ENGSVSSSET*PPPPPPPNEINPPPHTMEEEEDDDWGEDT''EP400_S961S962_LYEGAFLPS*S*QWPRPKPDGEDTSGEEDADDCPG''EP400_T974S975_LYEGAFLPSSQWPRPKPDGEDT*S*GEEDADDCPG''EPB41L1_S430S437_S*LDGAEFS*RPASVSENHDAGPDGDKRDEDGE''EPB41L1_S430S441S443_S*LDGAEFSRPAS*VS*ENHDAGPDGDKR''EPB41L1_S430S441S443_S*LDGAEFSRPAS*VS*ENHDAGPDG
DKR''EPB41L1_S437_SLDGAEFS*RPASVSENHDAGPDGDKRDEDGESGGQR''EPB41L1_S437S441S443_SLDGAEFS*RPAS*VS*ENHDAGPDGDKR''EPB41L1_S437S443_SLDGAEFS*RPASVS*ENHDAGPDGDKRDEDGE''EPB41L1_S441_SLDGAEFSRPAS*VSENHDAGPDGDKRDEDGESGGQR''EPB41L1_S441S443_SLDGAEFSRPAS*VS*ENHDAGPDGDKRDEDGE''EPB41L1_S443_SLDGAEFSRPASVS*ENHDAGPDGDKRDEDGESGGQR''EPB41L1_S443S461_SLDGAEFSRPASVS*ENHDAGPDGDKRDEDGES''EPB41L1_S461_SLDGAEFSRPASVSENHDAGPDGDKRDEDGES*GGQR''EPB41L1_S461S466_SLDGAEFSRPASVSENHDAGPDGDKRDEDGES*''EPB41L5_S420S436_SALPVSPS*ISSAPVPVEIENLPQS*PGTDQHD''EPB41L5_S423S436_SALPVSPSISS*APVPVEIENLPQS*PGTDQHD''FAM129A_S577S579_HNLFEDNMALPS*ES*VSSLTDLKPPTGSNQAS''FAM129A_S577S579S596_HNLFEDNMALPS*ES*VSSLTDLKPPTGS''FAM129A_S577S579T584_HNLFEDNMALPS*ES*VSSLT*DLKPPTG''FAM129A_S582S592_HNLFEDNMALPSESVSS*LTDLKPPTGS*NQAS''FAM129A_S582T584_HNLFEDNMALPSESVSS*LT*DLKPPTGSNQAS''FAM129A_S582T584S596_HNLFEDNMALPSESVSS*LT*DLKPPTGS''FAM129A_S592S596_HNLFEDNMALPSESVSSLTDLKPPTGS*NQAS*''FAM129A_T584S592S596_HNLFEDNMALPSESVSSLT*DLKPPTGS*''FAM129A_T584S596_HNLFEDNMALPSESVSSLT*DLKPPTGSNQAS*''FAM129A_T584T590_HNLFEDNMALPSESVSSLT*DLKPPT*GSNQAS''FAM129A_T584T590S596_HNLFEDNMALPSESVSSLT*DLKPPT*GS''FAM129A_T590S592_HNLFEDNMALPSESVSSLTDLKPPT*GS*NQAS''FAM129A_T590S596_HNLFEDNMALPSESVSSLTDLKPPT*GSNQAS*''FAM171A2_S789T791S792_SSASELRRDS*LT*S*PEDELGAEVGDE''FARP1_S403_SLASQPTELNSEVLEQSQQSTSLTFGEGAES*PGGQSCR''FMR1_S497S500_RGPGYTSGTNSEAS*NAS*ETESDHRDELSDWSLAP''FMR1_S500_RGPGYTSGTNSEASNAS*ETESDHRDELSDWSLAPTEEER''FMR1_S500S511_RGPGYTSGTNSEASNAS*ETESDHRDELS*DWSLAP''FMR1_S511_RGPGYTSGTNSEASNASETESDHRDELS*DWSLAPTEEER''GIGYF2_S357S359_EPIPEEQEMDFRPVDEGEECS*DS*EGSHNEEAK''GIGYF2_S357S359S362_EPIPEEQEMDFRPVDEGEECS*DS*EGS*H''GIGYF2_S359S362_EPIPEEQEMDFRPVDEGEECSDS*EGS*HNEEAK''HDGF_S229T248_NSTPSEPGS*GRGPPQEEEEEEDEEEEAT*KEDAEA''HDGFL3_S121S122_FTGYQAIQQQSSSETEGEGGNTADAS*S*EEEGD''HNRNPC_S253S260_MES*EGGADDS*AEEGDLLDDDDNEDRGDDQLEL''HSP90B1_T774_VEEEPEEEPEETAEDTTEDT*EQDEDEEMDVGTDEEE''HSPH1_S556S559_NVQQDNSEAGTQPQVQTDAQQTSQS*PPS*PELTS''HUWE1_S2953_GILEEPL
PSTSS*EEEDPLAGISLPEGVDPSFLAALPD''IRS2_S384S388_TASEGDGGAAAGAAAAGARPVS*VAGS*PLSPGPVR''KIAA1109_S4304_LFLGDQTINLPTSGPGTPDSIEGVS*QHLSPESSR''KIAA1109_S4308_LFLGDQTINLPTSGPGTPDSIEGVSQHLS*PESSR''KIAA1191_Y33S52_AVSY*DDTLEDPAPMTPPPSDMGS*VPWKPVIPE''KIF1A_S1531_LETAQRPVPEALSPAFSEDSESHGSSSASS*PLSAEGR''KIF2A_S624_ELTVDPTAAGDVRPIMHHPPNQIDDLETQWGVGSS*PQR''KIF2A_T617_ELTVDPTAAGDVRPIMHHPPNQIDDLET*QWGVGSSPQR''LCP2_S410_NFPLPLPNKPRPPS*PAEEENSLNEEWYVSYITRPEAEAA''LCP2_S417_NFPLPLPNKPRPPSPAEEENS*LNEEWYVSYITRPEAEAA''LCP2_T428_NFPLPLPNKPRPPSPAEEENSLNEEWYVSYIT*RPEAEAA''LCP2_Y426_NFPLPLPNKPRPPSPAEEENSLNEEWYVSY*ITRPEAEAA''MAP1A_S2408_SSRPDTLLS*PEQPVCPAGGSGGPPSSASPEVEAGPQG''MAP1A_S2408S2419_SSRPDTLLS*PEQPVCPAGGS*GGPPSSASPEV''MAP1A_S2408S2419S2427_SSRPDTLLS*PEQPVCPAGGS*GGPPSS''MAP1A_S2408S2427_SSRPDTLLS*PEQPVCPAGGSGGPPSSAS*PEV''MAP1A_S2425S2427_SSRPDTLLSPEQPVCPAGGSGGPPSS*AS*PEV''MAP1A_S2427_SSRPDTLLSPEQPVCPAGGSGGPPSSAS*PEVEAGPQG''MAP1A_T2042_SLQSDTPTFSYAALAGPT*VPPRPEPGPSMEPSLTPPA''MAP1A_T2058_SLQSDTPTFSYAALAGPTVPPRPEPGPSMEPSLT*PPA''MAP1B_S1396S1400S1408_VLS*PLRS*PPLIGSES*AYESFLSADD''MAP4K4_S324S326_DETEYEYS*GS*EEEEEEVPEQEGEPSSIVNVPG''MAP4K4_S326_DETEYEYSGS*EEEEEEVPEQEGEPSSIVNVPGESTLR''MAP4K4_S341S342_DETEYEYSGSEEEEEEVPEQEGEPS*S*IVNVPG''MARCKS_S118T120_EAPAEGEAAEPGS*PT*AAEGEAASAASSTSSPK''MARCKS_S118T120S132_EAPAEGEAAEPGS*PT*AAEGEAASAASS*''MARCKS_S128S131S132_EAPAEGEAAEPGSPTAAEGEAAS*AAS*S*''MARCKS_S131S132_EAPAEGEAAEPGSPTAAEGEAASAAS*S*TSSPK''MARCKS_T120S128_EAPAEGEAAEPGSPT*AAEGEAAS*AASSTSSPK''MARCKS_T120S131S132_EAPAEGEAAEPGSPT*AAEGEAASAAS*S*''MARCKS_T120S132_EAPAEGEAAEPGSPT*AAEGEAASAASS*TSSPK''MARCKS_T143S145S147_EAPAEGEAAEPGSPTAAEGEAASAASSTSS''MEAF6_S122_REPGSGTES*DTSPDFHNQENEPSQEDPEDLDGSVQGVK''MEAF6_S125_REPGSGTESDTS*PDFHNQENEPSQEDPEDLDGSVQGVK''MEAF6_S125S136_REPGSGTESDTS*PDFHNQENEPS*QEDPEDLDGS''MEAF6_S136_REPGSGTESDTSPDFHNQENEPS*QEDPEDLDGSVQGVK''MEAF6_T124S125_REPGSGTESDT*S*PDFHNQENEPSQEDPEDLDGS''MEAF6_T124S125S136_REPGSGTESDT*S*PDFHNQENEPS*QEDPE''MIA3_S1678_RGPLSQNGSFGPSPVSGGECS*PPLTVEPPVR
PLSATLN''MIA3_T1682_RGPLSQNGSFGPSPVSGGECSPPLT*VEPPVRPLSATLN''MICAL1_S786_AEGSDRGPES*PELPTPSENSMPPGLSTPTASQEGAGP''MINK1_S324S326_EETEYEYS*GS*EEEDDSHGEEGEPSSIMNVPGES''MINK1_S324S326S332_EETEYEYS*GS*EEEDDS*HGEEGEPSSIMN''MINK1_S326S332_EETEYEYSGS*EEEDDS*HGEEGEPSSIMNVPGES''MON2_S1182_SFQEILQIVSPVRDS*DKPETPPVVNVPVPVLIGPISGM''MTSS1_S653_GEHSPESPS*VGEGPQGVTSMPSSMWSGQASVNPPLPGP''MTSS1L_S639S649_RLS*LPNTAWGSPS*PEAAGYPGAGAEDEQQQLA''MYLK_S1419_AINVYGTSEPSQES*ELTTVGEKPEEPKDEVEVSDDDEK''MYLK_S1438_AINVYGTSEPSQESELTTVGEKPEEPKDEVEVS*DDDEK''NASP_S127_MENGVLGNALEGVHVEEEEGEKTEDES*LVENNDNIDEEA''NASP_T123_MENGVLGNALEGVHVEEEEGEKT*EDESLVENNDNIDEEA''NASP_T123S127_MENGVLGNALEGVHVEEEEGEKT*EDES*LVENNDN''NCBP3_S30_AEAPAGPALGLPSPEAES*GVDRGEPEPMEVEEGELEIVP''NCOR2_S2413S2420S2432_AKS*PAPGLAS*GDRPPSVSSVHS*EGD''NES_S1492S1496S1498_TALETESQDS*AEPS*GS*EEESDPVSLER''NES_S1496S1498S1502_TALETESQDSAEPS*GS*EEES*DPVSLER''NES_S1496S1498S1506_TALETESQDSAEPS*GS*EEESDPVS*LER''NES_S1496S1502S1506_TALETESQDSAEPS*GSEEES*DPVS*LER''NFIX_S288S318_S*IDDSEMESPVDDVFYPGTGRSPAAGSSQSS*GWP''NFIX_S318_SIDDSEMESPVDDVFYPGTGRSPAAGSSQSS*GWPNDVDA''NOL9_S84T90S97_RPNTATPS*PIPSPT*PASEPES*EPELESASSCH''NOS1AP_S417_SGALPVLCDPTTPKPEDLHSPPLGAGLADFAHPAGS*P''NRBP1_S436_NGIYPLTAFGLPRPQQPQQEEVTSPVVPPS*VKTPTPEP''NRBP1_T429_NGIYPLTAFGLPRPQQPQQEEVT*SPVVPPSVKTPTPEP''NRBP1_T439_NGIYPLTAFGLPRPQQPQQEEVTSPVVPPSVKT*PTPEP''NRBP1_T439T441_NGIYPLTAFGLPRPQQPQQEEVTSPVVPPSVKT*P''NRCAM_S1251S1254Y1258_KEDS*DDS*LVDY*GEGVNGQFNEDGSF''NRCAM_S1251Y1258_KEDS*DDSLVDY*GEGVNGQFNEDGSFIGQYSG''NRCAM_S1251Y1258S1271_KEDS*DDSLVDY*GEGVNGQFNEDGS*F''NRCAM_S1254S1271_KEDSDDS*LVDYGEGVNGQFNEDGS*FIGQYSG''NRCAM_S1254Y1258_KEDSDDS*LVDY*GEGVNGQFNEDGSFIGQYSG''NRCAM_S1254Y1258S1271_KEDSDDS*LVDY*GEGVNGQFNEDGS*F''PACSIN2_S356_PSSTLNVPSNPAQS*AQSQSSYNPFEDEDDTGSTVSE''PACSIN2_S359_PSSTLNVPSNPAQSAQS*QSSYNPFEDEDDTGSTVSE''PACSIN2_S359Y363_PSSTLNVPSNPAQSAQS*QSSY*NPFEDEDDTG''PACSIN2_S377_PSSTLNVPSNPAQSAQSQSSYNPFEDEDDTGSTVS*E''PAK4_S291_GAPSPGVLGPHASEPQLAPPACTPAAPAVPGPPGPRS*PQ''PALM2_T378T379_
TVTDVSTIDGNAAELVSGRPVSDT*T*EPSSPEGK''PAXBP1_T68_APGGESLLGPGPSPPSALT*PGLGAEAGGGFPGGAEPGN''PDZD2_S997_VGCYDANDASDEEEFDREGDCISLPGALPGPIRPLS*ED''PKN1_S582_SSRDPPSSPSS*LSSPIQESTAPELPSETQETPGPALCSP''PKN1_S585_SSRDPPSSPSSLSS*PIQESTAPELPSETQETPGPALCSP''PKP4_S1048S1049_SHPSLSTTNQQMSPIIQSVGSTSS*S*PALLGIR''PLCB1_S494S495_LSEQASNTYSDSS*S*MFEPSSPGAGEADTESDDD''PLCB1_S495S501_LSEQASNTYSDSSS*MFEPSS*PGAGEADTESDDD''PLCB1_S500S501_LSEQASNTYSDSSSMFEPS*S*PGAGEADTESDDD''PLCB1_S511_LSEQASNTYSDSSSMFEPSSPGAGEADTES*DDDDDDDD''PLCB1_T509S511_LSEQASNTYSDSSSMFEPSSPGAGEADT*ES*DDD''PLEKHA6_S278_VPGGGEQPAQPNGWQYHSPS*RPGSTAFPSQDGETGG''PLEKHA7_S867S871_TVPLFPHPPVPSLSTSESKPPPQPS*PPTS*PV''PPP1R3F_S233_SPPWAGAGGTGAGDPILDPGLGLGPGQASASS*PDDG''PRKCI_S544_QVVPPFKPNIS*GEFGLDNFDSQFTNEPVQLTPDDDDIV''PRKD3_S213_RLS*NVSLPGPGLSVPRPLQPEYVALPSEESHVHQEPSK''PRRC2A_S342S350_LKFS*DEEDGRDS*DEEGAEGHRDSQSASGEERP''PRRC2A_S342S363S365_LKFS*DEEDGRDSDEEGAEGHRDSQS*AS*''PRRC2A_S342S365_LKFS*DEEDGRDSDEEGAEGHRDSQSAS*GEERP''PRRC2A_S350S365_LKFSDEEDGRDS*DEEGAEGHRDSQSAS*GEERP''PRRC2C_T1498S1500S1503_TPDLSNQNSSDQANEEWET*AS*ESS*''PRRC2C_T1498S1502S1503_TPDLSNQNSSDQANEEWET*ASES*S*''PTPN23_S1576_EEPPVPEAPSSGPPSSS*LELLASLTPEAFSLDSSLR''RAPGEF2_S1277S1281S1285_QAEDTISNASSQLSS*PPTS*PQSS*''RB1_S608S612_DREGPTDHLESACPLNLPLQNNHTAADMYLS*PVRS*''RETREG2_S281S283_NAPPGGDEPLAETES*ES*EAELAGFSPVVDVK''RMDN3_S212_KDS*LDLEEEAASGASSALEAGGSSGLEDVLPLLQQADE''RMDN3_S224_KDSLDLEEEAASGAS*SALEAGGSSGLEDVLPLLQQADE''RMDN3_S224_KDSLDLEEEAASGAS*SALEAGGSSGLEDVLPLLQQADE''RMDN3_S233_KDSLDLEEEAASGASSALEAGGSS*GLEDVLPLLQQADE''RMDN3_S233_KDSLDLEEEAASGASSALEAGGSS*GLEDVLPLLQQADE''SAMD1_S427_EGGTASVATGPDSPS*PVPLPPGKPALPGADGTPFGCPP''SCRIB_T1342S1348_AFAAVPTSHPPEDAPAQPPT*PGPAAS*PEQLS''SEC62_S335S341T343_VGPGNHGTEGS*GGERHS*DT*DSDRREDDR''SEPT4_S101S102_PQAPDLYDDDLEFRPPSRPQS*S*DNQQYFCAPAP''SEPT4_S101S102_PQAPDLYDDDLEFRPPSRPQS*S*DNQQYFCAPAP''SEPT4_S102_PQAPDLYDDDLEFRPPSRPQSS*DNQQYFCAPAPLSPSA''SEPT4_S102Y107_PQAPDLYDDDLEFRPPSRPQSS*DNQQY*FCAPAP''SEPT4_S102Y107_PQAPDLYDDDLEFRPPSRPQSS*D
NQQY*FCAPAP''SEPT4_S115_PQAPDLYDDDLEFRPPSRPQSSDNQQYFCAPAPLS*PSA''SEPT4_S115_PQVPEPRPQAPDLYDDDLEFRPPSRPQSSDNQQYFCAPA''SEPT4_Y107_PQAPDLYDDDLEFRPPSRPQSSDNQQY*FCAPAPLSPSA''SEPT4_Y107_PQVPEPRPQAPDLYDDDLEFRPPSRPQSSDNQQY*FCAP''SEPT4_Y107S115_PQAPDLYDDDLEFRPPSRPQSSDNQQY*FCAPAPL''SH3BP5L_S43S44_ETPQGELRPEVVEDEVPRSPVAEEPGGGGSSS*S*''SLC39A8_S275S278_ALPAINGVTCYANPAVTEANGHIHFDNVS*VVS''SLX4_S1608S1610_EIFQYTHQTLDS*DS*EDESQSSQPLLQAPHCQT''SP110_S244S248_EDPQEMPHS*PLGS*MPEIRDNSPEPNDPEEPQEV''SP110_S248S256_EDPQEMPHSPLGS*MPEIRDNS*PEPNDPEEPQEV''SPTBN1_S2160S2161S2164_MAETVDTSEMVNGATEQRTS*S*KES*''SPTBN1_S2160S2161S2165_MAETVDTSEMVNGATEQRTS*S*KESS''SPTBN1_S2161S2164S2165_MAETVDTSEMVNGATEQRTSS*KES*S''SPTBN1_T2155S2164S2165_MAETVDTSEMVNGAT*EQRTSSKES*S''SRRM2_S377T384S387_HGGS*PQPLATT*PLS*QEPVNPPSEASPTR''STON2_S258_FPSWVTFDDNEVSCPLPPVTSPLKPNTPPS*ASVIPDVP''SUPT6H_S73S75S78_GFINDDDDEDEGEEDEGS*DS*GDS*EDDVGHK''SYMPK_T1246T1257S1259_EERSPQT*LAPVGEDAMKT*PS*PAAED''THRAP3_S289_PSPPLSSTSQMGSTLPS*GAGYQSGTHQGQFDHGSGSL''THRAP3_S310_PSPPLSSTSQMGSTLPSGAGYQSGTHQGQFDHGSGSLS''TNIK_S324S326_DETEYEYS*GS*EEEEEENDSGEPSSILNLPGESTL''TRIM28_S596_LASPS*GSTSSGLEVVAPEGTSAPGGGPGTLDDSATIC''TRIO_S1952_MALEDRPSSLLVDQGDSSSPSFNPSDNSLLSS*SSPIDE''TRIO_S1954_MALEDRPSSLLVDQGDSSSPSFNPSDNSLLSSSS*PIDE''UBE2O_S87S102S115_LIHGEDS*DSEGEEEGRGSSGCS*EAGGAGHE''VCP_S197_VVETDPSPYCIVAPDTVIHCEGEPIKREDEEES*LNEVGYD''VCP_T180_VVETDPSPYCIVAPDT*VIHCEGEPIKREDEEESLNEVGYD''VPS13D_S1712S1727_EYLSQSCPS*VSNVEYPDMPRSLPS*HMEEAP''VPS13D_S1724S1727_EYLSQSCPSVSNVEYPDMPRS*LPS*HMEEAP''VPS13D_Y1718S1727_EYLSQSCPSVSNVEY*PDMPRSLPS*HMEEAP''WNK1_S1557_VFPSEITDTVAASTAQSPGMNLSHS*ASSLSLQQAFSEL''ZSCAN18_S66S70_AFAS*PRSS*PAPPDLPTPGSAAGVQQEEPETIPE''ZSCAN18_S69S70_AFASPRS*S*PAPPDLPTPGSAAGVQQEEPETIPE''ZSCAN18_T78S81_AFASPRSSPAPPDLPT*PGS*AAGVQQEEPETIPE'
$proteomics
$transcriptomics
So it seems that there are indeed duplicated feature names in those 2 views (acetylproteomics and phosphoproteomics).
This is something that has to be fixed in the data input, I believe.
But we should also consider verifying there are no duplicates before training a model in Python, thanks for letting us know!
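A pre-training check along these lines could look like the sketch below. It is not part of mofapy2: the function names `find_duplicated_features` and `check_unique_features`, and the layout of the input (a dict mapping each view to its list of feature names), are assumptions made for illustration. It uses only the standard library.

```python
from collections import Counter


def find_duplicated_features(features_by_view):
    """Return {view: [names occurring more than once]} for views with duplicates.

    `features_by_view` maps a view name to the list of its feature names,
    one entry per feature (not one per sample).
    """
    duplicated = {}
    for view, names in features_by_view.items():
        counts = Counter(names)
        repeated = sorted(name for name, n in counts.items() if n > 1)
        if repeated:
            duplicated[view] = repeated
    return duplicated


def check_unique_features(features_by_view):
    """Raise ValueError if any view contains a repeated feature name."""
    duplicated = find_duplicated_features(features_by_view)
    if duplicated:
        details = "; ".join(f"{v}: {names}" for v, names in duplicated.items())
        raise ValueError(f"Duplicated feature names within a view: {details}")


# Toy data, not taken from the dataset in this issue
features = {
    "CNV": ["A1BG_CNV", "A2M_CNV"],
    "phospho": ["P1_S10", "P1_S10", "P2_T5"],
}
print(find_duplicated_features(features))
```

Calling `check_unique_features(features)` on the toy data raises a `ValueError` that names the offending view, which would stop the run before a model with ambiguous feature names is saved.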
Hey @gtca, yes, I will fix the data input for those two views. Thanks for your response. One peculiar thing I would like to mention: yesterday I checked for duplicate features in the Python environment for those two dataframes, and Python reported no duplicate entries, whereas the R commands you provided show some duplicated feature names. It is a little confusing what makes the difference here. Hope this helps. Thank you!
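One thing that might be worth checking, offered as a guess rather than a confirmed cause: several of the names in the R output are exactly 50 characters long (for example the `ACIN1_S169S181_...` and `GAPDH_K271_...` entries). If long names are shortened somewhere between the Python dataframe and the saved file, names that are unique in full could collide once cut. The sketch below compares duplicates before and after truncation. The `max_len=50` default and the function name are assumptions for illustration, not a documented mofapy2 setting.

```python
from collections import Counter


def duplicates_after_truncation(names, max_len=50):
    """Return truncated names shared by more than one distinct full name.

    max_len is a hypothetical cut-off chosen for illustration.
    """
    groups = {}
    for name in set(names):  # consider distinct full names only
        groups.setdefault(name[:max_len], []).append(name)
    return {short: sorted(full) for short, full in groups.items() if len(full) > 1}


# Toy names: unique in full, identical in their first 50 characters
prefix = "GENE_S1S2_" + "A" * 40  # exactly 50 characters
names = [prefix + "XYZ", prefix + "QRS", "SHORT_NAME"]

# Duplicates among the full names (what a check on the dataframe would see)
full_counts = Counter(names)
print([n for n, c in full_counts.items() if c > 1])

# Collisions that appear only once the names are cut to max_len characters
print(duplicates_after_truncation(names))
```

If the second result is non-empty while the first is empty, the duplicates come from shortening rather than from the input data itself, and using shorter unique identifiers for those features would be one way around it.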
Hi,
I have trained and saved the model using the Python notebooks. When I try to load the saved model into the R environment, I get this error. Can you please help me with this? Please let me know if I should upload any more data.
The model should have appended the suffixes if it had found any duplicate features across views.
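To illustrate what appending view suffixes means, here is a minimal sketch. It is not mofapy2's actual implementation: the function name `suffix_shared_features` and the data layout are hypothetical. A feature name that occurs in more than one view gets `_<view>` appended so that names are unique across views.

```python
from collections import Counter


def suffix_shared_features(features_by_view):
    """Append '_<view>' to feature names that occur in more than one view."""
    # Count in how many views each name appears; set() ignores within-view repeats
    views_per_name = Counter(
        name for names in features_by_view.values() for name in set(names)
    )
    return {
        view: [f"{n}_{view}" if views_per_name[n] > 1 else n for n in names]
        for view, names in features_by_view.items()
    }


# Toy data, not taken from the dataset in this issue
features = {
    "CNV": ["A1BG", "A2M"],
    "transcriptomics": ["A1BG", "TP53"],
}
print(suffix_shared_features(features))
```

Suffixing of this kind only resolves clashes across views. A name repeated within a single view stays repeated after the suffix is added, so it would not prevent a within-view duplicate error.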