maese005 / oncoPredict


About the function 'calcPhenotype()' output R^2 is negative #4

Closed gongmeiyuan closed 1 year ago

gongmeiyuan commented 2 years ago

Hi, your R package is very interesting, but when I test it on my patient data from TCGA, the R^2 output is negative.

Here are my commands:

########step1:
setwd("E:/R.lxdata/FerrDb2")
set.seed(12345)
library(reshape2)
library(ggpubr)
th=theme(axis.text.x = element_text(angle = 45,vjust = 0.5))
dir='./drug_test/DataFiles/Training Data/'
library(oncoPredict)
library(data.table)
library(gtools)

step2:

CTRP2_Expr = readRDS(file=file.path(dir,'CTRP2_Expr (TPM, log2(x+1) Transformed).rds'))
CTRP2_Res = readRDS(file = file.path(dir,"CTRP2_Res.rds"))

load my data

testExpr <- read.table('E:/R.lxdata/FerrDb2/step4_GSEA/Dataset/TCGA_78.csv',sep = ',', header = T,row.names = 1)

testExpr <- log(as.matrix(testExpr)+1)

testExpr[1:4,1:4]

dim(testExpr)

calcPhenotype(trainingExprData = CTRP2_Expr,
              trainingPtype = CTRP2_Res,
              testExprData = testExpr,
              batchCorrect = 'standardize', # "eb" for ComBat
              powerTransformPhenotype = TRUE,
              removeLowVaryingGenes = 0.2,
              minNumSamples = 10,
              printOutput = TRUE,
              removeLowVaringGenesFrom = 'rawData',
              rsq = TRUE,
              cc = TRUE)
######################################################################

Here is part of the output file (R^2):

CIL55 BRD4132 BRD6340 BRD9876 betulinic acid gossypol chlorambucil fluorouracil cimetidine azacitidine trifluoperazine paclitaxel tamoxifen carboplatin teniposide sildenafil simvastatin parbendazole procarbazine curcumin ciclopirox epigallocatechin-3-monogallate myricetin methotrexate lovastatin valdecoxib dacarbazine prochlorperazine ifosfamide doxorubicin ouabain BRD9647 piperlongumine pyrazolanthrone C6-ceramide topotecan importazole etoposide PRIMA-1 tanespimycin blebbistatin cytochalasin B NSC95397 manumycin A mitomycin tacrolimus SB-431542 staurosporine SB-225002 cerulenin purmorphamine GSK-3 inhibitor IX dasatinib erlotinib BRD-K94991378 LBH-589 IC-87114 ciclosporin sirolimus KU-55933 sitagliptin PDMP BRD-K71935468 entinostat BEC apicidin Merck60 BRD-A94377914 FGIN-1-27 compound 1B vincristine cytarabine hydrochloride itraconazole erastin ML031 CIL56 FQI-1 BRD-K92856060 B02 BRD-K45681478 ML050 ML162 CIL41 NSC30930 CIL70 MI-1 DBeQ ML083 CID-5951923 IU1 ML311 imatinib decitabine CHIR-99021 BI-2536 BRD-K61166597 GW-843682X vandetanib sorafenib temozolomide QW-BI-011 SNX-2112 bexarotene nilotinib sunitinib bendamustine CD-437 SCH-79797 LE-135 SKI-II Platin GW-405833 AC55649 RITA LY-2183240 CD-1530 TPCA-1 NSC632839 pifithrin-mu SN-38 BRD-K80183349 BRD-K66532283 MK-2206 triazolothiadiazine BRD-K66453893 BRD-K11533227 BRD-K27224038 BRD-K14844214 BRD1835 BRD-K41597374 BRD-K63431240 BRD-K13999467 BRD-K96970199 brefeldin A AGK-2 L-685458 PF-750 triptolide nutlin-3 16-beta-bromoandrosterone PRL-3 inhibitor I SID 26681509 neuronal differentiation inducer III NSC 74859 bortezomib ABT-737 zebularine PI-103 tubastatin A SR-II-138A neopeltolide parthenolide Compound 7d-cis pevonedistat AZD7545 VER-155008 
ML203 SRT-1720 CAY10594 Compound 1541A RG-108 PAC-1 GMX-1778 gemcitabine MST-312 olaparib oligomycin A indisulam BRD-K49290616 BRD-K02492147 austocystin D SJ-172550 pandacostat BRD-K96431673 cyanoquinoline 11 tipifarnib-P2 tipifarnib-P1 Compound 23 citrate nakiterpiosin CCT036477 tamatinib myriocin YM-155 BMS-754807 BIBR-1532 neratinib BRD1812 BRD8958 serdemetan daporinad MK-1775 tivozanib Ko-143 CR-1-31B BRD-K71781559 BRD-K86535717 BRD-K48334597 ML312 BRD-K04800985 BRD-K78574327 BRD-K19103580 BRD-K30019337 BRD-K84807411 BRD-K44224150 BRD-K75293299 BRD-K64610608 BRD-K02251932 BRD-K55116708 BRD-K41334119 BRD-K34485477 BRD-K16147474 BRD-K29086754 BRD-K33199242 BRD-K52037352 BRD-K27986637 BRD-K37390332 fulvestrant HLI 373 NSC48300 PRIMA-1-Met isoevodiamine UNC0638 cucurbitacin I BRD-K50799972 BRD-K17060750 linifanib veliparib saracatinib afatinib canertinib obatoclax masitinib ZSTK474 brivanib GDC-0879 NVP-TAE684 SNS-032 PLX-4720 TGX-221 KU-0063794 SU11274 BRD-K88742110 BRD8899 EX-527 ELCPK NPC-26 "1S,3R-RSL-3" CIL55A SCH-529074 StemRegenin 1 FQI-2 ML210 darinaparsin PX-12 GANT-61 PL-DI TW-37 AZD8055 CHM-1 QS-11 ML239 fumonisin B1 JQ-1 BRD-K51490254 BRD-K85133207 CAY10576 Repligen 136 nintedanib foretinib regorafenib WZ8040 OSI-930 MGCD-265 lenvatinib YK 4-279 CAY10618 KU 0060648 narciclasine MLN2238 ISOX BRD-K29313308 UNC0321 SGX-523 erismodegib alisertib AZD6482 ruxolitinib bleomycin A2 GDC-0941 SB-525334 PHA-793887 quizartinib fingolimod pazopanib BRD-K24690302 BRD-K70511574 PLX-4032 GSK461364 BRD-K20514654 BRD-K28456706 STF-31 spautin-1 BRD-A02303741 ML320 tigecycline BRD-K34099515 PF-543 PF-573228 ceranib-2 FSC231 968 GSK4112 etomoxir HC-067047 IPR-456 leptomycin B AZ-3146 VU0155056 JQ-1:UNC0638 (2:1 mol/mol) vorinostat:carboplatin (1:1 mol/mol) serdemetan:SCH-529074 (1:1 mol/mol) selumetinib:PLX-4032 (8:1 mol/mol) sirolimus:bortezomib (250:1 mol/mol) BRD-K97651142 vorapaxar Ch-55 BRD-K79669418 BRD-K99584050 BRD-A71883111 navitoclax:birinapant (1:1 mol/mol) 
ISOX:bortezomib (250:1 mol/mol) selumetinib:GDC-0941 (4:1 mol/mol) selumetinib:tretinoin (2:1 mol/mol) selumetinib:vorinostat (8:1 mol/mol) selumetinib:BRD-A02303741 (4:1 mol/mol) tretinoin:navitoclax (4:1 mol/mol) decitabine:navitoclax (2:1 mol/mol) tretinoin:carboplatin (2:1 mol/mol) necrostatin-7 PYR-41 I-BET151 lomeguatrib tanespimycin:gemcitabine (1:1 mol/mol) docetaxel:tanespimycin (2:1 mol/mol) selumetinib:MK-2206 (8:1 mol/mol) navitoclax:PLX-4032 (1:1 mol/mol) selumetinib:piperlongumine (8:1 mol/mol) tanespimycin:bortezomib (250:1 mol/mol) navitoclax:MST-312 (1:1 mol/mol) carboplatin:etoposide (40:17 mol/mol) piperlongumine:MST-312 (1:1 mol/mol) navitoclax:gemcitabine (1:1 mol/mol) JW-480 BMS-195614 BRD-K07442505 BRD-K35604418 SB-743921 GSK525762A navitoclax:pluripotin (1:1 mol/mol) JQ-1:carboplatin (1:1 mol/mol) selumetinib:navitoclax (8:1 mol/mol) selumetinib:decitabine (4:1 mol/mol) SNX-2112:bortezomib (250:1 mol/mol) JQ-1:MK-0752 (1:1 mol/mol) BRD-A02303741:navitoclax (2:1 mol/mol) salermide:PLX-4032 (12:1 mol/mol) UNC0638:navitoclax (1:1 mol/mol) BRD-A02303741:carboplatin (1:1 mol/mol) doxorubicin:navitoclax (2:1 mol/mol) selumetinib:UNC0638 (4:1 mol/mol) alisertib:navitoclax (2:1 mol/mol) navitoclax:sorafenib (1:1 mol/mol) navitoclax:piperlongumine (1:1 mol/mol) SMER-3 NVP-ADW742 clofarabine vorinostat:navitoclax (4:1 mol/mol) XL765 ML258 SR1001 bardoxolone methyl BMS-270394 fluvastatin AZD4547 hyperforin NVP-231 MK-0752 semagacestat bosutinib temsirolimus nelarabine rigosertib PD318088 KU-60019 BIRB-796 KW-2449 RAF265 silmitasertib BRD-K03911514 BRD-K16130065 momelotinib CAL-101 LY-2157299 WP1130 OSI-027 NVP-BSK805 erlotinib:PLX-4032 (2:1 mol/mol) PIK-93 AZD7762 birinapant KX2-391 3-Cl-AHPC BRD-K58730230 BRD-K27188169 KHS101 WAY-362450 JQ-1:vorinostat (2:1 mol/mol) BRD-K13185470 crizotinib:PLX-4032 (2:1 mol/mol) BRD-K27188169:navitoclax (2:1 mol/mol) BRD-K03536150 BRD-K34222889 linsitinib ML006 Bax channel blocker tretinoin isoliquiritigenin NSC23766 
N9-isopropylolomoucine pifithrin-alpha BIX-01294 dexamethasone selumetinib ML029 CI-976 necrostatin-1 AA-COCF3 TG-101348 pluripotin BMS-536924 BRD-K09587429 tandutinib TG-100-115 crizotinib lapatinib carboplatin:UNC0638 (2:1 mol/mol) selumetinib:JQ-1 (4:1 mol/mol) decitabine:carboplatin (1:1 mol/mol) JQ-1:navitoclax (2:1 mol/mol) elocalcitol cabozantinib phloretin BRD-K26531177 BMS-345541 cyclophosphamide niclosamide gefitinib vorinostat AM-580 Mdivi-1 axitinib navitoclax Ki8751 oxaliplatin tosedostat barasertib YL54 PF-184 AT7867 PD 153035 tacedinaline NVP-BEZ235 omacetaxine mepesuccinate avrainvillamide BRD-K09344309 isonicotinohydroxamic acid NSC19630 LRRK2-IN-1 KH-CB19 abiraterone KPT185 pitstop2 R428 ibrutinib sotrastaurin A-804598 COL-3 ABT-199 ETP-46464 marinopyrrole A methylstat BYL-719 GSK2636771 JW-74 necrosulfonamide HBX-41108 palmostatin B JW-55 AT-406 salermide WZ4002 ML334 diastereomer GSK1059615 BRD-K55473186 VAF-347 BRD-M00053801 thalidomide MG-132 belinostat O-6-benzylguanine BRD-K90370028 cediranib BRD-K48477130 AZD1480 BRD-A05715709 docetaxel BRD-K99006945 tivantinib trametinib RO4929097 dinaciclib MLN2480 BRD-K51831558 istradefylline MI-2 BRD-K42260513 alvocidib CBB-1007 BRD-A86708339 PF-3758309 dabrafenib SZ4TA2 AT13387 BCL-LZH-4 skepinone-L PF-4800567 hydrochloride avicin D VX-680 BRD-K33514849 BRD-K01737880 bafilomycin A1 SR8278 GSK-J4 BRD9876:MK-1775 (4:1 mol/mol) BRD-K30748066 1 -0.17650787 -0.309287701 -0.729277784 -0.268093373 -0.141374784 -0.16851368 -0.418488065 -0.017531996 -1.226555223 -5.732989175 -0.222119011 -0.30544456 -0.341471979 -0.176932816 -0.367694136 -0.405187851 0.029778706 -6.617510818 -0.011052142 -0.115952549 -0.062722033 0.1008036 -0.007730187 -2.061570228 -0.470991861 -0.609150638 -0.102038716 -1.205140664 -0.98101842 0.292105615 -0.922150565 -0.250714601 0.102557945 -0.143860146 -0.012413989 -0.739383399 -0.143085829 0.261105079 0.098754257 -0.502888166 -0.003221467 -4.163284682 -3.720938341 0.069685169 0.14993463 
0.008766443 0.003166874 0.026329798 0.109199663 -0.117668429 0.002299201 -0.99544953 0.066207568 -0.120150462 -0.047178409 0.225021686 0.050701019 -0.310806523 0.249408197 0.01865892 -2.104799069 -2.350167125 -1.927887565 -0.041994975 -15.13606131 -0.955587765 0.15380208 -0.051384274 -0.082554157 -0.510047439 -1.481866652 0.051048065 -0.471364049 -0.147861159 -0.271286667 0.098743572 -1.633229644 -1.569077571 0.158662396 -1.170349856 -0.195231519 -0.829134338 0.088059996 0.012852903 -0.771919636 -0.44526057 0.066111746 -0.043439769 -0.009205214 -0.18271592 0.18031788 -1.399950356 -0.030497712 -0.240060404 -6.059529589 0.072709386 -0.901628045 -0.448857237 -0.022686619 -1.747116928 -0.756625385 -0.787122637 -0.010516537 0.035466142 -0.475304026 0.024859622 0.162932309 0.214681621 -0.285633898 -0.334983667 -0.021168628 -0.369648063 -0.018079871 0.082038833 -0.350082853 -3.504614241 -0.003729459 -0.199525508 -0.329460598 0.142262402 -0.371092769 -0.813702987 -0.46872269 0.400752582 -0.836135645 -0.211555722 -2.014455466 -0.323235421 -0.526918119 0.040347798 0.044600551 -0.62230271 -0.144981065 -0.076972795 -0.002148248 -2.054790852 -0.498295199 -20.35901554 0.163914329 -0.659968793 0.021912409 -1.102402646 0.044189784 -0.178832835 0.066217889 -0.042953276 -0.482816765 -0.492921731 -0.827239278 0.088303493 -0.195583356 -0.421097729 -0.255254206 -0.529401832 -0.779427644 0.086051401 -1.986808733 -0.238647278 -1.000929931 -0.05642947 -1.521486298 -3.148686562 -0.79754303 0.240515526 -0.130914686 -0.20189008 -1.348114976 -0.652498086 0.008882346 -1.017080475 0.136719671 -2.722484844 -0.153878599 -0.962651793 -0.00076885 0.0560997 -0.119995304 -0.165102662 0.064378614 -0.358090342 -0.477558186 -3.257927046 -2.333941753 -0.58349385 -0.048415714 -0.139318181 -0.123269076 -0.008128303 -2.164792154 -0.242939863 -1.760388399 0.011719448 -0.362423836 -0.149292392 -0.618747805 -0.115647865 -12.29871183 -5.143870157 -0.004149291 -0.432210225 -0.042715063 -2.601230864 -0.091150786 
-0.29553641 -0.005392546 -2.140221241 -1.157824867 -1.007044717 -0.045402092 -0.692248375 -0.158551473 -0.601657898 -0.695843966 -0.093473804 -3.016244822 -2.416078543 -36.95233262 -0.222681316 -0.825954317 -19.77844993 -0.794282515 -5.428126709 -0.33117769 -6.499371673 -0.336324051 -0.735868446 -3.684483503 -1.691804801 -0.890414221 -0.210974301 -2.566285309 0.009934183 -0.403977767 0.023833476 -0.070383821 -2.075807943 -0.735670418 -0.515452448 -1.483051092 -0.245863133 -0.134618658 -1.118489375 -0.635565087 -0.019134382 -0.441095086 -0.226880924 0.07828816 0.008585791 -0.48135828 -0.100759171 -0.92177455 0.020522763 -0.56505956 0.129844798 -0.158252594 -0.062027844 -0.17615896 -0.778034109 -0.823197555 -4.72922606 -0.626186129 -5.972310909 0.199572205 -0.334877626 -0.168244643 -0.240619415 -0.263087148 -0.142800071 -0.062072602 -0.516070327 -0.002404878 -0.08265327 0.138359252 -0.235158818 0.185460174 -0.163122081 -0.826740321 -0.359191331 -0.477092177 -0.608109659 -3.31248025 -0.861252939 -0.615204715 -0.017715595 -0.00104903 -0.147056202 -0.189309189 -0.099085257 -1.023039719 -0.832468199 -1.390651668 -0.561822023 -0.391413004 -0.084916355 -0.207793421 -0.5070958 -0.378247988 0.215234805 -6.781733871 -1.079475939 0.180268548 -4.475364364 -0.171421877 -0.413313251 0.037216981 -0.622015297 -0.437616235 0.014982192 0.168437132 -3.076921697 -2.201898258 -1.375817872 -0.007053861 -0.027836065 -0.052161694 -0.090866152 -0.753298381 -0.00768607 -3.275312348 0.068566981 -0.421123312 0.135774608 0.096431115 -3.68246607 -1.911932348 0.060325644 -4.145251067 -1.894120196 -0.392568743 -0.398387243 -0.055596244 -1.541504645 0.147773891 -0.25612408 -4.280589335 -9.069645444 -1.378988872 -0.3662683 -0.225423207 -0.191925944 0.160744812 -0.12382869 -0.04693307 -0.430690842 -3.601141271 -1.263697535 -0.977903777 -0.87604981 -1.423507964 0.121108365 -0.275055181 -1.50530771 -0.388547098 -0.121551396 -0.106388235 0.03881836 -0.950024985 0.076642493 -0.912070158 -2.446879132 
-0.58657625 -2.605248099 0.199125635 -1.737676279 -0.199941085 -0.029474421 -2.795329826 -0.283172077 -0.283387705 -2.974361018 -2.48708819 -1.576906108 -0.440772 -0.144495307 -0.326447377 0.240545486 -0.625640775 -19.81679737 -0.499793169 -0.300065141 -0.348775982 0.056528371 -0.823017343 -2.441289346 -0.008780824 -2.278343384 -0.967585504 -0.008853877 0.100350343 -0.044211358 -0.756284569 -0.976036051 0.044368601 -0.356219399 0.123536956 -3.732324305 -0.319459288 -0.008048688 -0.019468806 -0.958122557 0.141009955 -3.831889639 -1.067176256 0.037769419 -0.43372941 -0.052091294 0.092986925 -4.293221067 -0.246032255 -1.019120757 0.297410559 -0.025003463 -0.08796411 -1.020461714 -13.18625678 -0.006079317 -0.614193626 -18.32203209 -0.111674871 -0.062757627 -0.307422687 -0.809250638 0.020460267 -1.023743259 -0.032780178 -0.438809564 -0.128554506 0.013786902 0.024048706 -0.817466293 0.301741475 -0.207728144 -1.743605841 -0.045399249 -3.071201328 -0.721316099 -0.283266342 -0.18095738 -7.571217716 -1.016363897 -0.131727978 -0.322884836 -0.309700363 -3.281783642 0.099968966 -1.154460244 -0.373527417 0.157244593 -0.092864945 -2.311010263 -0.144726824 0.099611283 -0.895720072 -1.361146608 -1.953047054 -0.506260515 0.079218832 -0.304954179 -4.793421811 -0.684518623 -0.593735333 -0.349814631 -1.90980277 -2.301902525 0.019474152 0.204055626 -0.448268691 -0.484198695 -2.905491336 -0.264072344 -0.453094431 -0.01878892 -0.236741294 0.14344249 0.007624509 -1.102365259 0.041480139 0.067048719 -0.15143329 -0.136398926 0.0807961 -0.006512746 -1.227880453 -0.924442122 -0.710299681 -1.995018016 -0.427895332 -0.328384185 -0.752122886 -0.260728654 -4.100133533 -0.381677406 -0.106504464 -0.130531056 -0.111700124 -1.027570221 -0.123443026 -0.058231437 -0.022117982 -0.059926351 -0.108898258 -0.079540411 -0.08669017 -0.315274568 0.050726369 -0.902404111 -0.018865945 -0.308428114 0.184516732 -0.003222489 -0.737075683 -0.264797477 -5.150052748 0.203650881 -0.323736195 -0.337533434 -5.086181969 
-0.62706206 -0.359194556 -0.302901645 -1.92027752 -0.653687375 0.036602479 -0.595853981 -1.039693269 -0.017965859 0.055873803 -0.464950995 -0.132247496 -3.313796169 -0.031732448 -0.758903441 -0.151108016 -2.768851456 -0.456433719 0.064707928 -0.516874952 -0.228492871 -0.345016455 -2.501694782 -0.097589404 0.014003189 0.269130214 0.393109887 -27.49990054

maese005 commented 2 years ago

Hello,

Thank you for exploring our package!

First, note that our regression model is not fit by ordinary least squares (OLS). We use regularized least squares (RLS) because we have more features than samples, a setting in which OLS has no unique solution. RLS (ridge regression, for example) is similar to least squares regression but adds a constraint (regularization) on the solution: a penalty term is added to the error, which limits the size of the coefficients.
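For reference, the two objectives can be written side by side. This is a sketch in standard textbook notation, assuming a ridge (squared L2) penalty with strength $\lambda$; the symbols $X$, $y$, $\beta$, and $\lambda$ are not taken from the package source.

$$
\hat{\beta}_{\mathrm{OLS}} = \arg\min_{\beta} \lVert y - X\beta \rVert_2^2,
\qquad
\hat{\beta}_{\mathrm{ridge}} = \arg\min_{\beta} \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2,
\quad \lambda > 0
$$

With more features than samples, $X^\top X$ is singular, so the first problem has infinitely many minimizers, while $X^\top X + \lambda I$ is always invertible and the second has exactly one.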

Second, because our regression is not ordinary least squares, SST = SSE + SSR does not hold. When ordinary least squares is fit with an intercept, the residuals sum to zero, the sum of squared errors is minimized, and SST (total sum of squares) = SSE (sum of squares error) + SSR (sum of squares due to regression). This identity relies on conditions that only least squares regression guarantees, one of which is that the model includes an intercept. Our regression does not satisfy those conditions, so the identity fails. When the intercept is missing or incorrectly constrained, R^2 values can dip below 0.
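The reason the identity can fail is easier to see when the total sum of squares is expanded in full. The following is the general algebraic expansion, written with $y_i$ for observed values, $\hat{y}_i$ for predictions, and $\bar{y}$ for the observed mean (notation assumed here, not taken from the package):

$$
\underbrace{\sum_i (y_i - \bar{y})^2}_{\mathrm{SST}}
= \underbrace{\sum_i (y_i - \hat{y}_i)^2}_{\mathrm{SSE}}
+ \underbrace{\sum_i (\hat{y}_i - \bar{y})^2}_{\mathrm{SSR}}
+ 2 \sum_i (y_i - \hat{y}_i)(\hat{y}_i - \bar{y})
$$

For ordinary least squares with an intercept, the residuals sum to zero and are orthogonal to the fitted values, so the last term vanishes. For a penalized fit, a fit without an intercept, or predictions on new samples, that term is generally nonzero, and nothing prevents SSE from exceeding SST.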

In other words, a negative R^2 means the regression model fits so poorly that a constant predictor (the null model: a horizontal line at the mean of the observed data) would fit the observed data better. In that case the sum of squared errors exceeds the total sum of squares (SSE > SST) and R^2 is negative. I am using the equations R^2 = SSR/SST = 1 - (SSE/SST); the two forms agree only when SST = SSE + SSR holds, and only the second form can go negative.
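A small simulation can make this concrete. The sketch below is not oncoPredict's code: it uses simulated data, a hand-written ridge closed form with no intercept, and hypothetical helper names (`ridge_fit`, `r_squared`). It only illustrates that R^2 computed as 1 - SSE/SST drops below zero when predictions are worse than the observed mean.

```r
# Illustration only: simulated data and hypothetical helpers, base R only.
set.seed(1)

n <- 30    # samples
p <- 100   # features (more features than samples)
x_train <- matrix(rnorm(n * p), n, p)
x_test  <- matrix(rnorm(n * p), n, p)
beta    <- c(rep(0.5, 5), rep(0, p - 5))
y_train <- 10 + drop(x_train %*% beta) + rnorm(n)
y_test  <- 10 + drop(x_test  %*% beta) + rnorm(n)

# Ridge closed form, deliberately WITHOUT an intercept:
# beta_hat = (X'X + lambda * I)^(-1) X'y
ridge_fit <- function(x, y, lambda) {
  solve(crossprod(x) + lambda * diag(ncol(x)), crossprod(x, y))
}

# R^2 as 1 - SSE/SST; unlike SSR/SST, this form can be negative.
r_squared <- function(observed, predicted) {
  sse <- sum((observed - predicted)^2)
  sst <- sum((observed - mean(observed))^2)
  1 - sse / sst
}

beta_hat <- ridge_fit(x_train, y_train, lambda = 10)
pred     <- drop(x_test %*% beta_hat)

r2_no_intercept <- r_squared(y_test, pred)
# Constant predictor at the observed mean: SSE equals SST, so R^2 is 0.
r2_mean_only    <- r_squared(y_test, rep(mean(y_test), length(y_test)))

print(r2_no_intercept)
print(r2_mean_only)
```

Because the simulated response is centered near 10 and the model has no intercept, its predictions sit near 0 and miss by far more than the constant predictor does, so the first value should come out well below zero while the second is zero.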