YichenWang1 / small_bowel

MIT License
6 stars 0 forks source link

Bugs reported when running the signature code #1

Open xieduo7 opened 1 year ago

xieduo7 commented 1 year ago

Hi @YichenWang1 ,

I am recapitulating your code to test the code in my HPC. But it reports bug:

Run citation('hdp') for citation instructions,
    and file.show(system.file('LICENSE', package='hdp')) for license details.
Registered S3 method overwritten by 'ggtree':
  method      from
  identify.gg ggfun
ggtree v3.2.1  For help: https://yulab-smu.top/treedata-book/

If you use ggtree in published research, please cite the most appropriate paper(s):

1. Guangchuang Yu. Using ggtree to visualize data on tree-like structures. Current Protocols in Bioinformatics. 2020, 69:e96. doi:10.1002/cpbi.96
2. Guangchuang Yu, Tommy Tsan-Yuk Lam, Huachen Zhu, Yi Guan. Two methods for mapping and visualizing associated data on phylogeny using ggtree. Molecular Biology and Evolution. 2018, 35(12):3041-3043. doi:10.1093/molbev/msy194
3. Guangchuang Yu, David Smith, Huachen Zhu, Yi Guan, Tommy Tsan-Yuk Lam. ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data. Methods in Ecology and Evolution. 2017, 8(1):28-36. doi:10.1111/2041-210X.12628

Loading required package: SnowballC
[[1]]
NULL

[[2]]
NULL

[[3]]
NULL

[[4]]
NULL

[[5]]
NULL

[[6]]
NULL

[[7]]
NULL

[[8]]
NULL

[[9]]
NULL

[[10]]
NULL

[[1]]
NULL

[[2]]
NULL

[[3]]
NULL

[[4]]
NULL

[[5]]
NULL

[[6]]
NULL

[[7]]
NULL

[[8]]
NULL

[[9]]
NULL

[[10]]
NULL

[[1]]
NULL

[[2]]
NULL

[[3]]
NULL

[[4]]
NULL

[[5]]
NULL

[[6]]
NULL

[[7]]
NULL

[[8]]
NULL

[[9]]
NULL

[[10]]
NULL

[1] "OK"
            SBS2       SBS3        SBS4        SBS5        SBS6       SBS7a
[1,] 5.80000e-07 0.02080832 0.042196498 0.011997600 0.000425233 0.000067000
[2,] 1.48004e-04 0.01650660 0.033297236 0.009438112 0.000524287 0.000179116
[3,] 5.23000e-05 0.00175070 0.015598705 0.001849630 0.000052000 0.000071200
[4,] 9.78000e-05 0.01220488 0.029497552 0.006608678 0.000180099 0.000248161
[5,] 2.23000e-16 0.01970788 0.006889428 0.010097980 0.000471258 0.000064900
[6,] 1.33004e-04 0.01170468 0.002839764 0.005698860 0.000289158 0.000021600
           SBS7b       SBS7c       SBS7d        SBS8        SBS9       SBS10a
[1,] 0.002329386 0.004830406 0.000040400 0.044098218 0.000557954 2.190170e-03
[2,] 0.000460879 0.001150097 0.000764883 0.047798069 0.004089661 1.770137e-03
[3,] 0.000185951 0.000377032 0.000249962 0.004619813 0.000425965 1.500120e-04
[4,] 0.000709813 0.001960165 0.004049378 0.046998101 0.003049747 1.700132e-02
[5,] 0.000008560 0.001120094 0.001179819 0.004329825 0.004839598 2.230000e-16
[6,] 0.000188950 0.000319027 0.002549609 0.002949881 0.001959837 2.430000e-05
          SBS10b      SBS10c      SBS10d       SBS11       SBS12       SBS13
[1,] 0.000181997 0.004331453 0.010113971 0.000146208 0.004520592 0.001819920
[2,] 0.006539908 0.014830220 0.018446062 0.000552786 0.001130148 0.000720968
[3,] 0.000053500 0.000657116 0.000726535 0.000094200 0.000540071 0.000263988
[4,] 0.000016300 0.013127763 0.014196655 0.000266379 0.001220160 0.000347985
[5,] 0.000304996 0.000347634 0.000128897 0.000170242 0.001220160 0.003869830
[6,] 0.000133998 0.000325131 0.000292096 0.000290413 0.000642084 0.000950958
            SBS14       SBS15       SBS16      SBS17a      SBS17b       SBS18
[1,] 1.120492e-03 0.000944202 0.015995037 0.002070261 0.000607958 0.051533859
[2,] 1.310575e-02 0.000497106 0.002899100 0.000918116 0.000128991 0.015810387
[3,] 4.131810e-04 0.000046100 0.001019684 0.000047600 0.000058200 0.002431598
[4,] 8.263628e-02 0.001110238 0.010596712 0.000061800 0.000455969 0.021414070
[5,] 2.220000e-16 0.000113024 0.001739460 0.001010128 0.000145990 0.001731137
[6,] 5.780000e-06 0.000269058 0.002009376 0.000569072 0.000043400 0.002591703
           SBS19       SBS20       SBS21      SBS22a      SBS22b       SBS23
[1,] 0.001269382 0.000619283 0.000156993 0.006046172 0.006086541 8.35523e-04
[2,] 0.000640688 0.001390636 0.002359889 0.000094500 0.003802506 3.99250e-04
[3,] 0.000245880 0.000021800 0.000293986 0.000790362 0.000980017 9.86000e-08
[4,] 0.000570722 0.001240568 0.000620971 0.001729623 0.003566008 5.61000e-18
[5,] 0.003188447 0.008774017 0.000016900 0.001669991 0.004179338 5.74000e-18
[6,] 0.001789129 0.001510691 0.000718966 0.000193197 0.002712217 4.08000e-05
           SBS24       SBS25       SBS26       SBS27       SBS28       SBS29
[1,] 0.036413400 0.009898466 0.000873118 0.005207818 0.000783905 0.063209264
[2,] 0.026509756 0.006998915 0.000528071 0.004738015 0.002529694 0.051207505
[3,] 0.014805448 0.001449775 0.000114015 0.000782672 0.000352957 0.019202815
[4,] 0.022008099 0.004969230 0.000619084 0.002718861 0.003969520 0.035605218
[5,] 0.003341230 0.008028756 0.000427058 0.001319447 0.000153981 0.003140460
[6,] 0.003671351 0.001639746 0.000378051 0.001769259 0.000223973 0.002630385
           SBS30       SBS31       SBS32       SBS33       SBS34       SBS35
[1,] 0.001799681 0.009534985 0.022297772 0.003111360 0.004867949 0.008826949
[2,] 0.000505910 0.018490274 0.018798122 0.002230975 0.006957070 0.046184032
[3,] 0.000091300 0.001659127 0.004459554 0.000414181 0.000052000 0.001389520
[4,] 0.000555901 0.006276698 0.014598542 0.001900831 0.001229482 0.021592534
[5,] 0.001069810 0.008315626 0.001989801 0.001410616 0.000761679 0.003578763
[6,] 0.000466917 0.003158339 0.002029797 0.001300568 0.001129524 0.004548428
           SBS36       SBS37       SBS38       SBS39      SBS40a      SBS40b
[1,] 0.025193777 0.003950822 0.012794869 0.011701521 0.036395305 0.014007124
[2,] 0.008317945 0.001450302 0.010095952 0.007150930 0.016772493 0.004719804
[3,] 0.002239447 0.001060221 0.001899238 0.002670347 0.003747968 0.001139655
[4,] 0.017895580 0.001850385 0.008846453 0.007400962 0.015434802 0.004952959
[5,] 0.001839546 0.034307136 0.002688922 0.047306150 0.008213432 0.007453156
[6,] 0.001669588 0.011202330 0.001579367 0.022502925 0.005620372 0.010213700
          SBS40c       SBS41       SBS42       SBS43        SBS44       SBS45
[1,] 0.003757760 0.002110193 0.001160252 0.029506295 7.680000e-18 0.009108136
[2,] 0.006283197 0.001220112 0.020604480 0.005901259 1.500380e-04 0.002849417
[3,] 0.002019124 0.000061400 0.000033400 0.000626134 9.160000e-07 0.001659660
[4,] 0.008175782 0.001330122 0.007971733 0.002070442 5.781465e-03 0.009638027
[5,] 0.008893380 0.005330488 0.000406088 0.000361077 3.180806e-03 0.003109364
[6,] 0.004322112 0.003160289 0.000506110 0.000031300 2.330590e-04 0.001929605
      SBS46       SBS47       SBS48       SBS49       SBS50       SBS51
[1,] 0.0044 0.067787870 0.000855468 0.025116633 0.119000274 0.140963913
[2,] 0.0047 0.029694686 0.000841460 0.007034659 0.127000292 0.001689567
[3,] 0.0003 0.001379753 0.000258141 0.007114712 0.005770013 0.004798772
[4,] 0.0034 0.011897871 0.000038900 0.007975282 0.083200191 0.021194574
[5,] 0.0012 0.001029816 0.000198108 0.000288191 0.003850009 0.003958986
[6,] 0.0015 0.000616890 0.000114062 0.000135089 0.001100003 0.004538838
           SBS52       SBS53       SBS54       SBS55       SBS56       SBS57
[1,] 0.015196277 0.005383510 0.002160358 0.005881127 0.012597481 0.012306325
[2,] 0.006538398 0.001961279 0.000796132 0.002050393 0.015696861 0.001430735
[3,] 0.004138986 0.038124854 0.001640271 0.000044900 0.000205959 0.003331713
[4,] 0.009237737 0.000017100 0.000414069 0.001170224 0.022995401 0.008644443
[5,] 0.001729576 0.007034586 0.001550257 0.002850546 0.000417916 0.015407920
[6,] 0.001519628 0.005463562 0.000496082 0.000168032 0.000165967 0.003131610
           SBS58       SBS59        SBS60       SBS84       SBS85       SBS86
[1,] 0.058874618 0.003588208 6.152708e-03 0.003471995 0.006080257 0.002954301
[2,] 0.006747091 0.002368817 7.793430e-04 0.005007309 0.000879937 0.003774501
[3,] 0.000824644 0.000141929 2.230000e-16 0.000452642 0.000306011 0.000385490
[4,] 0.005577595 0.014492762 4.441950e-04 0.009295947 0.002717279 0.003624001
[5,] 0.020591123 0.002938533 3.771660e-04 0.006758076 0.007234927 0.052516013
[6,] 0.001739250 0.002438782 5.882590e-04 0.010811956 0.002551355 0.078321019
           SBS87        SBS88       SBS89       SBS90       SBS91      SBS92
[1,] 0.008972551 1.000000e-18 0.032168854 0.002201650 0.002944550 0.01132934
[2,] 0.004572575 1.000000e-18 0.017693728 0.000708223 0.052997084 0.00974473
[3,] 0.006208566 1.000000e-18 0.009670834 0.000138566 0.000204200 0.00469657
[4,] 0.004957473 1.731102e-03 0.020743819 0.001755161 0.000130810 0.00775798
[5,] 0.007866457 1.000000e-18 0.014817169 0.000508073 0.000243319 0.00305586
[6,] 0.003933979 4.039238e-03 0.005883213 0.000277130 0.000124574 0.00168986
           SBS93       SBS94       SBS95       SBS96       SBS97       SBS98
[1,] 0.011573228 0.015579670 0.038408380 0.011430605 0.008159180 0.008804153
[2,] 0.008096344 0.024746217 0.017383837 0.007770735 0.007512896 0.006768714
[3,] 0.001760621 0.001574271 0.008360147 0.001997198 0.001374676 0.032959350
[4,] 0.008420819 0.011076145 0.023294307 0.011809258 0.006858397 0.003758009
[5,] 0.008856798 0.007003872 0.003616709 0.009294594 0.002346944 0.006896898
[6,] 0.005418101 0.003990091 0.002806617 0.002532081 0.002221671 0.003265122
           SBS99
[1,] 0.000003820
[2,] 0.000006960
[3,] 0.000051400
[4,] 0.000030700
[5,] 0.000446027
[6,] 0.012173080
 [1] "0[A[C>A]A]0" "0[A[C>A]C]0" "0[A[C>A]G]0" "0[A[C>A]T]0" "0[A[C>G]A]0"
 [6] "0[A[C>G]C]0" "0[A[C>G]G]0" "0[A[C>G]T]0" "0[A[C>T]A]0" "0[A[C>T]C]0"
[11] "0[A[C>T]G]3" "0[A[C>T]T]0" "0[A[T>A]A]0" "0[A[T>A]C]0" "0[A[T>A]G]0"
[16] "0[A[T>A]T]0" "0[A[T>C]A]0" "0[A[T>C]C]0" "0[A[T>C]G]0" "0[A[T>C]T]0"
[21] "0[A[T>G]A]0" "0[A[T>G]C]0" "0[A[T>G]G]0" "0[A[T>G]T]0" "0[C[C>A]A]0"
[26] "0[C[C>A]C]0" "9[C[C>A]G]3" "2[C[C>A]T]2" "2[C[C>G]A]4" "7[C[C>G]C]6"
[31] "0[C[C>G]G]0" "2[C[C>G]T]2" "0[C[C>T]A]0" "0[C[C>T]C]0" "0[C[C>T]G]1"
[36] "0[C[C>T]T]0" "4[C[T>A]A]3" "0[C[T>A]C]0" "0[C[T>A]G]0" "0[C[T>A]T]0"
[41] "2[C[T>C]A]2" "0[C[T>C]C]0" "0[C[T>C]G]0" "4[C[T>C]T]2" "3[C[T>G]A]5"
[46] "0[C[T>G]C]0" "0[C[T>G]G]0" "0[C[T>G]T]0" "0[G[C>A]A]0" "0[G[C>A]C]0"
[51] "0[G[C>A]G]0" "2[G[C>A]T]2" "9[G[C>G]A]5" "0[G[C>G]C]0" "0[G[C>G]G]0"
[56] "2[G[C>G]T]2" "0[G[C>T]A]0" "9[G[C>T]C]2" "0[G[C>T]G]2" "3[G[C>T]T]8"
[61] "8[G[T>A]A]1" "0[G[T>A]C]0" "0[G[T>A]G]0" "0[G[T>A]T]0" "0[G[T>C]A]0"
[66] "0[G[T>C]C]0" "0[G[T>C]G]0" "7[G[T>C]T]1" "2[G[T>G]A]2" "2[G[T>G]C]2"
[71] "0[G[T>G]G]0" "1[G[T>G]T]4" "6[T[C>A]A]5" "0[T[C>A]C]0" "2[T[C>A]G]2"
[76] "5[T[C>A]T]8" "2[T[C>G]A]2" "0[T[C>G]C]0" "6[T[C>G]G]9" "0[T[C>G]T]0"
[81] "0[T[C>T]A]0" "3[T[C>T]C]7" "0[T[C>T]G]1" "2[T[C>T]T]2" "0[T[T>A]A]0"
[86] "2[T[T>A]C]2" "2[T[T>A]G]8" "0[T[T>A]T]0" "0[T[T>C]A]0" "0[T[T>C]C]0"
[91] "0[T[T>C]G]0" "0[T[T>C]T]0" "2[T[T>G]A]2" "5[T[T>G]C]5" "0[T[T>G]G]0"
[96] "2[T[T>G]T]2"
[1] "OK"
[1] "OK3"
[1] "OK5"
[1] 415
[1] 420
Error in dimnames(x) <- dn :
  length of 'dimnames' [1] not equal to array extent
Calls: rownames<-
Execution halted

My code:

library(hdp)
library(RColorBrewer)
library(ggtree)
library(lsa)
library(lattice)

chlist <- vector("list", 10)
for (i in 1:10){
  chlist[[i]] <- readRDS(paste0("./chlist_noprior_without_brunners_glands_",i,".rds"))
  }

mut_example_multi <- hdp_multi_chain(chlist)

lapply(chains(mut_example_multi), plot_lik, bty="L", start = 1000)

lapply(chains(mut_example_multi), plot_numcluster, bty="L")
lapply(chains(mut_example_multi), plot_data_assigned, bty="L")
mut_example_multi <- hdp_extract_components(mut_example_multi)
plot_comp_size(mut_example_multi, bty="L")
sub_vec = c("C>A","C>G","C>T","T>A","T>C","T>G")
ctx_vec = paste(rep(c("A","C","G","T"),each=4),rep(c("A","C","G","T"),times=4),sep="-")
full_vec = paste(rep(sub_vec,each=16),rep(ctx_vec,times=6),sep=",")
trinuc_context <- full_vec
group_factor <- as.factor(rep(c("C>A", "C>G", "C>T", "T>A", "T>C", "T>G"),
                              each=16))

mut_colours <- c(RColorBrewer::brewer.pal(10, 'Paired')[seq(1,10,2)], 'grey70')

plot_comp_distn(mut_example_multi, cat_names=trinuc_context,
                grouping=group_factor, col=mut_colours,
                col_nonsig="grey80", show_group_labels=TRUE)

# Load reference signatures
ref=read.csv("COSMIC_v3.4_SBS_GRCh37.txt", header=T, stringsAsFactors = F, sep='\t')
features<-lapply(1:dim(ref)[1],function(idx){
  paste(unlist(strsplit(as.character(ref[idx,2]),""))[1],"[",as.character(ref[idx,1]),"]",unlist(strsplit(as.character(ref[idx,2]),""))[3],sep = "")
})
features<-unlist(features)
ref<-ref[,-1]
ref<-ref[,-1]
rownames(ref)<-features
print('OK')
ref=as.data.frame(ref)

ref<-apply(ref,2,as.numeric)
ref[is.na(ref)]=0
print(head(ref))
print(features)

rownames(ref)<-features
mut.cols = rep(c("dodgerblue","black","red","grey70","olivedrab3","plum2"),each=16)

# Load HDP signatures
hdp_exposures=mut_example_multi@comp_dp_distn[["mean"]][2:416,]
hdp_sigs=t(mut_example_multi@comp_categ_distn[["mean"]][2:10,])
print("OK")
rownames(hdp_sigs)<-features

print("OK3")
input_for_hdp<-read.table('./sbs_mapped_to_branches.txt',check.names = F)
input_for_hdp=input_for_hdp[apply(input_for_hdp,1,sum)>50,]

print("OK5")
print(nrow(hdp_exposures))
print(nrow(input_for_hdp))
rownames(hdp_exposures)=rownames(input_for_hdp)
print("OK7")
colnames(hdp_sigs) = paste0('Signature',c(1:9))
print("OK6")
rownames(hdp_sigs) = rownames(ref)

print("OK4")
write.table(hdp_sigs,"./HDP_results/hdp_sigs.txt",quote=F)
write.table(hdp_exposures,"./HDP_results/hdp_exposures.txt",quote=F)

# Assess cosine similarities for all reference signatures
cosine_matrix=data.frame(matrix(nrow=ncol(hdp_sigs), ncol=ncol(ref)))
rownames(cosine_matrix)=colnames(hdp_sigs)
colnames(cosine_matrix)=colnames(ref)

for (n in 1:nrow(cosine_matrix)) {
  for (m in 1:ncol(cosine_matrix)) {
    cosine_matrix[n,m] <- cosine(x=hdp_sigs[,rownames(cosine_matrix)[n]],
                                 y=ref[,colnames(cosine_matrix)[m]])
  }
}

write.table(cosine_matrix, "./sigs/Cosine_similarities.txt",sep="\t",quote=F)

# Plot output
color.palette = colorRampPalette(c("white", "orange", "purple"))
levelplot(t(cosine_matrix[dim(cosine_matrix)[1]:1,]),col.regions=color.palette, aspect="fill", scales=list(x=list(rot=90)))

I the located the line of question:

rownames(hdp_exposures)=rownames(input_for_hdp)

That is because the hdp_exposures has 415 lines and input_for_hdp has 420 lines. input_for_hdp has been processed by

input_for_hdp=input_for_hdp[apply(input_for_hdp,1,sum)>50,]

And hdp_exposures has been processed by many lines.

Did I made something wrong?

Thanks!

Best, Duo

xieduo7 commented 1 year ago

After I changed this line

# Load HDP signatures
hdp_exposures=mut_example_multi@comp_dp_distn[["mean"]][2:416,]

to

# Load HDP signatures
hdp_exposures=mut_example_multi@comp_dp_distn[["mean"]][2:421,]

The code works well.