mskcc / facets

Algorithm to implement Fraction and Copy number Estimate from Tumor/normal Sequencing.

Calling CNA from Cat genome; custom #188

Open makarov-ccf opened 1 year ago

makarov-ccf commented 1 year ago


I am trying to call CNA from the cat genome (assembly: Felis_catus_9.0 (GCA_000181335.4)). I see that for all species except human and mouse, we need to provide a GC content file. I created it with the package the author recommends. The resulting file is Felisgcpct.rda. I pass the full path to it as the ugcpct argument (ugcpct=Felisgcpct.rda), but I get an error message:

Loading required package: pctGCdata
Error in 1:nchr : result would be too long a vector
Calls: preProcSample -> counts2logROR
In addition: Warning message:
In max(out$chrom) : no non-missing arguments to max; returning -Inf
Execution halted

The genome file is attached

Thank you

veseshan commented 1 year ago

From direct messages with @makarov-ccf we found out that:

…cats have three large metacentric chromosomes (A1 to A3), four large subtelomeric chromosomes (B1 to B4), two medium-size metacentrics (C1 and C2), four small subtelomerics (D1 to D4), three small metacentrics (E1 to E3), and two small acrocentrics (F1 and F2). Ref:

Internally, facets expects chromosomes to be labeled 1:(nX-1) and "X" (nX = number of autosomes + 1). Mapping A[1-3] to 1:3, B[1-4] to 4:7, C[1-2] to 8:9, D[1-4] to 10:13, E[1-3] to 14:16 and F[1-2] to 17:18 can work.
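For anyone who wants to script this, here is a minimal Python sketch of the mapping described above. The names GROUPS, CAT_CHROMS, CAT_TO_NUM and NUM_TO_CAT are made up for illustration and are not part of facets. It only covers the 18 autosomes listed here and does not handle X.

```python
# Cat autosome groups and the number of chromosomes in each (A1-A3, B1-B4, ...).
GROUPS = [("A", 3), ("B", 4), ("C", 2), ("D", 4), ("E", 3), ("F", 2)]

# Ordered names: A1, A2, A3, B1, ..., F1, F2
CAT_CHROMS = [f"{letter}{i}" for letter, n in GROUPS for i in range(1, n + 1)]

# Forward map (cat name -> integer label) and its inverse (integer -> cat name).
CAT_TO_NUM = {name: idx for idx, name in enumerate(CAT_CHROMS, start=1)}
NUM_TO_CAT = {idx: name for name, idx in CAT_TO_NUM.items()}
```

With this, CAT_TO_NUM["B1"] gives 4 and NUM_TO_CAT[18] gives "F2", matching the ranges above.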

makarov-ccf commented 1 year ago

Thank you for your input, it worked. One thing to note: I had to edit the VCF file I downloaded and exclude all extra chromosomes, leaving only A1, A2, A3, B1, B2, B3, B4, C1, C2, D1, D2, D3, D4, E1, E2, E3, F1, F2 (in that order). Otherwise snp-pileup crashed after chromosome A3, when it encountered contigs like AANG04003642.1. Then I had to follow these steps:
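A rough Python sketch of that VCF filtering, in case it helps someone. It assumes an uncompressed, tab-separated VCF, and filter_vcf_lines is a hypothetical helper name. Header lines (including any ##contig lines for the dropped contigs) are passed through unchanged. Whether snp-pileup also needs those header lines removed was not checked here.

```python
from collections import defaultdict

# Chromosomes to keep, in the order they should appear in the output.
KEEP = ["A1", "A2", "A3", "B1", "B2", "B3", "B4", "C1", "C2",
        "D1", "D2", "D3", "D4", "E1", "E2", "E3", "F1", "F2"]


def filter_vcf_lines(lines, keep=KEEP):
    """Return header lines, then records on `keep` chromosomes grouped in `keep` order.

    Records on any other contig (e.g. AANG04003642.1) are dropped.
    The original order of records within each chromosome is preserved.
    """
    keep_set = set(keep)
    header = []
    records = defaultdict(list)
    for line in lines:
        if line.startswith("#"):
            header.append(line)
            continue
        chrom = line.split("\t", 1)[0]  # CHROM is the first VCF column
        if chrom in keep_set:
            records[chrom].append(line)
    out = list(header)
    for chrom in keep:
        out.extend(records[chrom])
    return out
```

Typical use would be to read the lines of the downloaded VCF, pass them through filter_vcf_lines, and write the result to a new file that is then given to snp-pileup.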

Replace the chromosome names as the author suggested. The resulting pileup files look like:

Chromosome,Position,Ref,Alt,File1R,File1A,File1E,File1D,File2R,File2A,File2E,File2D
1,55308,T,C,0,23,0,24,0,25,0,20
1,73611,A,T,14,40,0,0,0,22,0,0
1,73739,A,G,6,16,0,0,0,15,0,0
1,73817,G,A,5,53,0,0,0,40,0,0

Run facets as usual, following the author's instructions.

Replace the chromosome names from 1, 2, 3... back to A1, A2, A3... in the CN reports.
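The renaming in the first and last steps above could be scripted roughly as follows. This is only a sketch: relabel_csv is a hypothetical helper, it works on uncompressed CSV text, and it assumes the CN report has been exported as a CSV with the chromosome in a column named chrom. Adjust the column name to whatever your report actually uses.

```python
import csv
import io

# Cat autosome names in order, mapped to "1".."18" and back (kept as strings for CSV).
CAT_CHROMS = ["A1", "A2", "A3", "B1", "B2", "B3", "B4", "C1", "C2",
              "D1", "D2", "D3", "D4", "E1", "E2", "E3", "F1", "F2"]
CAT_TO_NUM = {name: str(i) for i, name in enumerate(CAT_CHROMS, start=1)}
NUM_TO_CAT = {num: name for name, num in CAT_TO_NUM.items()}


def relabel_csv(text, column, mapping):
    """Return CSV text with every value in `column` replaced via `mapping`.

    Raises KeyError if a value is not in `mapping`, so unexpected contigs
    are noticed instead of being passed through silently.
    """
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = mapping[row[column]]
        writer.writerow(row)
    return out.getvalue()
```

Before running facets, call relabel_csv(pileup_text, "Chromosome", CAT_TO_NUM). Afterwards, call relabel_csv(report_text, "chrom", NUM_TO_CAT) on the exported report.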

As a reminder, the GC content file that FACETS requires to call CNA in non-human species had to be generated beforehand with the package the author recommends (a one-time procedure).

veseshan commented 1 year ago

Thanks @makarov-ccf for the detailed instructions.