cancerit / NanoSeq

Analysis software for Nanorate Sequencing (NanoSeq) experiments
GNU Affero General Public License v3.0

Source of human genome trinucleotide frequencies #78

Closed gevro closed 11 months ago

gevro commented 11 months ago

Hi. For the calculation of ratio2genome in the results, which reference version of the human genome do the counts below come from? From my analysis, they seem closer to hg38 than to hg19.

Thanks!

  # human genome trinucleotide frequencies assumed
  genome_counts = vector()
  genome_counts[c("ACA", "ACC", "ACG", "ACT", "ATA", "ATC", "ATG", "ATT", "CCA", "CCC", "CCG", "CCT", "CTA", "CTC", "CTG", "CTT",
                  "GCA", "GCC", "GCG", "GCT", "GTA", "GTC", "GTG", "GTT", "TCA", "TCC", "TCG", "TCT", "TTA", "TTC", "TTG", "TTT")] =
                  c(115415924, 66550070, 14381094, 92058521, 117976329, 76401029, 105094288, 142651503, 105547494, 75238490,
                    15801067, 101628641, 73791042, 96335416, 115950255, 114180747, 82414099, 68090507, 13621251, 80004082,
                    64915540, 54055728, 86012414, 83421918, 112085858, 88336615, 12630597, 126566213, 119020255, 112827451,
                    108406418, 219915599)

fa8sanger commented 11 months ago

I obtained those from the deconstructSigs R package.
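
For anyone who wants to repeat the kind of comparison described in the question, here is a minimal base-R sketch. It is not NanoSeq's or deconstructSigs' code: the helper names `revcomp`, `count_trinuc32` and `freq_distance` are hypothetical, and the sketch assumes the 32 contexts are pyrimidine-centred, with purine-centred trinucleotides folded in via their reverse complement (which is what the names in `genome_counts` suggest). It counts trinucleotides in a sequence string and measures how far two sets of counts are apart as proportions.

```r
# Reverse-complement a DNA string containing only A, C, G, T.
revcomp <- function(s) {
  comp <- chartr("ACGT", "TGCA", s)
  paste(rev(strsplit(comp, "")[[1]]), collapse = "")
}

# Count overlapping trinucleotides in one sequence and fold them into
# pyrimidine-centred contexts (middle base C or T).
# Hypothetical helper, written for illustration only.
count_trinuc32 <- function(seq) {
  seq <- toupper(seq)
  n <- nchar(seq)
  if (n < 3) return(numeric(0))
  tri <- substring(seq, 1:(n - 2), 3:n)
  tri <- tri[grepl("^[ACGT]{3}$", tri)]            # drop contexts with N etc.
  purine_mid <- substr(tri, 2, 2) %in% c("A", "G")
  tri[purine_mid] <- vapply(tri[purine_mid], revcomp, character(1),
                            USE.NAMES = FALSE)
  tab <- table(tri)
  # numeric rather than integer, so genome-scale sums do not overflow
  setNames(as.numeric(tab), names(tab))
}

# Total variation distance between two named count vectors, compared as
# proportions over the contexts they share (0 means identical profiles).
freq_distance <- function(a, b) {
  ctx <- intersect(names(a), names(b))
  pa <- a[ctx] / sum(a[ctx])
  pb <- b[ctx] / sum(b[ctx])
  sum(abs(pa - pb)) / 2
}

# Toy example: ACG, CGT, GTA, TAC fold to ACG (x2) and GTA (x2).
toy <- count_trinuc32("ACGTAC")
print(toy)
# ACG GTA
#   2   2
```

With real data, one would compute such counts for each candidate assembly (for whole genomes, `Biostrings::trinucleotideFrequency` is far faster than this string-based helper) and call `freq_distance(genome_counts, counts_for_assembly)` for each; the assembly with the smaller distance is the closer match. Which chromosomes and masked regions are included will affect the result, so a close match is suggestive rather than conclusive.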

On 25 Nov 2023, at 20:03, gevro wrote the question quoted in full above.
