Open magnusarntzen opened 11 months ago
Thanks! Great idea! MCFs are definitely easier to interpret than p-values for GSEA. That R package looks neat, but the KO calls are already made in the kegg_diamond rule so we just need the table and algorithm that links the KOs to pathways and computes the MCF, then we're there! I'll look into a way of integrating that.
Hey, the R package MetQy does not do the KO calling, so it is good that you have another program that does that for you. I use KofamScan in my pipelines, but I am sure kegg_diamond does the trick too.
MetQy takes a dataframe with semicolon-separated KOs per bin:
Bin1 K00001;K00032;K24233
Bin2 K22001;K32231
etc.
NB: these are lists of gene K-numbers, not pathway KO-numbers.
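For illustration, here is a minimal Python sketch of how per-gene KO calls could be collapsed into that one-row-per-bin layout. The input shape (a list of `(bin, KO)` pairs) and the function name `kos_per_bin` are assumptions made up for this example; I have not checked what the kegg_diamond rule actually writes out, so the parsing step would need adapting.

```python
from collections import defaultdict


def kos_per_bin(calls):
    """Collapse (bin, KO) pairs into one semicolon-separated string per bin."""
    grouped = defaultdict(set)
    for bin_id, ko in calls:
        grouped[bin_id].add(ko)
    # Sets drop repeated calls of the same KO; sorting keeps the output stable.
    return {bin_id: ";".join(sorted(kos)) for bin_id, kos in grouped.items()}


# Hypothetical input: one (bin, KO) pair per annotated gene.
calls = [
    ("Bin1", "K00001"), ("Bin1", "K00032"), ("Bin1", "K24233"),
    ("Bin2", "K22001"), ("Bin2", "K32231"), ("Bin2", "K22001"),
]

table = kos_per_bin(calls)
for bin_id, kos in table.items():
    print(bin_id, kos, sep="\t")
```

The resulting two-column table could then be written to a file and read into R as the dataframe described above.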
It takes about 10-15 minutes for 150 bins on my laptop, but it will be fast on the Threadripper, I suppose.
-M
From: Carl Mathias Kobel
Sent: Wednesday, 11 October 2023 16:29
To: cmkobel/assemblycomparator2
Cc: Magnus Øverlie Arntzen; Author
Subject: Re: [cmkobel/assemblycomparator2] Feature request (Issue #62)
Thanks! Great idea! MCFs are definitely easier to interpret than p-values for GSEA. That R package looks neat, but the KO calls are already made in the kegg_diamond rule so we just need the table and algorithm that links the KOs to pathways and computes the MCF, then we're there! I'll look into a way of solving that.
Since you asked for feedback...
What about implementing calculations of module completion factors (MCFs)? These are values between 0 and 1 indicating to what extent a bin has the genes required to complete a given reaction, e.g., 'denitrification' or 'methanogenesis'.
This can be done with the MetQy package in R (I have code if you want) and it would complement your output nicely. I attach an example output for some of my samples with 150 bins. MetQy_mcf.pdf
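To make the idea concrete, here is a rough Python sketch of a completion value computed as the fraction of module steps for which a bin carries at least one matching KO. This is a deliberate simplification and not MetQy's implementation: real KEGG module definitions include nested alternatives and protein complexes, which this ignores. The function name, the `toy_module` definition, and the placeholder KOs K99999 and K11111 are all made up for the example.

```python
def completion_fraction(bin_kos, module_steps):
    """Fraction of module steps for which the bin has at least one KO.

    bin_kos: set of KO identifiers called in the bin.
    module_steps: list of sets; each set holds the alternative KOs for one step.
    """
    if not module_steps:
        raise ValueError("module has no steps")
    satisfied = sum(1 for alternatives in module_steps if bin_kos & alternatives)
    return satisfied / len(module_steps)


# Hypothetical three-step module; any KO in a set satisfies that step.
toy_module = [{"K00001"}, {"K00032", "K99999"}, {"K11111"}]

# Bins given as semicolon-separated KO strings.
bin1 = set("K00001;K00032;K24233".split(";"))
bin2 = set("K22001;K32231".split(";"))

print(completion_fraction(bin1, toy_module))  # two of three steps covered
print(completion_fraction(bin2, toy_module))  # no steps covered
```

A value of 1 would mean every step has at least one matching KO in the bin, and 0 means none do.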