chrisquince / STRONG

Strain Resolution ON Graphs
MIT License

Compare STRONG and DESMAN results #124

Open cifuj opened 2 years ago

cifuj commented 2 years ago

Hi, I have a question regarding the results I obtained after running the complete STRONG pipeline. I have 12 samples from the same individual. The diversity is rather low, so I have only one bin shared across all samples.

If I understand the outputs correctly, the number of strains identified by STRONG is not the same as the number identified by DESMAN. The STRONG output says that I have one haplotype from 1 MAG, while DESMAN indicates in the best_run.txt file that I have five haplotypes (4 reliable). Are the results directly comparable? For example, are the genes used by DESMAN the same core COGs that STRONG uses to identify variants in the assembly graphs? Is there something I should pay attention to in the config.yaml file before comparing these results?

Haplotype tree obtained in STRONG: haplotypes_tree.pdf
Number of strains (G) from DESMAN: Deviance.pdf

Best, Jeronimo

chrisquince commented 2 years ago

Hi, The results are comparable in that both use the variation on the core genes. The DESMAN results definitely suggest that variation is there, but it may be noisy. In general the STRONG strain numbers are more reliable, but there is a situation with a lot of strains where the cross validation used can be too conservative. This is easy to check for. Firstly, you can look at the strain numbers prior to cross validation: go into the bayespaths/Bin*/Bin*_PostUFilter directory. How many strains are there?

You can also look directly at the COG subgraphs. COG0532 is a good one, found here:

subgraphs/binmerged/Bin*/simplif/

Examine COG0532.gfa in Bandage - is there strain variation?
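As a rough complement to looking at the graph in Bandage, the sketch below counts segments and links in a GFA 1 file and reports segments where a path can fork. It is only a minimal illustration, not part of STRONG: the function name `gfa_branching` and the toy `BUBBLE` graph are made up for this example, and a fork in the graph can also come from repeats or errors, so it hints at variation rather than proving it.

```python
from collections import defaultdict

FLIP = {"+": "-", "-": "+"}


def gfa_branching(lines):
    """Return (n_segments, n_links, branching_segments) for GFA 1 lines.

    A segment is reported as branching when one of its ends has more
    than one distinct successor. Hypothetical helper, not a STRONG API.
    """
    segments = set()
    successors = defaultdict(set)
    n_links = 0
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S" and len(fields) >= 2:
            segments.add(fields[1])
        elif fields[0] == "L" and len(fields) >= 5:
            src, src_or, dst, dst_or = fields[1:5]
            n_links += 1
            successors[(src, src_or)].add((dst, dst_or))
            # A GFA link can also be walked in the reverse-complement direction.
            successors[(dst, FLIP[dst_or])].add((src, FLIP[src_or]))
    branching = sorted({name for (name, _), nxt in successors.items() if len(nxt) > 1})
    return len(segments), n_links, branching


# Toy graph (made-up data): segments 2 and 3 form a bubble between 1 and 4.
BUBBLE = [
    "S\t1\tACGT", "S\t2\tA", "S\t3\tC", "S\t4\tTTGA",
    "L\t1\t+\t2\t+\t0M", "L\t1\t+\t3\t+\t0M",
    "L\t2\t+\t4\t+\t0M", "L\t3\t+\t4\t+\t0M",
]
print(gfa_branching(BUBBLE))  # (4, 4, ['1', '4'])
```

To try it on a real subgraph, pass an open file instead of the toy list, for example `gfa_branching(open("COG0532.gfa"))`. An empty branching list means the graph is a set of unbranched paths.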

Best, Chris

cifuj commented 2 years ago

Hi, In the bayespaths/Bin_*/Bin_*_PostUFilter folder there is only 1 haplotype. However, it seems that there is some strain variation in the assembly graphs, as there is more than one path in the COG0532 assembly graph.

Attached image: bin_16_subgraphs_simplif_COG0532

COG0532 is not present in the results/Bin_*/graph/joined_SCG_graph.gfa file.
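One way to double-check whether a COG is present in a GFA file is to scan the segment (S) lines for its identifier, as in the sketch below. This assumes that segment names in the joined graph embed the COG identifier, which this thread does not confirm. The function name `segments_matching` and the demo names are hypothetical.

```python
def segments_matching(lines, token):
    """Return names of GFA segments (S lines) whose name contains token.

    Assumption: the COG identifier appears inside the segment name.
    """
    names = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S" and len(fields) >= 2 and token in fields[1]:
            names.append(fields[1])
    return names


# Made-up segment names, for illustration only.
DEMO = [
    "S\tCOG0532_1\tACGT",
    "S\tCOG0532_2\tACGA",
    "S\tCOG0016_7\tTTGA",
    "L\tCOG0532_1\t+\tCOG0532_2\t+\t0M",
]
print(segments_matching(DEMO, "COG0532"))  # ['COG0532_1', 'COG0532_2']
print(segments_matching(DEMO, "COG532"))   # []
```

Note that the search is a plain substring match, so the zero-padded form of the identifier (COG0532 rather than COG532) matters.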

Best, Jeronimo

StickHu commented 1 year ago

Hi, I am stuck running the DESMAN and STRONG protocol. Could you share the protocol used in your study? Thank you very much.