andersen-lab / Freyja

Depth-weighted De-Mixing
BSD 2-Clause "Simplified" License

Using --depthcutoff for some recent high coverage genomes #237

Closed owenssm closed 3 weeks ago

owenssm commented 2 months ago

We've recently seen quite a few samples with good genome coverage (~90% or higher) fail to receive variant calls when running freyja demix:

freyja demix --barcodes $(BARCODES) --output output/${sample}.out variants/${sample}.variants.tsv depth/${sample}.depth

To get results, we had to use the --depthcutoff option. We suspect this is related to the recent addition to the barcode file of many similar JN.1 variants that are tough to distinguish. However, when we looked at the read depth at some of the key variable JN.1 mutation positions, there seemed to be sufficient coverage to distinguish them. I'm hoping you can help me understand what is going on. Thanks!

Attachment: 74616demix.zip
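For readers who want to repeat the depth check described above, here is a minimal sketch, not part of Freyja. It assumes the depth file is headerless, tab-separated text with the position in the second column and the depth in the fourth (the layout produced by cutting the first four columns of samtools mpileup output). The function name, the positions, and the cutoff value are all hypothetical placeholders, and "below the cutoff" is this sketch's own rule rather than a statement about how Freyja applies --depthcutoff.

```python
import csv


def low_depth_positions(depth_lines, positions, cutoff):
    """Return {position: depth} for the requested positions whose depth is
    below `cutoff`. Positions missing from the input count as depth 0.

    Assumes headerless, tab-separated rows: reference, position, base, depth.
    """
    wanted = set(positions)
    depths = {}
    for row in csv.reader(depth_lines, delimiter="\t"):
        if len(row) < 4:
            continue  # skip blank or malformed rows
        pos = int(row[1])
        if pos in wanted:
            depths[pos] = int(row[3])
    return {
        pos: depths.get(pos, 0)
        for pos in sorted(wanted)
        if depths.get(pos, 0) < cutoff
    }


# Hypothetical usage: the path, positions, and cutoff are placeholders.
# with open("depth/sample.depth") as handle:
#     print(low_depth_positions(handle, [100, 200, 300], cutoff=10))
```

Running this over the positions that separate the sublineages of interest shows which of them fall under the chosen cutoff, even when overall genome coverage looks good.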

joshuailevy commented 1 month ago

Hey @owenssm,

That's interesting, thanks for sharing some example data! As you mentioned, it looks like there's a lack of coverage in the specific regions of the genome needed to distinguish some of the JN.1 sublineages. Looking at the collapsed_lineages.yml file that is returned when running demix with the --depthcutoff option, it appears that the mutations needed to differentiate JN.1 from JN.1.10 and JN.1.4 are not present. To see which mutations those are, I'd recommend checking out this vignette: https://andersen-lab.github.io/Freyja/src/wiki/lineage_barcode_extract.html

In the collapsed_lineages.yml file, this is denoted as

JN.1-like:
- JN.1
- JN.1.10
- JN.1.4

Similarly, it looks like JN.1.32 and JN.1.4.3 are indistinguishable given the available coverage, so they get grouped into a separate JN.1-like(2) group. In these cases, it's clear that you're looking at a JN.1 descendant; it's just unclear which one.
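As a rough illustration of how to read these groups programmatically, the sketch below parses text in the shape shown above (a group name ending in a colon, followed by "- lineage" entries) and builds a lookup from each lineage to its collapsed group. It is a standard-library-only approximation that assumes the file contains nothing beyond that simple shape; a real YAML parser would be more robust. The function names are hypothetical and not part of Freyja.

```python
def parse_collapsed(text):
    """Parse 'Group:' headers followed by '- member' lines into a dict
    mapping each group name to its list of member lineages."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # ignore blank lines and comments
        if line.startswith("- "):
            if current is None:
                raise ValueError(f"list item before any group: {raw!r}")
            groups[current].append(line[2:].strip().strip("'\""))
        elif line.endswith(":"):
            current = line[:-1].strip().strip("'\"")
            groups[current] = []
        else:
            raise ValueError(f"unexpected line: {raw!r}")
    return groups


def lineage_to_group(groups):
    """Invert the mapping: each member lineage points to its group name."""
    return {
        member: name
        for name, members in groups.items()
        for member in members
    }


example = """\
JN.1-like:
- JN.1
- JN.1.10
- JN.1.4
JN.1-like(2):
- JN.1.32
- JN.1.4.3
"""

lookup = lineage_to_group(parse_collapsed(example))
print(lookup["JN.1.4.3"])
```

With a lookup like this, any lineage of interest can be checked to see whether it was collapsed and which other lineages it could not be separated from at the chosen depth cutoff.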

Best, Josh