andersen-lab / Freyja

Depth-weighted De-Mixing
BSD 2-Clause "Simplified" License

The purpose of providing reference genomes file, and Demixing error #216

Closed Petrichor-sudo closed 4 months ago

Petrichor-sudo commented 4 months ago

Hi, I'm new to this tool and to working with reference genomes in general, and I have several questions.

  1. Could someone please explain why a reference file is provided when running freyja variants [bamfile] --variants [variant outfile name] --depths [depths outfile name] --ref [reference.fa]? The results written by freyja demix [variants-file] [depth-file] --output [output-file] do not seem to depend on the reference file I provide.

  2. If the reference file contains multiple reference genomes, is the --refname [reference name] option required when running freyja variants [bamfile] --variants [variant outfile name] --depths [depths outfile name] --ref [reference.fa]? When I omitted --refname in that step, the subsequent demix command failed with the following error:

loop of ufunc does not support argument 0 of type Series which has no callable log2 method
Error: Demixing step failed. Returning empty data output 
  3. If --refname [target_name] is specified, does that mean that only the reference genome named 'target_name' in the reference file is used to generate the variant and depth files, and the remaining genomes are ignored?

I can provide the bam file and reference file if needed.

Thank you!

P.S. I'm using the latest version of Freyja (1.4.9).

joshuailevy commented 4 months ago

Hi @Petrichor-sudo!

  1. The --ref option is specifically for generalizations of Freyja (when a different reference was used, or when a non-SARS-CoV-2 pathogen is being analyzed). For standard SARS-CoV-2 (SC2) workflows, you shouldn't need it.

  2. Yes, if there are multiple reference genomes, you'll find that things run much faster with the --refname argument. Without it, if other organisms are also being sequenced, the output from variants may include additional information beyond SC2 mutations.

  3. Correct!
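
For anyone with a multi-genome reference file, here is a minimal Python sketch (not part of Freyja) that lists the record names in the FASTA so you can pick a value for --refname, and then assembles the freyja variants call using only the options quoted in this thread. The helper names fasta_record_names and build_variants_command are hypothetical, and the sketch assumes that --refname expects the first whitespace-delimited token of the FASTA header line, which is the usual samtools-style convention but has not been checked against Freyja itself.

```python
def fasta_record_names(fasta_path):
    """Return the name of every record in a FASTA file, in file order.

    Assumption: the record name is the first whitespace-delimited token
    after '>' on each header line (the common samtools-style convention).
    """
    names = []
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:  # skip a bare '>' line with no name
                    names.append(fields[0])
    return names


def build_variants_command(bam, ref, variants_out, depths_out, refname=None):
    """Assemble the `freyja variants` call as an argument list.

    Only the options mentioned in this thread are used. The command is
    returned, not executed, so it can be inspected before running.
    """
    cmd = [
        "freyja", "variants", bam,
        "--variants", variants_out,
        "--depths", depths_out,
        "--ref", ref,
    ]
    if refname is not None:
        # Restrict variant calling to a single genome in the reference file.
        cmd += ["--refname", refname]
    return cmd
```

To use it, call fasta_record_names("reference.fa"), choose the name that matches your target genome, and pass it as refname to build_variants_command. The resulting list can be handed to subprocess.run once you are happy with it.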

Hopefully that clarifies things a bit. If not, feel free to share your BAM and reference files and we can take a look!

Josh