artic-network / fieldbioinformatics

The ARTIC field bioinformatics pipeline
MIT License

vcf_filter.py filtering parameters and align_trim.py for alternative primer scheme #114

Open kakuk9 opened 2 years ago

kakuk9 commented 2 years ago

Hi,

I am trying to modify the vcf_filter.py and align_trim.py scripts for my amplicon scheme. The target is a circular viral genome, and one amplicon spans the origin: it runs from the "end" of the genome to the "start". I would appreciate some advice, as I am not a bioinformatician.

  1. vcf_filter.py: Why is QUAL divided by the total number of reads, and why must the result be greater than or equal to 3? (I read in the Nanopolish repository that QUAL in Nanopolish is not well calibrated.) I am trying to add SupportFraction as part of the filtering.

Line 30

if qual / total_reads < 3:
    print("POS: ", v.POS, "qual / total_reads < 3")
    return False
  2. align_trim.py: Because one of my amplicons runs from the "end" of the genome to the "start" (e.g. forward primer position: 2345-2365, reverse primer position: 400-425), align_trim.py does not work with my primer BED file. Is there any way to work around this, for example by soft-clipping my primers after minimap2 instead of pre-trimming (so that Nanopolish has the primer sites for calibration)?
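
For the first question, a minimal sketch of how a SupportFraction check could sit next to the existing QUAL-per-read check. This is not the code in vcf_filter.py: the `Variant` class is a hypothetical stand-in for a parsed VCF record, the INFO keys `TotalReads` and `SupportFraction` are assumed to be present in the Nanopolish VCF, and the 0.7 cut-off is an arbitrary placeholder, not a recommended value.

```python
from dataclasses import dataclass, field


@dataclass
class Variant:
    """Hypothetical stand-in for a parsed VCF record (not the real parser class)."""
    POS: int
    QUAL: float
    INFO: dict = field(default_factory=dict)


def passes_filter(v, min_qual_per_read=3.0, min_support_fraction=0.7):
    """Return True if the variant passes both checks.

    min_qual_per_read mirrors the `qual / total_reads < 3` test quoted above.
    min_support_fraction is an assumed, illustrative threshold.
    """
    # Assumption: the VCF carries a TotalReads INFO field.
    total_reads = float(v.INFO["TotalReads"])
    if total_reads <= 0:
        return False

    # Existing check: QUAL normalised by the number of reads.
    if v.QUAL / total_reads < min_qual_per_read:
        print("POS:", v.POS, "qual / total_reads <", min_qual_per_read)
        return False

    # Added check: fraction of reads supporting the variant.
    # Assumption: the VCF carries a SupportFraction INFO field.
    if float(v.INFO["SupportFraction"]) < min_support_fraction:
        print("POS:", v.POS, "SupportFraction <", min_support_fraction)
        return False

    return True
```

Keeping both thresholds as parameters makes it easy to try different values on a known sample before settling on one.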
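
For the second question, a sketch of two generic ways to express an origin-spanning amplicon in linear coordinates: splitting it into two intervals, or rotating the reference so the amplicon no longer crosses the origin. Neither is known to be supported by align_trim.py as-is. Both function names are hypothetical, coordinates are treated as 0-based half-open (BED style), and the genome length of 3000 used in the usage note is made up for illustration.

```python
def split_circular_interval(start, end, genome_length):
    """Split an interval on a circular genome into linear pieces.

    If start < end the interval is already linear. Otherwise it wraps
    around the origin and is returned as two pieces: start..genome_length
    and 0..end.
    """
    if not (0 <= start < genome_length and 0 < end <= genome_length):
        raise ValueError("coordinates must lie within the genome")
    if start < end:
        return [(start, end)]
    return [(start, genome_length), (0, end)]


def rotate_position(pos, offset, genome_length):
    """Map a position onto a reference rotated to begin at `offset`.

    The rotated reference starts at the old position `offset`, so every
    coordinate shifts down by `offset`, wrapping modulo the genome length.
    """
    return (pos - offset) % genome_length
```

With the primer positions from the example and the assumed length of 3000, `split_circular_interval(2345, 425, 3000)` gives the two pieces 2345-3000 and 0-425. Rotating the reference by 1000 instead maps 2345 to 1345 and 425 to 2425, so the amplicon becomes an ordinary linear interval; the reference FASTA and every primer coordinate would need the same rotation, and variant positions would have to be mapped back afterwards.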

Many thanks!