seqan / iGenVar

The official repository for the iGenVar project.
BSD 3-Clause "New" or "Revised" License

SNP and Indel calling #227

Closed: joergi-w closed this 1 year ago

joergi-w commented 2 years ago

The first commit extends the junction class by two members: (1) the "deleted sequence", which holds the part of the reference sequence that is deleted in the reads, and (2) the "quality", a float value that is passed to the VCF file and indicates how well a SNP or indel is supported by the read data. I added a new constructor that takes the new members alongside the existing one, because this is less invasive than editing all the existing constructor calls.
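To illustrate the "new constructor next to the existing one" approach, here is a minimal sketch. It is not the actual iGenVar code: the `Breakend` struct, the member names, and the default values (`""` and `0.0f`) are assumptions made for this example. The old-style constructor delegates to the new one, so existing call sites compile unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical, simplified breakend: a sequence name plus a position.
struct Breakend
{
    std::string seq_name;
    std::size_t position;
};

// Hypothetical, simplified junction; only the constructor pattern matters here.
class Junction
{
public:
    // Existing-style constructor: delegates to the new one with assumed defaults,
    // so none of the existing constructor calls need to be edited.
    Junction(Breakend mate1, Breakend mate2, std::string inserted_sequence)
        : Junction(std::move(mate1), std::move(mate2), std::move(inserted_sequence), "", 0.0f)
    {}

    // New constructor: additionally takes the deleted sequence and a quality value.
    Junction(Breakend mate1, Breakend mate2, std::string inserted_sequence,
             std::string deleted_sequence, float quality)
        : mate1_{std::move(mate1)}, mate2_{std::move(mate2)},
          inserted_sequence_{std::move(inserted_sequence)},
          deleted_sequence_{std::move(deleted_sequence)},
          quality_{quality}
    {
        assert(quality_ >= 0.0f); // assumption of this sketch: quality is non-negative
    }

    std::string const & inserted_sequence() const { return inserted_sequence_; }
    std::string const & deleted_sequence() const { return deleted_sequence_; }
    float quality() const { return quality_; }

private:
    Breakend mate1_;
    Breakend mate2_;
    std::string inserted_sequence_;
    std::string deleted_sequence_;
    float quality_;
};
```

With this layout, a call such as `Junction{Breakend{"chr1", 100}, Breakend{"chr1", 101}, "T", "C", 37.5f}` would describe a substitution of `C` by `T`, while the three-argument form keeps working for callers that do not know about the new members.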

The second commit adds the snp_indel method to iGenVar. It is enabled by default and is used if a genome file and short reads are provided. We use the de Bruijn graph to compute haplotype sequences from the reads in a high-activity region. The haplotypes are in turn aligned against the reference, and SNPs and indels are extracted from the alignment and written with the VCF writer. Currently, they are appended to the end of the existing VCF file; we may revisit this in the future.
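The step of extracting SNPs and indels from an alignment can be sketched as follows. This is an illustration under assumptions, not the code from this PR: it takes the two rows of an already computed pairwise alignment as plain strings with `-` as the gap character, and the names `SmallVariant` and `extract_variants` are hypothetical. Each maximal run of differing columns becomes one variant with a deleted (reference) and an inserted (haplotype) sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical record for a small variant found in an alignment.
struct SmallVariant
{
    std::size_t ref_pos;  // 0-based reference position where the variant starts
    std::string deleted;  // reference bases that are absent from the haplotype
    std::string inserted; // haplotype bases that are absent from the reference
};

// Walks over the alignment columns and reports every maximal run of
// differing columns as one variant. `ref_start` is the reference position
// of the first alignment column.
std::vector<SmallVariant> extract_variants(std::string_view ref_row,
                                           std::string_view hap_row,
                                           std::size_t ref_start = 0)
{
    assert(ref_row.size() == hap_row.size()); // both rows must have equal length

    std::vector<SmallVariant> variants;
    std::size_t ref_pos = ref_start;
    std::size_t col = 0;

    while (col < ref_row.size())
    {
        if (ref_row[col] == hap_row[col]) // identical column: nothing to report
        {
            if (ref_row[col] != '-')      // only real bases consume a reference position
                ++ref_pos;
            ++col;
            continue;
        }

        SmallVariant variant{ref_pos, "", ""};
        while (col < ref_row.size() && ref_row[col] != hap_row[col])
        {
            if (ref_row[col] != '-')      // reference base is replaced or deleted
            {
                variant.deleted += ref_row[col];
                ++ref_pos;
            }
            if (hap_row[col] != '-')      // haplotype base is a replacement or insertion
                variant.inserted += hap_row[col];
            ++col;
        }
        variants.push_back(std::move(variant));
    }
    return variants;
}
```

In this sketch a SNP has one base in both `deleted` and `inserted`, a pure deletion has an empty `inserted`, and a pure insertion has an empty `deleted`. A VCF writer would additionally need to prepend an anchor base for indels, which is omitted here.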

codecov[bot] commented 2 years ago

Codecov Report

Merging #227 (63d9733) into master (23112df) will increase coverage by 0.01%. The diff coverage is 97.17%.

@@            Coverage Diff             @@
##           master     #227      +/-   ##
==========================================
+ Coverage   98.47%   98.48%   +0.01%     
==========================================
  Files          20       20              
  Lines         985     1124     +139     
==========================================
+ Hits          970     1107     +137     
- Misses         15       17       +2     
| Impacted Files | Coverage Δ |
|---|---|
| src/variant_detection/method_enums.cpp | 100.00% <ø> (ø) |
| src/variant_detection/variant_detection.cpp | 95.12% <50.00%> (-1.52%) :arrow_down: |
| src/variant_detection/snp_indel_detection.cpp | 97.56% <97.72%> (-0.86%) :arrow_down: |
| include/structures/junction.hpp | 96.66% <100.00%> (+0.83%) :arrow_up: |
| src/iGenVar.cpp | 100.00% <100.00%> (ø) |
| src/structures/junction.cpp | 100.00% <100.00%> (+8.00%) :arrow_up: |
| src/variant_detection/variant_output.cpp | 100.00% <100.00%> (ø) |
| src/structures/debruijn_graph.cpp | 100.00% <0.00%> (+0.92%) :arrow_up: |


joergi-w commented 2 years ago

The failing CI is caused by a set-up problem with gcc 11 on Mac; the responsible script is apparently /lib/seqan3/.github/workflows/scripts/install_via_brew.sh in the SeqAn repository.