BrentLab / callvariants

A variant calling workflow suitable for both checking genotypes and processing data for BSA experiments
MIT License

notes on kn99 variant calling in recombinants #4

Open cmatKhan opened 7 months ago

cmatKhan commented 7 months ago
  1. region CP022323.1:983,532-1,047,672 in DA163 has a CNV
  2. This sample also has a good deal of long inserts
  3. After repeat masking, are either the CNV or the long inserts affected?
  4. With long insert reads, where are the ends (in the current sample)?
  5. make sure in GATK that soft clipped bases are not considered
  6. look into BWA settings re: soft clipping -- turn off entirely, or decrease the amount of soft clipping allowed
  7. pull out heterozygous regions between homozygous regions -- are they CNV compared to their context?
cmatKhan commented 7 months ago
  1. Centromere annotations
cmatKhan commented 7 months ago

completely remove reads with soft clipping of more than 10
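
A rough sketch of that filter, working directly on SAM text so it needs no extra libraries. Assumptions not stated above: the threshold is in bases, it applies to the total soft clipping over both ends of the read, and the names `soft_clipped_bases`, `keep_read` and `filter_sam_lines` are made up for illustration.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clipped_bases(cigar):
    """Total number of soft-clipped (S) bases in a CIGAR string."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op == "S")


def keep_read(cigar, max_soft_clip=10):
    """True if the read has at most max_soft_clip soft-clipped bases.

    Assumption: the limit applies to both ends summed, not per end.
    """
    return soft_clipped_bases(cigar) <= max_soft_clip


def filter_sam_lines(lines, max_soft_clip=10):
    """Yield header lines and alignments that pass the soft-clip filter."""
    for line in lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        cigar = line.rstrip("\n").split("\t")[5]  # CIGAR is SAM column 6
        if keep_read(cigar, max_soft_clip):
            yield line
```

Reads with CIGAR `*` (unmapped) count as zero soft clipping here and are kept; whether that is wanted is a separate decision.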

cmatKhan commented 7 months ago

inserts -- count left/right ends separately (are there regions where the left and right ends align, or are they relatively evenly distributed?)

take the insert as the region, and count frequency over the entire chromosome (so any overlapping gap adds a count)

extract large insert reads and see if there is anything unusual about those sequences
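
One possible way to do the two counts above, sketched in plain Python. Assumptions: each large insert is already available as a `(left, right)` coordinate pair on one chromosome, counts are binned into fixed-width tiles, and the function name and the 1000 bp default tile size are placeholders rather than anything decided here.

```python
from collections import Counter


def end_and_span_counts(inserts, tile_size=1000):
    """Count insert ends and insert spans per fixed-width tile.

    inserts: iterable of (left, right) positions, with left <= right.
    Returns three Counters keyed by tile index (position // tile_size):
      left_counts  - number of left ends falling in the tile
      right_counts - number of right ends falling in the tile
      span_counts  - number of inserts whose span overlaps the tile
    """
    left_counts = Counter()
    right_counts = Counter()
    span_counts = Counter()
    for left, right in inserts:
        first_tile = left // tile_size
        last_tile = right // tile_size
        left_counts[first_tile] += 1
        right_counts[last_tile] += 1
        # Every tile touched by the insert gets one count, so
        # overlapping inserts add up over the chromosome.
        for tile in range(first_tile, last_tile + 1):
            span_counts[tile] += 1
    return left_counts, right_counts, span_counts
```

Comparing `left_counts` and `right_counts` tile by tile shows whether the two ends pile up in the same places, while `span_counts` gives the frequency over the whole chromosome.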

cmatKhan commented 7 months ago
  1. reviewed more singletons after removing double counting
  2. repeatmasking
  3. heterozygous region (counts of heterozygous regions over tiles, and then also ratio of depth to surrounding regions). Make sure to include multimap reads in surrounding region calls and in the heterozygous region itself
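
A minimal sketch for item 3, assuming heterozygous call positions and a per-base depth list for one chromosome are already in hand (with depth computed so that multimapping reads are included, as noted above). The function names, the tile size and the flank width are illustrative placeholders.

```python
from collections import Counter


def het_counts_per_tile(het_positions, tile_size=1000):
    """Count heterozygous calls per fixed-width tile."""
    return Counter(pos // tile_size for pos in het_positions)


def depth_ratio(depth, start, end, flank):
    """Mean depth in [start, end) divided by mean depth of the flanks.

    depth: list of per-base depths for one chromosome.
    The flanks are up to `flank` bases on each side of the region.
    Returns None if the region or flanks are empty, or flank depth is 0.
    """
    region = depth[start:end]
    surrounding = depth[max(0, start - flank):start] + depth[end:end + flank]
    if not region or not surrounding:
        return None
    flank_mean = sum(surrounding) / len(surrounding)
    if flank_mean == 0:
        return None
    return (sum(region) / len(region)) / flank_mean
```

A ratio near 1 would suggest the heterozygous region has the same copy number as its context; a ratio near 2 would point towards a duplication, though what cutoff to use is left open.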