Closed amy-houseman closed 1 year ago
Hi @amy-houseman
For short reads, DeepVariant does a local realignment to improve our Indel accuracy. Therefore, the reads the DeepVariant is seeing might have different positions from the original BAM.
Please see this section of the FAQ for more explanation: https://github.com/google/deepvariant/blob/r1.5/docs/FAQ.md#why-does-deepvariant-not-call-a-specific-variant-in-my-data
Hopefully this answers your question @amy-houseman . If not, feel free to reopen!
Hi Amy,
As Pi-Chuan mentioned is good. I only see a total of 14 reads supporting at position 41,570,158 of chromosome 15 -- hopefully its the same BAM file as the one used in DeepVariant.
Now regarding IGV here are a few things:
$1)
$ Under View > Preferences > Alignments set the following three:
Uncheck "Downsample reads" like this:
Check "Show mismatch bases":
Check "Label indels > threshold", and have a value of 0 (and uncheck Hide indels):
$2)
$ Under View > Preferences > Third Gen use these settings:
Here again perform the following:
Let me know if it helps.
Thanks, Paul
Hi Both, thank you so much for your replies! I really appreciate it. I adjusted the settings with your additions Paul but got the same result as I put above. Definitely the same bam and vcf sample, I just double checked. Hopefully just a realignment thing like Pi-Chuan mentioned. We may end up getting them validated just to make sure we're not throwing any potentials away! Hard choice!
Hmm, having no doubts is essential. Would you be comfortable with posting a few complete rows from the VCF file, which would include this variant and a few above and below. DeepVariant is base-specific, so it would tell us something that might be captured by a few rows.
Also could you post the whole script and command you used to run DeepVariant? You can mask out anything that is sensitive.
Just trying to be thorough we're not missing anything.
Thanks, `p
Hi Paul,
From that sample:
chr15 | 41568345 | . | C | A | 0 | RefCall | . | GT:GQ:DP:AD:VAF:PL | 0/0:24:14:12,2:0.142857:0,23,42 |
---|---|---|---|---|---|---|---|---|---|
chr15 | 41568363 | . | G | GA | 0 | RefCall | . | GT:GQ:DP:AD:VAF:PL | 0/0:35:12:10,2:0.166667:0,35,47 |
chr15 | 41570158 | . | T | C | 6.5 | PASS | . | GT:GQ:DP:AD:VAF:PL | 0/1:7:23:14,9:0.391304:5,0,46 |
chr15 | 41570603 | . | G | T | 10.6 | PASS | . | GT:GQ:DP:AD:VAF:PL | 0/1:11:31:8,23:0.741935:10,0,39 |
chr15 | 41573327 | . | G | T | 11.1 | PASS | . | GT:GQ:DP:AD:VAF:PL | 0/1:11:21:12,9:0.428571:10,0,43 |
chr15 | 41669918 | . | C | T | 0 | RefCall | . | GT:GQ:DP:AD:VAF:PL | 0/0:42:13:11,2:0.153846:0,45,43 |
chr15 | 41699117 | . | A | T | 49.2 | PASS | . | GT:GQ:DP:AD:VAF:PL | 0/1:43:96:44,52:0.541667:49,0,44 |
and from another sample with the same variant described also as passed:
chr15 | 41537032 | . | A | G | 38.5 | PASS | . | GT:GQ:DP:AD:VAF:PL | 0/1:38:76:42,34:0.447368:38,0,46 |
---|---|---|---|---|---|---|---|---|---|
chr15 | 41561123 | . | T | G | 0.1 | RefCall | . | GT:GQ:DP:AD:VAF:PL | ./.:15:19:11,3:0.157895:0,15,37 |
chr15 | 41565105 | . | T | C | 40.6 | PASS | . | GT:GQ:DP:AD:VAF:PL | 0/1:40:74:56,18:0.243243:40,0,55 |
chr15 | 41567477 | . | G | A | 0 | RefCall | . | GT:GQ:DP:AD:VAF:PL | 0/0:21:15:13,2:0.133333:0,20,39 |
chr15 | 41569024 | . | T | C | 8.9 | PASS | . | GT:GQ:DP:AD:VAF:PL | 0/1:9:12:9,3:0.25:8,0,41 |
chr15 | 41570158 | . | T | C | 6.7 | PASS | . | GT:GQ:DP:AD:VAF:PL | 0/1:7:26:19,7:0.269231:5,0,47 |
chr15 | 41570238 | . | A | G | 0 | RefCall | . | GT:GQ:DP:AD:VAF:PL | 0/0:35:14:9,3:0.214286:0,37,38 |
chr15 | 41570240 | . | G | T | 0 | RefCall | . | GT:GQ:DP:AD:VAF:PL | 0/0:32:13:10,2:0.153846:0,35,34 |
This was WES, and part of the script used was:
sed -n "${SLURM_ARRAY_TASK_ID}p" $EXOME_IDs_FILE | parallel -j 1 "singularity run -B /usr/lib/locale/:/usr/lib/locale/ containers/deepvariant_1.3.0.sif /opt/deepvariant/bin/run_deepvariant --model_type=WES \
--ref=$HG38_REFERENCE \
--reads=$PICARDMARKDUPLICATES_SORTEDBAM \
--regions=$BED_REGIONS \
--output_vcf=$OUTPUT_VCF \
--output_gvcf=$OUTPUT_GVCF \
--intermediate_results_dir=$INTERMEDIATE_RESULTS"
Hi @amy-houseman , I realize that I should also point you to this part of the FAQ: https://github.com/google/deepvariant/blob/r1.5/docs/FAQ.md#what-is-the-realigner-and-how-does-it-work which talks about how you can output the realigned BAM to examine.
Specifically, if you run make_examples with --norealign_reads
(or --realign_reads=false
), you can turn off the realigner. We don't recommend this by default, but can be useful if you just want to double check.
And, as mentioned in the FAQ section above, if you run in the region near by and add --make_examples_extra_args="emit_realigned_reads=true,realigner_diagnostics=/output/realigned_reads"
, you'll be able to get a small BAM file with the realigned reads so that you can take a closer look to confirm whether it's because of the realigner or not.
Amazing, thank you, I will try that tomorrow!
Hi Amy,
Not sure if you had a chance to check, but did the realigned reads in the BAM files generated by DeepVariant confirm the observed VCF variants?
Thanks, Paul
Hi Paul,
I got an error when running on my HPC so I'm just waiting for the IT people to get back to me, I assume its a simple fix since I have managed to get the deepvariant container to work a few months back:
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LANG = "en_GB.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LANG = "en_GB.UTF-8"
are supported and installed on your system.
I'll let you know when they reply! Amy
Hi Amy,
That's just a warning that you can ignore/fix it via binding /usr/lib/locale
, at the beginning of your Singularity command like this:
singularity run -B /usr/lib/locale/:/usr/lib/locale/
By the way, you ran into it last year in a previous issue, that didn't affect your computation in the following post:
https://github.com/google/deepvariant/issues/542
Notice the computation continued ignoring your locale settings.
Is that the only warning you see, or is the computation stuck? Also, just a minor thing, you're running DeepVariant 1.3.0, and now it is at version 1.5.0, and with the next release the plan is before the end of this year. It probably would not affect your results significantly. You can check on the changes between releases at the following page,
Thanks, Paul
Oo likely my fault! If I am adding the particular region around that variant such as "chr15:41,132,484-42,007,831"
Do I still have to provide a WES bed file? I have this as my current script and last time it seemed like the bed file was the problem..
#!/bin/bash --login
#SBATCH -J AmyHouseman_deepvariant
#SBATCH -o %x.stdout.%J.%N
#SBATCH -e %x.stderr.%J.%N
#SBATCH --ntasks=1
#SBATCH --ntasks-per-node=1
#SBATCH -p htc
#SBATCH --account=scw1581
#SBATCH --mail-type=ALL # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=HousemanA@cardiff.ac.uk # Where to send mail
#SBATCH --array=1-2
#SBATCH --mem-per-cpu=68GB
#SBATCH --qos=maxjobs500
module purge
module load parallel
module load singularity
EXOME_IDs_FILE=/scratch/c.c21087028/checking_variant_deepvariant/exome_ID_file
HG38_REFERENCE=/scratch/c.c21087028/coverage_graph_and_clincnv_files/ClinCNV/hg38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna
PICARDMARKDUPLICATES_SORTEDBAM=/scratch/c.c21087028/checking_variant_deepvariant/{}PE_markedduplicates_GCA_000001405.15_GRCh38_no_alt_analysis_set.bam
REGIONS="chr15:41,132,484-42,007,831"
OUTPUT_VCF=/scratch/c.c21087028/checking_variant_deepvariant/{}PE_output_GCA_000001405.15_GRCh38_no_alt_analysis_set_checkingvariantregion.vcf.gz
OUTPUT_GVCF=/scratch/c.c21087028/checking_variant_deepvariant/{}PE_output_GCA_000001405.15_GRCh38_no_alt_analysis_set_checkvariantregion.g.vcf.gz
INTERMEDIATE_RESULTS=/scratch/c.c21087028/checking_variant_deepvariant/{}PE_output_intermediate
# Set bash error trapping to exit on first error.
set -eu
cd /scratch/c.c21087028/
sed -n "${SLURM_ARRAY_TASK_ID}p" $EXOME_IDs_FILE | parallel -j 1 "singularity run -B /usr/lib/locale/:/usr/lib/locale/ containers/deepvariant_1.3.0.sif /opt/deepvariant/bin/run_deepvariant --model_type=WES \
--ref=$HG38_REFERENCE \
--reads=$PICARDMARKDUPLICATES_SORTEDBAM \
--regions=$REGIONS \
--make_examples_extra_args="emit_realigned_reads=true,realigner_diagnostics=/output/realigned_reads" \
--output_vcf=$OUTPUT_VCF \
--output_gvcf=$OUTPUT_GVCF \
--intermediate_results_dir=$INTERMEDIATE_RESULTS"
with the error:
***** Intermediate results will be written to /scratch/c.c21087028/checking_variant_deepvariant/E036-H-013_TAAGGCGA-ACTGCATA_L007_PE_output_intermediate in docker. ****
***** Running the command:*****
time seq 0 0 | parallel -q --halt 2 --line-buffer /opt/deepvariant/bin/make_examples --mode calling --ref "/scratch/c.c21087028/coverage_graph_and_clincnv_files/ClinCNV/hg38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna" --reads "/scratch/c.c21087028/checking_variant_deepvariant/E036-H-013_TAAGGCGA-ACTGCATA_L007_PE_markedduplicates_GCA_000001405.15_GRCh38_no_alt_analysis_set.bam" --examples "/scratch/c.c21087028/checking_variant_deepvariant/E036-H-013_TAAGGCGA-ACTGCATA_L007_PE_output_intermediate/make_examples.tfrecord@1.gz" --emit_realigned_reads --gvcf "/scratch/c.c21087028/checking_variant_deepvariant/E036-H-013_TAAGGCGA-ACTGCATA_L007_PE_output_intermediate/gvcf.tfrecord@1.gz" --realigner_diagnostics "/output/realigned_reads" --regions "chr15:41,132,484-42,007,831" --task {}
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LANG = "en_GB.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LANG = "en_GB.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
I0807 12:32:36.932506 47023237326656 genomics_reader.py:222] Reading /scratch/c.c21087028/checking_variant_deepvariant/E036-H-013_TAAGGCGA-ACTGCATA_L007_PE_markedduplicates_GCA_000001405.15_GRCh38_no_alt_analysis_set.bam with NativeSamReader
W0807 12:32:36.932703 47023237326656 make_examples_core.py:276] No non-empty sample name found in the input reads. DeepVariant will use default as the sample name. You can also provide a sample name with the --sample_name argument.
I0807 12:32:36.943572 47023237326656 make_examples_core.py:239] Preparing inputs
I0807 12:32:36.953401 47023237326656 genomics_reader.py:222] Reading /scratch/c.c21087028/checking_variant_deepvariant/E036-H-013_TAAGGCGA-ACTGCATA_L007_PE_markedduplicates_GCA_000001405.15_GRCh38_no_alt_analysis_set.bam with NativeSamReader
I0807 12:32:36.964995 47023237326656 make_examples_core.py:239] Common contigs are ['chr1', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr20', 'chr21', 'chr22', 'chrX', 'chrY', 'chrM']
I0807 12:32:36.975915 47023237326656 make_examples_core.py:239] Starting from v0.9.0, --use_ref_for_cram is default to true. If you are using CRAM input, note that we will decode CRAM using the reference you passed in with --ref
2023-08-07 12:32:36.976266: I third_party/nucleus/io/sam_reader.cc:736] Setting HTS_OPT_BLOCK_SIZE to 134217728
I0807 12:32:37.440702 47023237326656 genomics_reader.py:222] Reading /scratch/c.c21087028/checking_variant_deepvariant/E036-H-013_TAAGGCGA-ACTGCATA_L007_PE_markedduplicates_GCA_000001405.15_GRCh38_no_alt_analysis_set.bam with NativeSamReader
I0807 12:32:37.546969 47023237326656 genomics_reader.py:222] Reading /scratch/c.c21087028/checking_variant_deepvariant/E036-H-013_TAAGGCGA-ACTGCATA_L007_PE_markedduplicates_GCA_000001405.15_GRCh38_no_alt_analysis_set.bam with NativeSamReader
Traceback (most recent call last):
File "/tmp/Bazel.runfiles_mpmtqvy4/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 180, in <module>
app.run(main)
File "/tmp/Bazel.runfiles_mpmtqvy4/runfiles/absl_py/absl/app.py", line 299, in run
_run_main(main, args)
File "/tmp/Bazel.runfiles_mpmtqvy4/runfiles/absl_py/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "/tmp/Bazel.runfiles_mpmtqvy4/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 170, in main
make_examples_core.make_examples_runner(options)
File "/tmp/Bazel.runfiles_mpmtqvy4/runfiles/com_google_deepvariant/deepvariant/make_examples_core.py", line 1630, in make_examples_runner
region_processor.initialize()
File "/tmp/Bazel.runfiles_mpmtqvy4/runfiles/com_google_deepvariant/deepvariant/make_examples_core.py", line 870, in initialize
self._initialize()
File "/tmp/Bazel.runfiles_mpmtqvy4/runfiles/com_google_deepvariant/deepvariant/make_examples_core.py", line 853, in _initialize
self.realigner = realigner.Realigner(
File "/tmp/Bazel.runfiles_mpmtqvy4/runfiles/com_google_deepvariant/deepvariant/realigner/realigner.py", line 559, in __init__
self.diagnostic_logger = DiagnosticLogger(self.config.diagnostics)
File "/tmp/Bazel.runfiles_mpmtqvy4/runfiles/com_google_deepvariant/deepvariant/realigner/realigner.py", line 327, in __init__
self._csv_file = open(self._root_join(self.metrics_filename), 'w')
File "/tmp/Bazel.runfiles_mpmtqvy4/runfiles/com_google_deepvariant/deepvariant/realigner/realigner.py", line 346, in _root_join
tf.io.gfile.makedirs(subdir)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/lib/io/file_io.py", line 514, in recursive_create_dir_v2
_pywrap_file_io.RecursivelyCreateDir(compat.path_to_bytes(path))
tensorflow.python.framework.errors_impl.PermissionDeniedError: /output; Read-only file system
parallel: This job failed:
/opt/deepvariant/bin/make_examples --mode calling --ref /scratch/c.c21087028/coverage_graph_and_clincnv_files/ClinCNV/hg38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna --reads /scratch/c.c21087028/checking_variant_deepvariant/E036-H-013_TAAGGCGA-ACTGCATA_L007_PE_markedduplicates_GCA_000001405.15_GRCh38_no_alt_analysis_set.bam --examples /scratch/c.c21087028/checking_variant_deepvariant/E036-H-013_TAAGGCGA-ACTGCATA_L007_PE_output_intermediate/make_examples.tfrecord@1.gz --emit_realigned_reads --gvcf /scratch/c.c21087028/checking_variant_deepvariant/E036-H-013_TAAGGCGA-ACTGCATA_L007_PE_output_intermediate/gvcf.tfrecord@1.gz --realigner_diagnostics /output/realigned_reads --regions chr15:41,132,484-42,007,831 --task 0
real 0m3.220s
user 0m1.999s
sys 0m0.838s
This has a simple fix. Basically you need to replace /output/realigned_reads
with a location you have write-access to. For example, you can do the following commands:
Assuming you have write-access to this folder /scratch/c.c21087028/checking_variant_deepvariant
, you can create a sub-directory called realigned_reads
like this:
mkdir /scratch/c.c21087028/checking_variant_deepvariant/realigned_reads
Then in your Singularity script you have two options to pick from:
--make_examples_extra_args="emit_realigned_reads=true,realigner_diagnostics=/scratch/c.c21087028/checking_variant_deepvariant/realigned_reads" \
All other lines for Option 1 remain the same.
sed -n "${SLURM_ARRAY_TASK_ID}p" $EXOME_IDs_FILE | parallel -j 1 "singularity run -B /usr/lib/locale/:/usr/lib/locale/ -B /scratch/c.c21087028/checking_variant_deepvariant/:/output/ containers/deepvariant_1.3.0.sif /opt/deepvariant/bin/run_deepvariant --model_type=WES \
The rest of the lines can stay the same for Option 2.
Regarding a BED file, you don't need one as they are the same thing as regions -- which you already provide:
https://en.wikipedia.org/wiki/BED_(file_format)
Let me know how this runs, and if your run into any other issues.
Hope it helps, Paul
Hi Paul, that worked! I have the output files now, which would be best to look at first to see what's happening? This is the realignment bam for one of samples with the original bam below it.
Thanks! Amy
Hi Amy,
Great! IGV is fine. Now just look in your VCF file for the region in question, and try to confirm with what you see in IGV using the realigned BAMs. For example, you had the following variant the last time:
chr15 | 41570158 | . | T | C | 6.5 | PASS | . | GT:GQ:DP:AD:VAF:PL | 0/1:7:23:14,9:0.391304:5,0,46 |
---|
Do you still see that variant in your current VCF file, and do the number of variations match (or exceed because of base quality) in IGV for that position? Given the counts for the alternate allele (C) the last time, it would need to have been at least 9, but I just see 1 in your picture. Does the VCF this time also reflect 1 (or less), or does it still say 9 for the alternate allele (C) as the last time?
Basically the numbers have to confirm each other between the realigned BAM and current VCF if realignment has been performed for that region, or the original BAM and current VCF if no realignment was performed for that region. Ideally as a first pass check multiple regions to be sure, but start with our original variant (T/C at chr15:41570158) as a known starting point. If they confirm each other then that's awesome, otherwise we would need to investigate the issue deeper.
Thanks, Paul
Looking at @amy-houseman 's latest IGV in https://github.com/google/deepvariant/issues/691#issuecomment-1669163372 , it seems like the realigned BAM would see 1 read containing the alt allele C. If this is the case, I don't expect it to even trigger our candidate generation logic. (From this post, it didn't seem like @amy-houseman has specified different vsc_ thresholds)
One possibility is that the realignment BAM was generated with different region compared with when the variant calling was run, like @pgrosu mentioned above.
Hi both,
I reran the entire bed file both with and without the --make_examples_extra_args="emit_realigned_reads=true,realigner_diagnostics=/scratch/c.c21087028/checking_variant_deepvariant/realigned_reads" \
because I think the regions I used before it didn't like so just decided to do the entire file.
Both of the entries for that variant are the same:
Without realign:
chr15 | 41570158 | . | T | C | 6.5 | PASS | . | GT:GQ:DP:AD:VAF:PL | 0/1:7:23:14,9:0.391304:5,0,46 |
---|
With realign:
chr15 | 41570158 | . | T | C | 6.5 | PASS | . | GT:GQ:DP:AD:VAF:PL | 0/1:7:23:14,9:0.391304:5,0,46 |
---|
When looking at the realigned bam's it actually has the realignment directory "chr15:41570025-41570158" so using that bam with the original, I get - with the C's actually being identified this time:
Sorry for the confusion!
Hi Amy,
That makes perfect sense now, and the realigned BAM file confirms the counts within expected values produced by the VCF. Basically it will realign unless you add the --norealign_reads
(or --realign_reads=false
), as Pi-Chuan mentioned earlier.
So when you used the regular BAM file it realigned the reads -- because the above parameter is has a default value of True
-- and when you used the realigned BAM file, it didn't need to align much or at all. The parameters used are defined as follows:
realign_reads
-> If True
(the default value) then it will locally realign reads before calling variants.
emit_realigned_reads
-> This will produce realigned reads if the realigner_diagnostics
variable is also enabled.
realigner_diagnostics
-> If this variable is not empty (i.e. set with a path), then the above and the DeBruijn graph will be saved, otherwise if it is empty the realigned BAM or Graphviz (dot) files will not be saved.
There is always more that can be done if you really want to be sure, but this is fairly satisfactory. By the way, VCF files can also be loaded in IGV as well so that the comparison can be done directly.
Hope it helps, ~p
Thank you both so much, you have been more than helpful!
Hi again, sorry to be a pain! I thought I would comment under this one again since the topic is the same. I have a variant that is called in my VCF file:
However I cannot find it in the bams nor prior to realignment or after realignment (I tried with region specific and then using the entire bed region).
Top is the VCF file. Middle is the realignment bam file. Bottom is the bam file before deepvariant.
I am not sure where to go from here, thanks! Amy
Hi @amy-houseman Thank you for following up. Is it possible for you to share a small BAM that can reproduce this IGV that you're looking at? If so, I can take a look and see if I can reproduce the issue and try to understand what's going on.
Hi! Thank you for your reply! Yes I can - could I send via email or some other way? Thanks!! Amy
@amy-houseman Sure. Please don't send any sensitive data, and try to restrict to just a small region that we can reproduce. You can email to pichuan@google.com.
Expanding on my answer in https://github.com/google/deepvariant/issues/691#issuecomment-1662968609
I'll post my commands that I shared with @amy-houseman in email, in case it's useful for other users in the future.
I ran this on a small BAM file and BED file that @amy-houseman shared. I'll call them "YOUR.bam" and "YOUR.bed" below.
BIN_VERSION="1.6.0"
docker run \
-v ${PWD}:/data \
-v ${PWD}/reference:/reference \
google/deepvariant:"${BIN_VERSION}" \
/opt/deepvariant/bin/run_deepvariant \
--model_type=WES \
--ref "/reference/GRCh38_no_alt_analysis_set.fasta" \
--reads "/data/YOUR.bam" \
--output_vcf=/data/output.vcf.gz \
--num_shards=1 \
--regions "/data/YOUR.bed" \
--make_examples_extra_args="emit_realigned_reads=true,realigner_diagnostics=/data/realigned_reads"
Specifically, this step is the one that we need to look at:
time docker run \
-v ${PWD}:/data \
-v ${PWD}/reference:/reference \
google/deepvariant:1.6.0 \
/opt/deepvariant/bin/make_examples \
--mode calling \
--ref "/reference/GRCh38_no_alt_analysis_set.fasta" \
--reads "/data/YOUR.bam" \
--examples "/data/output.tfrecord.gz" \
--channels "insert_size" \
--regions "/data/YOUR.bed"
In this case, I have a few positions that I was interested in looking at.
To look at the positions, the output directories are in:
$ ls -lh realigned_reads/chr15:41562546-41562720
total 8.0K
-rw-r--r-- 1 root root 5.7K Mar 10 06:14 realigned_reads.bam
$ ls -lh realigned_reads/chr15:41570025-41570158
total 8.0K
-rw-r--r-- 1 root root 5.3K Mar 10 06:14 realigned_reads.bam
Sort and index them:
RANGE=41562546-41562720
samtools sort realigned_reads/chr15:${RANGE}/realigned_reads.bam > ${RANGE}.sorted.bam
samtools index ${RANGE}.sorted.bam
RANGE=41570025-41570158
samtools sort realigned_reads/chr15:${RANGE}/realigned_reads.bam > ${RANGE}.sorted.bam
samtools index ${RANGE}.sorted.bam
Output:
$ ls -lh *sorted.bam
-rw-rw-r-- 1 pichuan docker 7.2K Mar 10 06:36 41562546-41562720.sorted.bam
-rw-rw-r-- 1 pichuan docker 6.8K Mar 10 06:36 41570025-41570158.sorted.bam
Then, from here, I looked at these realigned BAMs and was able to see why the variants were proposed.
Hello,
This may be a naive question but I have some variants which were detected as PASS by DeepVariant - however I cannot see this variants within the bam files?
Any ideas? I likely will have to filter these variants out anyway but wanted to double check if there could be reasons causing this.
For reference: HG38 Chr15:41570158 T>C The depth is 14,9 from the VCF.
This is the region in the bam:
I know this is probably an IGV question but thought I would just ask incase!
Thanks!! Amy