Closed bioinformaticaomicalabs closed 2 months ago
Hi!
Thanks for the kind words. The issue seems to be that call_variants
expects a JSON file with the checkpoint.
model_example_info_json = f'{checkpoint_path}/example_info.json'
model_example_shape = dv_utils.get_shape_and_channels_from_json(
model_example_info_json
)
Can you confirm that this file exists under "/output/checkpoints/ckpt-679"
? If it exists, could you paste the content of it here?
Hi!
Thanks for the kind words. The issue seems to be that
call_variants
expects a JSON file with the checkpoint.model_example_info_json = f'{checkpoint_path}/example_info.json' model_example_shape = dv_utils.get_shape_and_channels_from_json( model_example_info_json )
Can you confirm that this file exists under
"/output/checkpoints/ckpt-679"
? If it exists, could you paste the content of it here?
Hello, first, thank you very much for your help. Yes, the example_info.json exist, but do not have any content. Should it contain some information?
@bioinformaticaomicalabs , yes it should contain some information like this:
gsutil cat gs://deepvariant/models/DeepVariant/1.6.1/checkpoints/wgs/example_info.json
{"version": "1.6.0", "shape": [100, 221, 7], "channels": [1, 2, 3, 4, 5, 6, 19]}⏎
Can you check if you have example_info.json
output in your training data and validation data generation folders and if they are the same? If same then you can copy it to the directory and use it. The training loop is supposed to copy the example_info.json
from training folder to the checkpoint output directory. Not sure if it was missing in your setup.
Thank you very much, that finally resolved the issue. What happens now is that after performing the evaluation (the training, test, and validation sets are the sequencing of the NA12878 sample from the Coriell Institute sequenced three times), I’m getting low recall and precision values. For indels, recall is 0.41 and precision is 0.24, and for SNPs, recall is 0.57 and precision is 0.72. I tried the model you provided for exome sequencing, but I didn’t achieve better results (which is why I decided to create my own model). However, with typical tools (like GATK HaplotypeCaller), I get much better results (indels with 0.8 recall and 0.62 precision, and SNPs with 0.89 recall and 0.97 precision). Do you have any idea why this might be happening and any advice on how to solve it? I really believe that using a variant caller trained with my data should yield better results than GATK HaplotypeCaller
@bioinformaticaomicalabs , if you have a benchmarking bam file that you can share then I can try to run it on our default model and see if it matches your expectations. However, why training didn't work would require some more investigation on how exactly you are training and preparing your data.
Hello, the bam file es 13 GB, do you know how can i send it to you?
@bioinformaticaomicalabs if you have GCP then that would be the best way for us. Any public bucket that you currently use to share data works for us. If the data is uploaded to SRA then you can link it to me but it might take a bit longer in that case. However you prefer, please send it to shafin@google.com
@bioinformaticaomicalabs , closing this due to no activity. If you think you need further help, please reopen or send an email.
ISSUE First of all, I found DeepVariant to be a very good and innovative tool. I'm considering including it in my exome analysis pipeline. I followed the tutorial (DeepVariant worked correctly with the Complete Genomics model), and I created my own model using Genome in a Bottle samples. To do this, I sequenced the same reference sample three times to use each BAM file for training, validation, and testing. I didn't encounter any errors during the model creation process, but when I tried to test it, the process got stuck at the call_variants step.
Setup
Steps to reproduce:
Command: Create examples for trainning set
sudo docker run -v "${PWD}/input":"/input" -v "${PWD}/REF":"/ref" -v "${PWD}"/output:"/output" google/deepvariant:"1.6.1" make_examples --mode training --ref "/ref/GRCh38.p14.genome.fa" --reads "/input/26_r_groups.bam" --examples "/output/training_set.gz" --truth_variants "/ref/HG001_GRCh38_1_22_v4.2.1_benchmark_ROCHE.vcf.gz" --confident_regions "/ref/HG001_GRCh38_1_22_v4.2.1_benchmark_ROCHE.bed"
Create examples for validation setsudo docker run -v "${PWD}/input":"/input" -v "${PWD}/REF":"/ref" -v "${PWD}"/output:"/output" google/deepvariant:"1.6.1" make_examples --mode training --ref "/ref/GRCh38.p14.genome.fa" --reads "/input/27_r_groups.bam" --examples "/output/validation_set.gz" --truth_variants "/ref/HG001_GRCh38_1_22_v4.2.1_benchmark_ROCHE.vcf.gz" --confident_regions "/ref/HG001_GRCh38_1_22_v4.2.1_benchmark_ROCHE.bed"
Trainning Shufflingpython3 scripts/shuffle_tfrecords_beam.py --input_pattern_list=output/training_set.gz --output_pattern_prefix="output/training_shuffled" --output_dataset_name="26" --output_dataset_config_pbtxt="output/training.pbtxt" --job_name=shuffle-tfrecords
Validation Shufflingpython3 scripts/shuffle_tfrecords_beam.py --input_pattern_list=output/validation_set.gz --output_pattern_prefix="output/validation_shuffled" --output_dataset_name="27" --output_dataset_config_pbtxt="output/validation.pbtxt" --job_name=shuffle-tfrecords
Model trainningsudo docker run -v "${PWD}/input":"/input" -v "${PWD}/REF":"/ref" -v "${PWD}"/output:"/output" google/deepvariant:"1.6.1" train --config=/input/dv_config.py:base --config.train_dataset_pbtxt="/output/training.pbtxt" --config.tune_dataset_pbtxt="/output/validation.pbtxt" --config.num_epochs=10 --config.learning_rate=0.0001 --config.num_validation_examples=0 --strategy=mirrored --experiment_dir="/output/" --config.batch_size=512
Model testsudo docker run -v "${PWD}/input":"/input" -v "${PWD}/REF":"/ref" -v "${PWD}"/output:"/output" google/deepvariant:"1.6.1" /opt/deepvariant/bin/run_deepvariant --model_type WES --customized_model "/output/checkpoints/ckpt-679" --ref "/ref/GRCh38.p14.genome.fa" --reads "/input/33_r_groups.bam" --output_vcf "/output/33.vcf.gz" --output_gvcf "/output/33.g.vcf.gz" -intermediate_results_dir "/output/intermediate_results_dir" --num_shards 10
Error trace: ` - I0822 07:51:54.272576 127450974123840 make_examples_core.py:301] Task 17/20: Writing example info to /output/intermediate_results_dir/make_examples.tfrecord-00017-of-00020.gz.example_info.json I0822 07:51:54.272647 127450974123840 make_examples_core.py:2958] example_shape = [100, 221, 7] I0822 07:51:54.272740 127450974123840 make_examples_core.py:2959] example_channels = [1, 2, 3, 4, 5, 6, 19] I0822 07:51:54.272911 127450974123840 make_examples_core.py:301] Task 17/20: Found 17451 candidate variants I0822 07:51:54.272932 127450974123840 make_examples_core.py:301] Task 17/20: Created 18817 examples I0822 07:52:09.283522 133276175411008 make_examples_core.py:301] Task 8/20: Writing example info to /output/intermediate_results_dir/make_examples.tfrecord-00008-of-00020.gz.example_info.json I0822 07:52:09.283617 133276175411008 make_examples_core.py:2958] example_shape = [100, 221, 7] I0822 07:52:09.283712 133276175411008 make_examples_core.py:2959] example_channels = [1, 2, 3, 4, 5, 6, 19] I0822 07:52:09.283882 133276175411008 make_examples_core.py:301] Task 8/20: Found 17371 candidate variants I0822 07:52:09.283904 133276175411008 make_examples_core.py:301] Task 8/20: Created 18820 examples
real 34m15.728s user 624m43.553s sys 2m24.932s
Running the command: time /opt/deepvariant/bin/call_variants --outfile "/output/intermediate_results_dir/call_variants_output.tfrecord.gz" --examples "/output/intermediate_results_dir/make_examples.tfrecord@20.gz" --checkpoint "/output/checkpoints/ckpt-679"
/usr/local/lib/python3.8/dist-packages/tensorflow_addons/utils/tfa_eol_msg.py:23: UserWarning:
TensorFlow Addons (TFA) has ended development and introduction of new features. TFA has entered a minimal maintenance and release mode until a planned end of life in May 2024. Please modify downstream libraries to take dependencies from other repositories in our TensorFlow community (e.g. Keras, Keras-CV, and Keras-NLP).
For more information see: https://github.com/tensorflow/addons/issues/2807
warnings.warn( I0822 07:52:10.812179 127086447671104 call_variants.py:563] Total 1 writing processes started. I0822 07:52:10.813103 127086447671104 dv_utils.py:370] From /output/intermediate_results_dir/make_examples.tfrecord-00000-of-00020.gz.example_info.json: Shape of input examples: [100, 221, 7], Channels of input examples: [1, 2, 3, 4, 5, 6, 19]. I0822 07:52:10.813141 127086447671104 call_variants.py:588] Shape of input examples: [100, 221, 7] I0822 07:52:10.813338 127086447671104 call_variants.py:592] Use saved model: True Traceback (most recent call last): File "/tmp/Bazel.runfiles_5v5s5_vp/runfiles/com_google_deepvariant/deepvariant/call_variants.py", line 789, in
app.run(main)
File "/tmp/Bazel.runfiles_5v5s5_vp/runfiles/absl_py/absl/app.py", line 312, in run
_run_main(main, args)
File "/tmp/Bazel.runfiles_5v5s5_vp/runfiles/absl_py/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/tmp/Bazel.runfiles_5v5s5_vp/runfiles/com_google_deepvariant/deepvariant/call_variants.py", line 768, in main
call_variants(
File "/tmp/Bazel.runfiles_5v5s5_vp/runfiles/com_google_deepvariant/deepvariant/call_variants.py", line 598, in call_variants
model_example_shape = dv_utils.get_shape_and_channels_from_json(
File "/tmp/Bazel.runfiles_5v5s5_vp/runfiles/com_google_deepvariant/deepvariant/dv_utils.py", line 367, in get_shape_and_channels_from_json
example_info = json.load(f)
File "/usr/lib/python3.8/json/init.py", line 293, in load
return loads(fp.read(),
File "/usr/lib/python3.8/json/init.py", line 357, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.8/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Process ForkProcess-1: Traceback (most recent call last): File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap self.run() File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/tmp/Bazel.runfiles_5v5s5_vp/runfiles/com_google_deepvariant/deepvariant/call_variants.py", line 454, in post_processing item = output_queue.get(timeout=180) File "/usr/lib/python3.8/multiprocessing/queues.py", line 108, in get raise Empty _queue.Empty
real 3m2.335s user 0m7.450s sys 0m4.274s`
Does the quick start test work on your system? Yes