google / deepvariant

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
BSD 3-Clause "New" or "Revised" License

Program always run #855

Closed liukeweiaway closed 4 months ago

liukeweiaway commented 4 months ago

Have you checked the FAQ? https://github.com/google/deepvariant/blob/r1.6.1/docs/FAQ.md: YES

Describe the issue:

  1. When the input reads (FASTQ) match the reference sequence exactly, the program never finishes; it keeps running.
  2. The reads were generated with a read simulation tool (dwgsim).

Setup

Steps to reproduce:

real 0m4.791s
user 0m11.503s
sys 0m2.085s

Running the command: time /opt/deepvariant/bin/call_variants --outfile "/tmp/tmpkcjcf0p/call_variants_output.tfrecord.gz" --examples "/tmp/tmpkcjcf0p/make_examples.tfrecord@4.gz" --checkpoint "/opt/models/wes"

/usr/local/lib/python3.8/dist-packages/tensorflow_addons/utils/tfa_eol_msg.py:23: UserWarning:

TensorFlow Addons (TFA) has ended development and introduction of new features. TFA has entered a minimal maintenance and release mode until a planned end of life in May 2024. Please modify downstream libraries to take dependencies from other repositories in our TensorFlow community (e.g. Keras, Keras-CV, and Keras-NLP).

For more information see: https://github.com/tensorflow/addons/issues/2807

warnings.warn(
I0729 14:44:41.088234 139722246891328 call_variants.py:471] Total 1 writing processes started.
W0729 14:44:41.090612 139722246891328 call_variants.py:482] Unable to read any records from /tmp/tmpkcjcf0p/make_examples.tfrecord@4.gz. Output will contain zero records.
I0729 14:44:41.091079 139722246891328 call_variants.py:623] Complete: call_variants.
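The warning above says that no records could be read from the sharded examples file. As a rough way to confirm that the make_examples shards really are empty before call_variants runs, the records can be counted directly. This is a minimal sketch, not part of DeepVariant: the helper names `expand_sharded_path` and `count_tfrecords` are hypothetical, it assumes the `@4` spec expands to files named `...-00000-of-00004.gz`, and it reads the TFRecord framing (8-byte length, 4-byte CRC, payload, 4-byte CRC) without verifying the checksums.

```python
import gzip
import struct


def expand_sharded_path(spec):
    """Expand 'name@N.ext' into N shard names 'name-0000i-of-0000N.ext'.

    Assumption: shards follow the '-00000-of-00004' naming pattern.
    A spec without '@' is returned unchanged as a single-element list.
    """
    base, at, rest = spec.partition("@")
    if not at:
        return [spec]
    count_str, dot, suffix = rest.partition(".")
    count = int(count_str)
    return [f"{base}-{i:05d}-of-{count:05d}{dot}{suffix}" for i in range(count)]


def count_tfrecords(path):
    """Count records in one gzip-compressed TFRecord file (CRCs are not checked)."""
    n = 0
    with gzip.open(path, "rb") as fh:
        while True:
            header = fh.read(12)  # 8-byte little-endian length + 4-byte length CRC
            if not header:
                break  # clean end of file
            if len(header) < 12:
                raise ValueError(f"truncated record header in {path}")
            (length,) = struct.unpack("<Q", header[:8])
            body = fh.read(length + 4)  # payload + 4-byte payload CRC
            if len(body) < length + 4:
                raise ValueError(f"truncated record body in {path}")
            n += 1
    return n
```

For example, summing `count_tfrecords(p)` over `expand_sharded_path("/tmp/tmpkcjcf0p/make_examples.tfrecord@4.gz")` would give the total number of examples; a total of zero would be consistent with the "Output will contain zero records" warning.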

Does the quick start test work on your system? yes

Any additional context: Some samples work fine, but some very similar samples keep running indefinitely.

kishwarshafin commented 4 months ago

@liukeweiaway are these human samples? It looks like the program is running fine; it's just not finding any variants. Can you please explain a bit more about what exactly your data is?

liukeweiaway commented 4 months ago

It is a human sample, and the generated reads are identical to the reference genome. When no variants are detected, the program does not exit; it keeps running until it is stopped manually.

chr6_CYP21A2.bwa.read1.fastq.gz chr6_CYP21A2.bwa.read2.fastq.gz

kishwarshafin commented 4 months ago

@liukeweiaway ,

I see. Can you please update to 1.6.1? I think you are hitting a bug in 1.6.0 that we fixed in 1.6.1.