google / deepconsensus

DeepConsensus uses gap-aware sequence transformers to correct errors in Pacific Biosciences (PacBio) Circular Consensus Sequencing (CCS) data.
BSD 3-Clause "New" or "Revised" License

BrokenPipeError: [Errno 32] Broken pipe #20

Closed AhmedArslan closed 2 years ago

AhmedArslan commented 2 years ago

$time deepconsensus run --subreads_to_ccs=aligned.subreads_to_ccs.bam --ccs_fasta=HiFiCCS.fasta --checkpoint=/home/models/checkpoint-50 --output=./dc/out.fastq --batch_zmws=100 --cpus=18 >& report.log

2022-01-21 12:09:55.926395: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-01-21 12:09:55.939376: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
I0121 12:09:56.290417 139850387081024 quick_inference.py:358] Finished initialize_model.
I0121 12:09:56.290970 139850387081024 quick_inference.py:530] Model setup took 0.43245720863342285 seconds.
I0121 12:10:06.089605 139850387081024 quick_inference.py:436] Using multiprocessing: cpus is 18.
WARNING:tensorflow:From /tools/deep/lib/python3.8/site-packages/official/nlp/transformer/attention_layer.py:54: DenseEinsum.__init__ (from official.nlp.modeling.layers.dense_einsum) is deprecated and will be removed in a future version.
Instructions for updating:
DenseEinsum is deprecated. Please use tf.keras.experimental.EinsumDense layer instead.
W0121 12:10:14.684705 139850387081024 deprecation.py:341] From /deep/lib/python3.8/site-packages/official/nlp/transformer/attention_layer.py:54: DenseEinsum.__init__ (from official.nlp.modeling.layers.dense_einsum) is deprecated and will be removed in a future version.
Instructions for updating:
DenseEinsum is deprecated. Please use tf.keras.experimental.EinsumDense layer instead.
/deep/lib/python3.8/site-packages/deepconsensus/inference/quick_inference.py:242: RuntimeWarning: divide by zero encountered in log10
  quality_scores = -10 * np.log10(error_prob)
I0121 12:12:50.063338 139850387081024 quick_inference.py:492] Processed a batch of 100 ZMWs in 163.97370052337646 seconds
I0121 12:12:50.078274 139850387081024 quick_inference.py:570] Processed 100 ZMWs in 173.787042 seconds
I0121 12:15:39.261736 139850387081024 quick_inference.py:492] Processed a batch of 100 ZMWs in 161.06492471694946 seconds
. . . .
I0122 02:50:06.622056 139850387081024 quick_inference.py:492] Processed a batch of 100 ZMWs in 228.55985140800476 seconds
I0122 02:50:06.664712 139850387081024 quick_inference.py:570] Processed 26500 ZMWs in 52810.373480 seconds
Process ForkPoolWorker-4783:
Traceback (most recent call last):
  File "/anaconda3/lib/python3.8/multiprocessing/pool.py", line 131, in worker
    put((job, i, result))
  File "/anaconda3/lib/python3.8/multiprocessing/queues.py", line 368, in put
    self._writer.send_bytes(obj)
  File "/anaconda3/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/anaconda3/lib/python3.8/multiprocessing/connection.py", line 404, in _send_bytes
    self._send(header)
  File "/anaconda3/lib/python3.8/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe

MariaNattestad commented 2 years ago

The BrokenPipeError doesn't come from DeepConsensus; it's what Python raises when a connection to the file system is broken. I suggest searching for that error if you want to investigate what happened in your case. In the future, though, you probably want to parallelize DeepConsensus so the work is done in smaller chunks and you don't lose everything when a connection drops 14 hours into a run. Use the --chunk parameter in the ccs step to parallelize easily (see the sketch below).
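For context, here is a minimal sketch of such a chunked workflow, assuming PacBio's ccs tool (which accepts --chunk i/N), samtools for FASTA extraction, and actc for the subreads-to-CCS alignment. File names, the chunk count, thread counts, and the exact alignment step are illustrative and depend on your DeepConsensus version; this is not the official pipeline.

#!/bin/bash
# Illustrative chunked workflow: generate CCS reads in N independent chunks,
# then run DeepConsensus per chunk so one long run can't lose all progress.
# Each loop iteration is independent and can be submitted as a separate job.
set -euo pipefail

SUBREADS=subreads.bam   # raw PacBio subreads BAM (illustrative name)
N_CHUNKS=20             # choose based on data size and available jobs/cores
mkdir -p dc

for i in $(seq 1 "${N_CHUNKS}"); do
  # 1) CCS for this chunk only.
  ccs "${SUBREADS}" "ccs.${i}.bam" --chunk "${i}/${N_CHUNKS}" -j 8

  # 2) FASTA of the chunk's CCS reads (samtools fasta is one way to get it).
  samtools fasta "ccs.${i}.bam" > "ccs.${i}.fasta"

  # 3) Align subreads back to this chunk's CCS reads (actc is one option;
  #    adjust to whatever your DeepConsensus version documents).
  actc -j 8 "${SUBREADS}" "ccs.${i}.bam" "subreads_to_ccs.${i}.bam"

  # 4) DeepConsensus on the chunk (flags mirror the command in this issue).
  deepconsensus run \
    --subreads_to_ccs="subreads_to_ccs.${i}.bam" \
    --ccs_fasta="ccs.${i}.fasta" \
    --checkpoint=/home/models/checkpoint-50 \
    --output="dc/out.${i}.fastq" \
    --batch_zmws=100 \
    --cpus=18
done

# Concatenate per-chunk outputs once all chunks have finished.
cat dc/out.*.fastq > dc/out.fastq

Run sequentially this only limits how much work a broken connection can cost you; the real win comes from submitting the chunks as parallel jobs on a cluster.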

pichuan commented 2 years ago

Hi @AhmedArslan, this issue should now be fixed. If you're still seeing it with our latest version (v0.3.1), please feel free to reach out again!