Closed: lyrk50 closed this issue 4 years ago
Hi,
I think it might be a sign that one of the Python processes was killed. Is it possible that you ran out of memory or disk space? Have you tried to reproduce it (using --resume-from polishing)?
Mikhail
Hi Mikhail,
Thanks for your reply.
The genome size is about 4 Gb. I have tried to reproduce it (using --resume-from polishing). I get the following:
[2020-04-22 11:25:37] INFO: Running Flye polisher
[2020-04-22 11:25:37] INFO: Polishing genome (1/1)
[2020-04-22 11:26:20] INFO: Running minimap2
[2020-04-23 10:59:02] INFO: Separating alignment into bubbles
Traceback (most recent call last):
File "/he_lab/share/data/local/Flye-2.7/bin/flye", line 25, in <module>
sys.exit(main())
File "/he_lab/share/data/local/Flye-2.7/flye/main.py", line 782, in main
_run_polisher_only(args)
File "/he_lab/share/data/local/Flye-2.7/flye/main.py", line 518, in _run_polisher_only
output_progress=True)
File "/he_lab/share/data/local/Flye-2.7/flye/polishing/polish.py", line 95, in polish
bubbles_file)
File "/he_lab/share/data/local/Flye-2.7/flye/polishing/bubbles.py", line 129, in make_bubbles
raise error_queue.get()
MemoryError
The Linux server has 458 Gb of free memory and 291 Gb of buffers, and the disk space is large enough. According to flye.log, I can see: Total RAM: 2015 Gb, Available RAM: 872 Gb. What do you think the problem is? I'm looking forward to your reply.
Lynn
Yes, this output suggests that one of the threads was killed due to a memory error. What is the dataset coverage? Can you send the full log file?
If the coverage is high, you might want to downsample (maybe to 50x), or run the polisher with a reduced number of threads.
Sometimes servers have tricky limitations with respect to total memory, memory per thread, etc. This could also be the reason, given that the server has a lot of RAM.
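To rule out per-process caps on such a server, it can help to check the resource limits the Python process actually sees. This is a Linux-specific sketch using the standard resource module; the particular selection of limits to print is my own:

```python
import resource

def report_limits():
    """Print soft/hard caps that can kill worker processes even on a big-RAM server."""
    checks = {
        "RLIMIT_AS (virtual memory, bytes)": resource.RLIMIT_AS,
        "RLIMIT_DATA (data segment, bytes)": resource.RLIMIT_DATA,
        "RLIMIT_NPROC (processes/threads)": resource.RLIMIT_NPROC,
    }
    for name, rlim in checks.items():
        soft, hard = resource.getrlimit(rlim)
        fmt = lambda v: "unlimited" if v == resource.RLIM_INFINITY else str(v)
        print("%s: soft=%s hard=%s" % (name, fmt(soft), fmt(hard)))
```

If RLIMIT_AS or RLIMIT_DATA is finite and well below the machine's RAM, a scheduler or shell profile is likely capping the job.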
The dataset coverage is about 70-80x, and I ran the polisher with 40 threads. The following is the full flye.log:
[2020-04-22 11:25:37] root: INFO: Running Flye polisher
[2020-04-22 11:25:37] root: DEBUG: Cmd: /he_lab/share/data/local/Flye-2.7/bin/flye --polish-target out_flye/30-contigger/contigs.fasta --nano-raw allpass.fastq.gz --iterations 1 --out-dir out_flye/polish --threads 40
[2020-04-22 11:25:37] root: INFO: Polishing genome (1/1)
[2020-04-22 11:26:20] root: INFO: Running minimap2
[2020-04-23 10:59:02] root: INFO: Separating alignment into bubbles
I switched to a server that should have enough memory, but I ran into another problem. This is the flye.log:
[2020-04-25 08:42:09] root: INFO: Running Flye polisher
[2020-04-25 08:42:09] root: DEBUG: Cmd: /GPUFS/Flye-2.7/bin/flye --polish-target /GPUFS//contigs.fasta --nano-raw /allpass.fastq.gz --iterations 1 --out-dir /GPUFS/ --threads 60
[2020-04-25 08:42:09] root: INFO: Polishing genome (1/1)
[2020-04-25 08:43:19] root: INFO: Running minimap2
[2020-04-26 01:14:39] root: INFO: Separating alignment into bubbles
The following is the nohup.out:
[2020-04-25 08:42:09] INFO: Running Flye polisher
[2020-04-25 08:42:09] INFO: Polishing genome (1/1)
[2020-04-25 08:43:19] INFO: Running minimap2
[2020-04-26 01:14:39] INFO: Separating alignment into bubbles
[E::bgzf_read] Read block operation failed with error 2 after 0 of 4 bytes
[main_samview] truncated file.
samtools view: error closing "/GPUFS/minimap_1.bam": -1
Process Process-2:
Traceback (most recent call last):
File "/app/common/anaconda3/5.2.0/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/app/common/anaconda3/5.2.0/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/GPUFS/sysu_mhwang_1/sysu_mhwang_1/Flye-2.7/flye/utils/sam_parser.py", line 246, in _io_thread_worker
with self.shared_lock:
File "/app/common/anaconda3/5.2.0/lib/python3.6/multiprocessing/managers.py", line 991, in __enter__
return self._callmethod('acquire')
File "/app/common/anaconda3/5.2.0/lib/python3.6/multiprocessing/managers.py", line 756, in _callmethod
conn.send((self._id, methodname, args, kwds))
File "/app/common/anaconda3/5.2.0/lib/python3.6/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/app/common/anaconda3/5.2.0/lib/python3.6/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header + buf)
File "/app/common/anaconda3/5.2.0/lib/python3.6/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
This species has a large genome with highly repetitive sequences. Could this be the cause of the error? Thanks for the assistance, Lynn
Looks like the BAM file is corrupted, but I'm not sure what the reason was.
Possibly you are running out of disk space, or hitting the open-files limit. Try to (1) downsample to ~40x, and (2) check the maximum number of open files by running ulimit -Sn. You can try to increase it by running ulimit -n 4096 (or so).
@lyrk50 any updates?
Closed since no activity.
Hi!
When trying to polish some nanopore data with
flye --polish-target out_flye/30-contigger/contigs.fasta --nano-raw allpass.fastq.gz --iterations 1 --out-dir out_flye/polish --threads 40
I get the following error. Any advice to help me get up and running would be appreciated.