Open Ivan-vechetti opened 3 years ago
It seems to be caused by:
def iopen(inpath, mode='r'):
    """ Open input file for reading regardless of compression [gzip, bzip] or python version """
    ext = inpath.split('.')[-1]
    # Python2
    if sys.version_info[0] == 2:
        if ext == 'gz': return gzip.open(inpath, mode)
        elif ext == 'bz2': return bz2.BZ2File(inpath, mode)
        else: return open(inpath, mode)
    # Python3
    elif sys.version_info[0] == 3:
        if ext == 'gz': return io.TextIOWrapper(gzip.open(inpath, mode))
        elif ext == 'bz2': return bz2.BZ2File(inpath, mode)
        else: return open(inpath, mode)
which is called by species_pileup() in pysam_pileup(). I'm guessing that the file handler is not actually closed in the subprocess, which is causing the serialization error.
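As a side note on the function quoted above: on Python 3, gzip.open and bz2.open both accept text modes such as 'rt', so a text-mode opener does not need io.TextIOWrapper. The following is only a minimal sketch under that assumption; open_text is a hypothetical helper name, not part of MIDAS, and it drops the Python 2 branch.

```python
import bz2
import gzip


def open_text(path, mode="rt"):
    """Open a plain, gzip, or bzip2 file in text mode (Python 3 only).

    Hypothetical helper for illustration; not MIDAS code.
    """
    if path.endswith(".gz"):
        return gzip.open(path, mode)
    if path.endswith(".bz2"):
        return bz2.open(path, mode)
    return open(path, mode)


def read_all(path):
    """Read a whole file; the with-block guarantees the handle is closed."""
    with open_text(path) as handle:
        return handle.read()
```

Using the opener inside a with-block closes the handle as soon as reading is done, including when an exception is raised.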
Actually, it seems to be due to passing the file handler in the args['log'] variable to species_pileup() via utility.parallel(). The file handler can't be serialized.
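To see the failure in isolation, the sketch below tries to pickle each value of a dictionary and reports the keys that cannot be serialized. It is a minimal illustration, not MIDAS code: unpicklable_keys and demo are hypothetical names, and the args dictionary is a made-up stand-in for the real one.

```python
import os
import pickle
import tempfile


def unpicklable_keys(args):
    """Return the keys of args whose values cannot be pickled."""
    bad = []
    for key, value in args.items():
        try:
            pickle.dumps(value)
        except (TypeError, pickle.PicklingError):
            bad.append(key)
    return bad


def demo():
    """Build a stand-in args dict holding an open log file and check it."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "log.txt"), "w") as log:
            # Hypothetical stand-in for the args passed to the workers
            args = {"outdir": tmp, "threads": 2, "log": log}
            return unpicklable_keys(args)
```

Calling demo() should flag only the 'log' entry, since strings and integers pickle without trouble while an open text file does not.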
Changing:
def pysam_pileup(args, species, contigs):
    start = time()
    print("\nCounting alleles")
    args['log'].write("\nCounting alleles\n")
    # run pileups per species in parallel
    argument_list = []
to:
def pysam_pileup(args, species, contigs):
    start = time()
    print("\nCounting alleles")
    args['log'].write("\nCounting alleles\n")
    args['log'].close() # new line
    # run pileups per species in parallel
    argument_list = []
Fixes the issue. It appears that the log file isn't actually written to by species_pileup anyway. I'll submit a PR.
Thank you so much for your reply. Where should I change that line?
Thanks once again
Ivan
On Sat, Dec 19, 2020 at 7:27 AM Nick Youngblut notifications@github.com wrote:
Actually, it seems to be due to passing the file handler in the args['log'] variable to species_pileup() via utility.parallel(). The file handler can't be serialized. [...]
Check out the PR edits: https://github.com/snayfach/MIDAS/pull/113
Hi Nick,
thanks for the input, but adding the line as you suggested caused the run to finish early with this message: IndentationError: unindent does not match any outer indentation level
Regarding the gene run, is the message below normal?
Computing coverage of pangenomes E::idx_fin_and_load Could not retrieve index file for 'midas_output//genes/temp/pangenomes.bam'
On Sat, Dec 19, 2020 at 10:18 AM Nick Youngblut notifications@github.com wrote:
Check out the PR edits: https://github.com/snayfach/MIDAS/pull/113
My editor defaults to spaces, but MIDAS is written entirely with tabs. This caused the indentation error. I've fixed it. Also, I added a pop for the log variable, since it appears that closing the file handler didn't actually fix the serialization error. It should work now. At least, it works for me. There's no CI for the PRs, so it's untested for a broader set of envs (e.g., different versions of Ubuntu), but it should work.
I tried this (using del instead of pop), and it works for me.
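The idea behind the pop/del workaround can be sketched as follows. This is not the code from the PR: worker_args and demo are hypothetical names, and pickle is used here only to stand in for what a multiprocessing pool does when it sends arguments to a worker. The sketch also checks that a closed file object still refuses to pickle, which would be consistent with closing the log not being enough.

```python
import os
import pickle
import tempfile


def worker_args(args, drop=("log",)):
    """Return a shallow copy of args without entries that cannot be sent to a worker."""
    return {key: value for key, value in args.items() if key not in drop}


def demo():
    """Compare pickling args with a closed log against args with the log removed."""
    with tempfile.TemporaryDirectory() as tmp:
        log = open(os.path.join(tmp, "log.txt"), "w")
        # Hypothetical stand-in for the real args dictionary
        args = {"outdir": "midas_output", "log": log}
        log.close()
        try:
            pickle.dumps(args)
            closed_log_pickles = True
        except TypeError:
            closed_log_pickles = False
        # Simulate the round trip to a worker process
        received = pickle.loads(pickle.dumps(worker_args(args)))
    return closed_log_pickles, received
```

Building a copy rather than calling pop or del on the original dictionary leaves args['log'] available to the parent process, in case it needs to keep logging after the parallel step.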
Hello, running
Goes well but in the end, I get: E::idx_fin_and_load Could not retrieve index file for 'midas_output//genes/temp/pangenomes.bam'
And then when I run:
Goes well but in the end, I get: TypeError: cannot pickle '_io.TextIOWrapper' object
Can someone help me with that?
Python 3.8.5