I am trying to run WGD v2 on some transcriptome data. I have successfully run wgd dmd on each sample independently (e.g., wgd dmd Sample1.fasta -of). However, when I try a pairwise run (e.g., wgd dmd Sample1.fasta Sample2.fasta -of), I get this error for some samples:
Traceback (most recent call last):
  File "/home/ermoore3/miniconda2/envs/mamba/envs/wgdv2/bin/wgd", line 8, in <module>
    sys.exit(cli())
  File "/home/ermoore3/miniconda2/envs/mamba/envs/wgdv2/lib/python3.6/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/ermoore3/miniconda2/envs/mamba/envs/wgdv2/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/ermoore3/miniconda2/envs/mamba/envs/wgdv2/lib/python3.6/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ermoore3/miniconda2/envs/mamba/envs/wgdv2/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ermoore3/miniconda2/envs/mamba/envs/wgdv2/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/ermoore3/miniconda2/envs/mamba/envs/wgdv2/lib/python3.6/site-packages/cli.py", line 117, in dmd
    _dmd(**kwargs)
  File "/home/ermoore3/miniconda2/envs/mamba/envs/wgdv2/lib/python3.6/site-packages/cli.py", line 155, in _dmd
    Parallel(n_jobs=nthreads,backend='multiprocessing')(delayed(parallelrbh)(s,i,j,ogformat,cscore,eval) for i,j in pairs)
  File "/home/ermoore3/miniconda2/envs/mamba/envs/wgdv2/lib/python3.6/site-packages/joblib/parallel.py", line 789, in __call__
    self.retrieve()
  File "/home/ermoore3/miniconda2/envs/mamba/envs/wgdv2/lib/python3.6/site-packages/joblib/parallel.py", line 699, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/home/ermoore3/miniconda2/envs/mamba/envs/wgdv2/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
  File "/home/ermoore3/miniconda2/envs/mamba/envs/wgdv2/lib/python3.6/multiprocessing/pool.py", line 424, in _handle_tasks
    put(task)
  File "/home/ermoore3/miniconda2/envs/mamba/envs/wgdv2/lib/python3.6/site-packages/joblib/pool.py", line 372, in send
    self._writer.send_bytes(buffer.getvalue())
  File "/home/ermoore3/miniconda2/envs/mamba/envs/wgdv2/lib/python3.6/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/home/ermoore3/miniconda2/envs/mamba/envs/wgdv2/lib/python3.6/multiprocessing/connection.py", line 393, in _send_bytes
    header = struct.pack("!i", n)
struct.error: 'i' format requires -2147483648 <= number <= 2147483647
I suspect this error occurs because the input files are too large. When I run wgd dmd for a single species (e.g., wgd dmd Sample1.fasta -of), the resulting .tsv files for the failing samples are ~8x larger than those for the samples that ran successfully.
Is that diagnosis correct? If so, do you have any suggestions on how to fix this?
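For context, the last frame of the traceback can be reproduced on its own. This is only a minimal sketch, assuming (as the traceback suggests) that the `n` passed to `struct.pack("!i", n)` is the byte length of the message being sent to a worker process. The helper name `fits_in_header` is hypothetical and not part of wgd, joblib, or multiprocessing.

```python
import struct

# Largest value a 4-byte signed integer ("!i") can hold: 2147483647.
INT32_MAX = 2**31 - 1


def fits_in_header(n_bytes):
    """Return True if a length of n_bytes can be packed with the "!i" format.

    Hypothetical helper: it mimics the struct.pack("!i", n) call in the
    traceback, assuming n is the size in bytes of the message being sent.
    """
    try:
        struct.pack("!i", n_bytes)
        return True
    except struct.error:
        return False


small = fits_in_header(10 * 1024**2)    # a 10 MiB payload
at_limit = fits_in_header(INT32_MAX)    # exactly 2**31 - 1 bytes
too_big = fits_in_header(INT32_MAX + 1) # one byte past the limit

print(small, at_limit, too_big)  # prints: True True False
```

If that assumption holds, any single object handed to a worker that serializes to more than about 2 GiB would trigger the same `struct.error`, which would be consistent with only the largest samples failing.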
Any help is appreciated!
Best, Erika