Closed by aflores18 4 years ago
Hello, thanks for reporting this bug. Let's try to get that working. A couple of questions for you: did you try running this without the additional --cores 20 --memory 100000 options? Also, how far did it get into the test? At what point did it fail?
Thanks.
Thanks for the quick response. Yes, I ran it without specifying cores or memory, and it resulted in the same error. Below is the log from the generated results folder:
Current version of snakemake: 3.13.3
Expected version of snakemake: 3.13.3
Current version of einverted: EMBOSS:6.6.0.0
Expected version of einverted: EMBOSS:6.6.0.0
Current version of bowtie2: 2.3.5
Expected version of bowtie2: 2.3.5
Current version of samtools: 1.9
Expected version of samtools: 1.9
Current version of cd-hit: 4.8.1
Expected version of cd-hit: 4.8.1
###############################
command: genotype
clusterseq: test_workdir/03.results/efae_GCF_900639545/01.clusterseq.efae_GCF_900639545.tsv
pairfiles: ('test_workdir/01.mgefinder/efae_GCF_900639545/efae_GCF_900639545.all_pair.txt',)
filter_clusters_inferred_assembly: True
output_file: test_workdir/03.results/efae_GCF_900639545/02.genotype.efae_GCF_900639545.tsv
####################
Loading clusterseq file...
Parsing pair files
Loading pair files...
Loading file 1/10: test_workdir/01.mgefinder/efae_GCF_900639545/ERR1036032/02.pair.ERR1036032.efae_GCF_900639545.tsv
Loading file 2/10: test_workdir/01.mgefinder/efae_GCF_900639545/ERR1078789/02.pair.ERR1078789.efae_GCF_900639545.tsv
Loading file 3/10: test_workdir/01.mgefinder/efae_GCF_900639545/ERR1541922/02.pair.ERR1541922.efae_GCF_900639545.tsv
Loading file 4/10: test_workdir/01.mgefinder/efae_GCF_900639545/ERR1036049/02.pair.ERR1036049.efae_GCF_900639545.tsv
Loading file 5/10: test_workdir/01.mgefinder/efae_GCF_900639545/ERR1541932/02.pair.ERR1541932.efae_GCF_900639545.tsv
Loading file 6/10: test_workdir/01.mgefinder/efae_GCF_900639545/ERR1078777/02.pair.ERR1078777.efae_GCF_900639545.tsv
Loading file 7/10: test_workdir/01.mgefinder/efae_GCF_900639545/ERR1036051/02.pair.ERR1036051.efae_GCF_900639545.tsv
Loading file 8/10: test_workdir/01.mgefinder/efae_GCF_900639545/ERR1541798/02.pair.ERR1541798.efae_GCF_900639545.tsv
Loading file 9/10: test_workdir/01.mgefinder/efae_GCF_900639545/ERR1195862/02.pair.ERR1195862.efae_GCF_900639545.tsv
Loading file 10/10: test_workdir/01.mgefinder/efae_GCF_900639545/ERR1541854/02.pair.ERR1541854.efae_GCF_900639545.tsv
Filtering out clusters that are never inferred from an assembly...
Excluding 6 clusters that were only inferred from the reference genome...
Out of 585 candidate insertions, 407 had some inferred identity, while 178 had no inferred identity.
Assigning initial genotypes...
Identifying ambiguous genotypes...
Resolving ambiguous genotypes where possible...
And here are the contents of the ".err" file in the same folder:
Traceback (most recent call last):
File "/home/user/miniconda3/envs/mgefinder/bin/mgefinder", line 8, in
It appears to have stopped during the clusterseq. If there is additional output that would be helpful for you, please let me know.
Tony
Thank you. This is strange; it may have something to do with a version mismatch in the dependencies. I will work on correcting this. In the meantime, please check whether your installed versions match all of these:
python = 3.6.9
click = 7.0
pandas = 0.25.3
biopython = 1.75
pysam = 0.15.3
scipy = 1.4.0
networkx = 2.4
tqdm = 4.40.2
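One way to run this check, as a minimal sketch: the script below compares installed package versions against the pins listed above. The names EXPECTED, installed_version, and find_mismatches are illustrative and not part of mgefinder. On Python 3.6 it assumes the importlib_metadata backport is installed, since importlib.metadata only entered the standard library in Python 3.8.

```python
import sys

try:  # standard library from Python 3.8 onward
    from importlib.metadata import PackageNotFoundError, version
except ImportError:  # assumption: the importlib_metadata backport is installed on 3.6
    from importlib_metadata import PackageNotFoundError, version

# Pins copied from the list above; the interpreter itself is reported separately.
EXPECTED = {
    "click": "7.0",
    "pandas": "0.25.3",
    "biopython": "1.75",
    "pysam": "0.15.3",
    "scipy": "1.4.0",
    "networkx": "2.4",
    "tqdm": "4.40.2",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose installed version differs to (expected, found)."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    print("python:", ".".join(str(part) for part in sys.version_info[:3]))
    for name, (wanted, found) in sorted(find_mismatches(EXPECTED).items()):
        print(f"{name}: expected {wanted}, found {found}")
```

Run it inside the activated mgefinder environment so that it inspects the same packages mgefinder imports.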
Great. Here are the versions of the above dependencies currently installed:
python - 3.6.9
click - 7.0
pandas - 0.25.3
biopython - 1.75
pysam - 0.16.0 (this is the only version not matching)
scipy - 1.4.0
networkx - 2.4
tqdm - 4.40.2
Ok, try installing the correct pysam version with pip install pysam==0.15.3
Ok, I changed the setup.py file to include version requirements. It should work now if you remove the environment with conda env remove -n mgefinder and then rerun bash install.sh.
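For illustration only: pinning versions in a setup.py usually means passing exact == specifiers through the install_requires argument of setuptools. The sketch below builds such a list from the versions quoted earlier in this thread. The helper name pinned_requirements is hypothetical, and this is not the actual change made to the repository.

```python
# Versions quoted earlier in this thread. The Python interpreter is not a pip
# requirement, so it is left out of this mapping.
PINS = {
    "click": "7.0",
    "pandas": "0.25.3",
    "biopython": "1.75",
    "pysam": "0.15.3",
    "scipy": "1.4.0",
    "networkx": "2.4",
    "tqdm": "4.40.2",
}


def pinned_requirements(pins):
    """Turn {"name": "version"} into exact specifiers such as "pysam==0.15.3"."""
    return [f"{name}=={pinned}" for name, pinned in sorted(pins.items())]


INSTALL_REQUIRES = pinned_requirements(PINS)

# In a setup.py, this list would be handed to setuptools, for example:
#     setup(..., install_requires=INSTALL_REQUIRES)
if __name__ == "__main__":
    print("\n".join(INSTALL_REQUIRES))
```

With exact pins in place, a fresh install resolves to the same versions regardless of what is newest on the package index at install time.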
That did it. It finished the test dataset without errors. I will work on our data.
Thanks for your help!
Hello!
I'm interested in using mgefinder on our datasets and followed instructions to install through conda per the guide. I downloaded and extracted the test_workdir files as instructed. I set the environment appropriately for mgefinder in conda and invoked the following command:
$ mgefinder workflow --cores 20 --memory 100000 test_workdir/
However, it appears to have crashed with the following error:
Traceback (most recent call last):
File "/home/user/miniconda3/envs/mgefinder/bin/mgefinder", line 8, in
sys.exit(cli())
File "/home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/main.py", line 251, in genotype
_genotype(clusterseq, pairfiles, filter_clusters_inferred_assembly, output_file)
File "/home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/genotype.py", line 37, in _genotype
genotypes = genotyper.genotype()
File "/home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/genotype.py", line 106, in genotype
genotypes = self.resolve_ambiguous_genotypes(genotypes)
File "/home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/genotype.py", line 224, in resolve_ambiguous_genotypes
unresolved, cluster_counts_per_site
File "/home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/genotype.py", line 322, in resolve_all_sample_comparison
resolved = (pd.merge(unresolved, cluster_counts, how='inner', on=['contig', 'pos_5p', 'pos_3p', 'cluster']).
File "/home/user/.local/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 61, in merge
validate=validate)
File "/home/user/.local/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 555, in __init__
self._maybe_coerce_merge_keys()
File "/home/user/.local/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 986, in _maybe_coerce_merge_keys
raise ValueError(msg)
ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat
Error in job genotype while creating output file test_workdir/03.results/efae_GCF_900639545/02.genotype.efae_GCF_900639545.tsv.
RuleException:
CalledProcessError in line 286 of /home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/workflow/Snakefile:
Command '
if [ "True" == "True" ]; then
mgefinder genotype --filter-clusters-inferred-assembly test_workdir/03.results/efae_GCF_900639545/01.clusterseq.efae_GCF_900639545.tsv test_workdir/01.mgefinder/efae_GCF_900639545/efae_GCF_900639545.all_pair.txt -o test_workdir/03.results/efae_GCF_900639545/02.genotype.efae_GCF_900639545.tsv 1> test_workdir/03.results/efae_GCF_900639545/log/efae_GCF_900639545.genotype.log 2> test_workdir/03.results/efae_GCF_900639545/log/efae_GCF_900639545.genotype.log.err || (cat test_workdir/03.results/efae_GCF_900639545/log/efae_GCF_900639545.genotype.log.err; exit 1)
else
mgefinder genotype --no-filter-clusters-inferred-assembly test_workdir/03.results/efae_GCF_900639545/01.clusterseq.efae_GCF_900639545.tsv test_workdir/01.mgefinder/efae_GCF_900639545/efae_GCF_900639545.all_pair.txt -o test_workdir/03.results/efae_GCF_900639545/02.genotype.efae_GCF_900639545.tsv 1> test_workdir/03.results/efae_GCF_900639545/log/efae_GCF_900639545.genotype.log 2> test_workdir/03.results/efae_GCF_900639545/log/efae_GCF_900639545.genotype.log.err || (cat test_workdir/03.results/efae_GCF_900639545/log/efae_GCF_900639545.genotype.log.err; exit 1)
fi
' returned non-zero exit status 1.
File "/home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/workflow/Snakefile", line 286, in __rule_genotype
File "/home/user/miniconda3/envs/mgefinder/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message
Traceback (most recent call last):
File "/home/user/miniconda3/envs/mgefinder/bin/mgefinder", line 8, in
sys.exit(cli())
File "/home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/main.py", line 51, in workflow
_workflow(workdir, snakefile, configfile, cores, memory, unlock, rerun_incomplete, keep_going)
File "/home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/workflow.py", line 26, in _workflow
shell(cmd)
File "/home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/snakemake/shell.py", line 88, in __new__
raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'snakemake -s /home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/workflow/Snakefile --config wd=test_workdir/ memory=16000 --cores 20 --configfile /home/user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/workflow/config.yml ' returned non-zero exit status 1.
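For context, pandas raises the ValueError in the first traceback when a merge key holds strings in one frame and integers in the other. A minimal sketch with made-up data follows. Only the key column names come from the traceback; the values and the count column are hypothetical, and the sketch does not claim to identify why the dtypes differed in this run.

```python
import pandas as pd

KEYS = ["contig", "pos_5p", "pos_3p", "cluster"]  # merge keys shown in the traceback

# Hypothetical data: positions were read as strings in one frame...
unresolved = pd.DataFrame(
    {"contig": ["c1"], "pos_5p": ["100"], "pos_3p": ["105"], "cluster": [7]}
)
# ...and as integers (int64) in the other.
cluster_counts = pd.DataFrame(
    {"contig": ["c1"], "pos_5p": [100], "pos_3p": [105], "cluster": [7], "count": [3]}
)

try:
    pd.merge(unresolved, cluster_counts, how="inner", on=KEYS)
    merge_error = None
except ValueError as err:
    merge_error = err  # pandas refuses to merge string keys with integer keys

# Inspect the key dtypes on both sides to see which columns disagree.
print(unresolved[KEYS].dtypes)
print(cluster_counts[KEYS].dtypes)

# Casting the string columns to int64 makes the keys comparable again.
fixed = unresolved.astype({"pos_5p": "int64", "pos_3p": "int64"})
resolved = pd.merge(fixed, cluster_counts, how="inner", on=KEYS)
print(resolved)
```

Printing the dtypes of the key columns on both sides is usually the quickest way to see which column triggered the error.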
Obviously, I would like to get the test dataset to run properly before trying our own data. In my experience this is most likely something simple, but my relative inexperience leaves me baffled at this time.
Any suggestions are most welcome.
Tony