Closed by michoug 6 years ago
My best guess is that the MAGs/bins used during the "taxon_profile" command, which produced the results in your "N4F2_filtered_taxons" directory, have changed. Any chance a MAG/bin was added to or removed from the "N4F2_filtered" directory? Specifically, the bin named "N4F2_MBin.3"?
I tried rerunning taxon_profile and it didn't solve the issue.
If you can send me the bins (or ideally a subset of bins) along with the commands that result in the issue, I can look into it on my end. I would need all the RefineM commands you ran, not just the "ssu_erroneous" command, so that I can replicate each of your steps.
Hi, I'm getting a similar error as michoug. Did you figure out that problem?
$ refinem ssu_erroneous bins metabat_refinem/taxon_profile $SSUDB $REFTAXONOMY metabat_refinem/ssu -x fa
[2018-05-31 09:54:19] INFO: RefineM v0.0.23
[2018-05-31 09:54:19] INFO: refinem ssu_erroneous bins metabat_refinem/taxon_profile /labs/asbhatt/bsiranos/refinem_db/gtdb_r80_ssu_db.2018-01-18.fna /labs/asbhatt/bsiranos/refinem_db/gtdb_r80_taxonomy.2017-12-15.tsv metabat_refinem/ssu -x fa
[2018-05-31 09:54:19] INFO: Identifying SSU rRNA genes.
[2018-05-31 09:56:16] INFO: Extracting SSU rRNA genes.
[2018-05-31 09:56:17] INFO: Classifying SSU rRNA genes.
[2018-05-31 09:56:38] INFO: Identifying scaffolds with 16S rRNA genes with divergent taxonomic classification.

Unexpected error: <type 'exceptions.KeyError'>
Traceback (most recent call last):
  File "/home/bsiranos/miniconda3/envs/mgwf/bin/refinem", line 396, in <module>
    parser.parse_options(args)
  File "/home/bsiranos/miniconda3/envs/mgwf/lib/python2.7/site-packages/refinem/main.py", line 689, in parse_options
    self.ssu_erroneous(options)
  File "/home/bsiranos/miniconda3/envs/mgwf/lib/python2.7/site-packages/refinem/main.py", line 335, in ssu_erroneous
    options.output_dir)
  File "/home/bsiranos/miniconda3/envs/mgwf/lib/python2.7/site-packages/refinem/ssu.py", line 537, in erroneous
    if r not in common_taxa[gid]:
KeyError: 'bin.2'
There is a 'bin.2' folder in the ssu output directory:
$ ls metabat_refinem/ssu/bin.2
ssu.blastn.tsv
ssu.fna
ssu.hmm_archaea.txt
ssu.hmm_bacteria.txt
ssu.hmm_euk.txt
ssu.hmm_summary.tsv
ssu.taxonomy.tsv
And there are bin.2 files in the taxon profile output:
$ ls metabat_refinem/taxon_profile/bin_reports/bin.2.*
metabat_refinem/taxon_profile/bin_reports/bin.2.filtered_genes.gene.tsv
metabat_refinem/taxon_profile/bin_reports/bin.2.filtered_genes.profile.tsv
metabat_refinem/taxon_profile/bin_reports/bin.2.filtered_genes.scaffolds.tsv
Let me know if you want more information or if I should send you some files.
What is the full name of "bin.2"? Is it "bin.2.fna" or something similar? I'm wondering if this is a parsing error by RefineM due to having a "." in the bin filename.
Yes, the bins are named bin.[#].fa; they are the output of MetaBAT.
Can you try renaming the bin to "bin_2.fa" and see if this resolves the problem?
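For anyone who wants to try the same workaround on a whole directory of bins, here is a minimal sketch in Python. It is not part of RefineM; the function names `underscore_name` and `rename_bins` are made up for illustration, and it assumes the bins share a single extension such as "fa". It defaults to a dry run so the planned renames can be reviewed first.

```python
from pathlib import Path


def underscore_name(filename, extension="fa"):
    """Replace dots in the bin ID with underscores, keeping the extension.

    Hypothetical helper: 'bin.2.fa' becomes 'bin_2.fa'.
    Files without the given extension are returned unchanged.
    """
    suffix = "." + extension
    if not filename.endswith(suffix):
        return filename
    bin_id = filename[: -len(suffix)]
    return bin_id.replace(".", "_") + suffix


def rename_bins(bin_dir, extension="fa", dry_run=True):
    """Rename every bin in bin_dir; return the list of (old, new) names.

    With dry_run=True nothing is touched on disk.
    """
    changes = []
    for path in sorted(Path(bin_dir).glob("*." + extension)):
        new_name = underscore_name(path.name, extension)
        if new_name == path.name:
            continue
        target = path.with_name(new_name)
        if target.exists():
            # Refuse to overwrite an existing bin with the same new name.
            raise FileExistsError(str(target))
        changes.append((path.name, new_name))
        if not dry_run:
            path.rename(target)
    return changes
```

Calling `rename_bins("bins")` lists what would change; `rename_bins("bins", dry_run=False)` performs the renames. Any earlier RefineM output that refers to the old bin names would need to be regenerated afterwards.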
I renamed all the bins to follow that convention and ran all the steps in the readme without problems. Previously all the other steps were fine; only the ssu_erroneous step was breaking. Do you think this is an easy fix, or should I just be content with renaming the bins and then renaming the results back?
Thanks for exploring the issue. It should be an easy fix. I'll aim to release a new version of RefineM tomorrow.
Hello. I am unable to reproduce the issue on my end. I was able to run a bin named bin.0.fna through RefineM v0.0.23 without issue. Any chance you changed the names of your bin files at any point?
I did go through the code and can't see why the name of the bin file would be an issue.
Hi, after trying this again, it seems the error was coming from some output missing from the taxon_profile command. After making sure everything leading up to the ssu_erroneous step ran correctly, it's not a problem any more. Thanks for looking into it though!
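A quick way to check for this situation before running ssu_erroneous is to confirm that every bin has at least one report file from taxon_profile. The sketch below is not part of RefineM. It assumes the layout shown earlier in this thread, a "bin_reports" subdirectory whose files start with the bin ID followed by a dot, and the function name `bins_missing_reports` is made up for illustration.

```python
from pathlib import Path


def bins_missing_reports(bin_dir, taxon_profile_dir, extension="fa"):
    """Return the IDs of bins that have no file in <taxon_profile_dir>/bin_reports.

    Assumes report files are named '<bin_id>.<something>', as in the
    'bin.2.filtered_genes.profile.tsv' example from this thread.
    """
    suffix = "." + extension
    report_dir = Path(taxon_profile_dir) / "bin_reports"
    report_names = []
    if report_dir.is_dir():
        report_names = [p.name for p in report_dir.iterdir()]

    missing = []
    for path in sorted(Path(bin_dir).glob("*" + suffix)):
        bin_id = path.name[: -len(suffix)]
        # Require the trailing dot so 'bin.2' does not match 'bin.20.*'.
        if not any(name.startswith(bin_id + ".") for name in report_names):
            missing.append(bin_id)
    return missing
```

For the directories used above, `bins_missing_reports("bins", "metabat_refinem/taxon_profile")` returning a non-empty list would suggest that taxon_profile (or a step before it) needs to be rerun for those bins. It only checks that files exist, not that their contents are complete.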
Excellent. Glad it is working.
Hi. Very nice pipeline; however, I have an issue with the ssu_erroneous command. Here is the log.
Any ideas? I tried on other files and got the same error.