Open ramiroricardo opened 3 years ago
Hi ramiroricardo,
There are two things that I have noticed might need to be fixed:
1) For conda-installed constax, you must use the -b or --blast flag, unless you intend to use the UTAX implementation, which requires a separate download. See here for more details.
2) For some reason the vsearch, classifier, makeblastdb, blastn, and Rscript commands are not working when executed. Try executing those commands outside the constax script to see if they are valid. If they work at all, is it only when the environment is activated?
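A quick way to run that check is sketched below. This is only a minimal sketch: the function name check_tools is hypothetical, and it relies on nothing but the shell's built-in `command -v` to see whether each name resolves on the current PATH.

```shell
# Hypothetical helper: report whether each named tool resolves on PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if resolved=$(command -v "$tool" 2>/dev/null); then
            printf '%s: %s\n' "$tool" "$resolved"
        else
            printf '%s: NOT FOUND\n' "$tool"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

# The five commands named above; run once with and once without the
# environment activated and compare the two reports.
check_tools vsearch classifier makeblastdb blastn Rscript || true
```

If a tool is reported as found only after the environment is activated, the scripts that constax launches are probably not inheriting that environment's PATH.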
Hi @liberjul,
Thanks for your quick reply. When I run the same code, but with the --blast flag, I get essentially the same output:
Welcome to CONSTAX version 2.0.9 build 0 - The CONSensus TAXonomy classifier
This software is distributed under MIT License
© Copyright 2020, Julian A. Liber, Gian M. N. Benucci & Gregory M. Bonito
https://github.com/liberjul/CONSTAXv2
https://constax.readthedocs.io/
Please cite us as:
CONSTAX2: Improved taxonomic classification of environmental DNA markers
Julian Aaron Liber, Gregory Bonito, Gian Maria Niccolò Benucci
bioRxiv 2021.02.15.430803; doi: https://doi.org/10.1101/2021.02.15.430803
Training, with output to /database/UNITE/training_files...
Pathfile input not found in local directory ...
Pathfile input not found at /biotools/miniconda3/envs/constax2/pkgs/constax-2.0.9-0/opt/constax-2.0.9/pathfile.txt ...
Pathfile input not found at /biotools/miniconda3/envs/constax2/pkgs/constax-2.0.9-placeholder/opt/constax-2.0.9/pathfile.txt ...
Pathfile input found at /biotools/miniconda3/envs/constax2/opt/constax-2.0.9/pathfile.txt ...
All needed executables exist.
SINTAX: vsearch
RDP: classifier
CONSTAX: /biotools/miniconda3/envs/constax2/opt/constax-2.0.9
Memory size: 32000mb
Importing subscripts from /biotools/miniconda3/envs/constax2/opt/constax-2.0.9
____________________________________________________________________
Reformatting database
UNITE format detected
Reference database FASTAs formatted in 1.504352509 seconds...
Training Taxonomy
Adding Full Lineage
Database formatting complete
____________________________________________________________________
__________________________________________________________________________
Training SINTAX Classifier
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 247: vsearch: command not found
__________________________________________________________________________
Training BLAST Classifier
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 258: makeblastdb: command not found
__________________________________________________________________________
Training RDP Classifier
Error: Unable to access jarfile classifier
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 322: blastn: command not found
__________________________________________________________________________
Assigning taxonomy to OTU's representative sequences
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 334: vsearch: command not found
sed: can't read /database/UNITE/taxonomy_assignements/otu_taxonomy.sintax: No such file or directory
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 347: blastn: command not found
Error: Unable to access jarfile classifier
__________________________________________________________________________
Comparing to Isolates
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 373: makeblastdb: command not found
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 375: blastn: command not found
rm: cannot remove '/database/UNITE/taxonomy_assignements/unite_test_isos__BLAST.n*': No such file or directory
Combining Taxonomies
Traceback (most recent call last):
File "/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/CombineTaxonomy.py", line 532, in <module>
open(file_name,"r")
FileNotFoundError: [Errno 2] No such file or directory: '/database/UNITE/taxonomy_assignements/otu_taxonomy.rdp'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/CombineTaxonomy.py", line 534, in <module>
raise FileNotFoundError(F"{classifier.upper()} file could not be opened.")
FileNotFoundError: RDP file could not be opened.
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 410: Rscript: command not found
Regarding your second question: none of the tools work outside of the environment. Note that I was running this directly in the terminal, not as a bash script. I had previously run it inside a bash script, with the same errors, after adding the following before calling constax:
source /biotools/miniconda3/etc/profile.d/conda.sh
conda activate constax2
Inside the environment, all tools appeared to be working, as they produced the expected output when called with -help, with the exception of Rscript. I have now installed R 4.0.3 in the environment and Rscript is working. Even after this installation, though, the output that I get is:
Welcome to CONSTAX version 2.0.9 build 0 - The CONSensus TAXonomy classifier
This software is distributed under MIT License
© Copyright 2020, Julian A. Liber, Gian M. N. Benucci & Gregory M. Bonito
https://github.com/liberjul/CONSTAXv2
https://constax.readthedocs.io/
Please cite us as:
CONSTAX2: Improved taxonomic classification of environmental DNA markers
Julian Aaron Liber, Gregory Bonito, Gian Maria Niccolò Benucci
bioRxiv 2021.02.15.430803; doi: https://doi.org/10.1101/2021.02.15.430803
Training, with output to /database/UNITE/training_files...
Pathfile input not found in local directory ...
Pathfile input not found at /biotools/miniconda3/envs/constax2/pkgs/constax-2.0.9-0/opt/constax-2.0.9/pathfile.txt ...
Pathfile input not found at /biotools/miniconda3/envs/constax2/pkgs/constax-2.0.9-placeholder/opt/constax-2.0.9/pathfile.txt ...
Pathfile input found at /biotools/miniconda3/envs/constax2/opt/constax-2.0.9/pathfile.txt ...
All needed executables exist.
SINTAX: vsearch
RDP: classifier
CONSTAX: /biotools/miniconda3/envs/constax2/opt/constax-2.0.9
Memory size: 32000mb
Importing subscripts from /biotools/miniconda3/envs/constax2/opt/constax-2.0.9
____________________________________________________________________
Reformatting database
UNITE format detected
Reference database FASTAs formatted in 1.523612179 seconds...
Training Taxonomy
Adding Full Lineage
Database formatting complete
____________________________________________________________________
__________________________________________________________________________
Training SINTAX Classifier
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 247: vsearch: command not found
__________________________________________________________________________
Training BLAST Classifier
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 258: makeblastdb: command not found
__________________________________________________________________________
Training RDP Classifier
Error: Unable to access jarfile classifier
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 322: blastn: command not found
__________________________________________________________________________
Assigning taxonomy to OTU's representative sequences
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 334: vsearch: command not found
sed: can't read /database/UNITE/taxonomy_assignements/otu_taxonomy.sintax: No such file or directory
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 347: blastn: command not found
Error: Unable to access jarfile classifier
__________________________________________________________________________
Comparing to Isolates
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 373: makeblastdb: command not found
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 375: blastn: command not found
rm: cannot remove '/database/UNITE/taxonomy_assignements/unite_test_isos__BLAST.n*': No such file or directory
Combining Taxonomies
Traceback (most recent call last):
File "/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/CombineTaxonomy.py", line 532, in <module>
open(file_name,"r")
FileNotFoundError: [Errno 2] No such file or directory: '/database/UNITE/taxonomy_assignements/otu_taxonomy.rdp'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/CombineTaxonomy.py", line 534, in <module>
raise FileNotFoundError(F"{classifier.upper()} file could not be opened.")
FileNotFoundError: RDP file could not be opened.
Hello @ramiroricardo,
Can you please post what you have in your pathfile.txt? Also, it should be placed in your working directory, not in the conda environment. See the details at https://constax.readthedocs.io/en/latest/tutorial1.html
Hi @Gian77,
Thanks for your reply.
My pathfile, which was in the conda environment folders, contains the following:
export SINTAXPATH=vsearch
export RDPPATH=classifier
export CONSTAXPATH=/biotools/miniconda3/envs/constax2/opt/constax-2.0.9
Note that I did not modify this file; it was created along with the conda environment.
If I create a pathfile in the working directory as:
CONSTAXPATH=/biotools/miniconda3/envs/constax2/opt/constax-2.0.9
RDPPATH=/biotools/miniconda3/envs/constax2/bin/classifier
SINTAXPATH=/biotools/miniconda3/envs/constax2/bin/vsearch
and run:
constax \
--num_threads 10 \
--mem 32000 \
--db /database/UNITE/sh_general_release_04.02.2020/sh_general_release_dynamic_04.02.2020.fasta \
--train \
--input /database/UNITE/unite_test_query.fasta \
--input /database/UNITE/unite_test_query.fasta \
--isolates /database/UNITE/unite_test_isos.fasta \
--trainfile /database/UNITE/training_files \
--tax /database/UNITE/taxonomy_assignements \
--output /database/UNITE/taxonomy_assignements \
--blast \
--make_plot \
--conf 0.8 \
--make_plot \
--pathfile /database/UNITE/pathfile.txt
I get this:
Welcome to CONSTAX version 2.0.9 build 0 - The CONSensus TAXonomy classifier
This software is distributed under MIT License
© Copyright 2020, Julian A. Liber, Gian M. N. Benucci & Gregory M. Bonito
https://github.com/liberjul/CONSTAXv2
https://constax.readthedocs.io/
Please cite us as:
CONSTAX2: Improved taxonomic classification of environmental DNA markers
Julian Aaron Liber, Gregory Bonito, Gian Maria Niccolò Benucci
bioRxiv 2021.02.15.430803; doi: https://doi.org/10.1101/2021.02.15.430803
Training, with output to /database/UNITE/training_files...
Pathfile input not found in local directory ...
Pathfile input not found at /biotools/miniconda3/envs/constax2/pkgs/constax-2.0.9-0/opt/constax-2.0.9/pathfile.txt ...
Pathfile input not found at /biotools/miniconda3/envs/constax2/pkgs/constax-2.0.9-placeholder/opt/constax-2.0.9/pathfile.txt ...
Pathfile input found at /biotools/miniconda3/envs/constax2/opt/constax-2.0.9/pathfile.txt ...
All needed executables exist.
SINTAX: vsearch
RDP: classifier
CONSTAX: /biotools/miniconda3/envs/constax2/opt/constax-2.0.9
Memory size: 32000mb
Importing subscripts from /biotools/miniconda3/envs/constax2/opt/constax-2.0.9
____________________________________________________________________
Reformatting database
UNITE format detected
Reference database FASTAs formatted in 1.753519917 seconds...
Training Taxonomy
Adding Full Lineage
Database formatting complete
____________________________________________________________________
__________________________________________________________________________
Training SINTAX Classifier
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 247: vsearch: command not found
__________________________________________________________________________
Training BLAST Classifier
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 258: makeblastdb: command not found
__________________________________________________________________________
Training RDP Classifier
Error: Unable to access jarfile classifier
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 322: blastn: command not found
__________________________________________________________________________
Assigning taxonomy to OTU's representative sequences
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 334: vsearch: command not found
sed: can't read /database/UNITE/taxonomy_assignements/otu_taxonomy.sintax: No such file or directory
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 347: blastn: command not found
Error: Unable to access jarfile classifier
__________________________________________________________________________
Comparing to Isolates
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 373: makeblastdb: command not found
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 375: blastn: command not found
rm: cannot remove '/database/UNITE/taxonomy_assignements/unite_test_isos__BLAST.n*': No such file or directory
Combining Taxonomies
Traceback (most recent call last):
File "/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/CombineTaxonomy.py", line 532, in <module>
open(file_name,"r")
FileNotFoundError: [Errno 2] No such file or directory: '/database/UNITE/taxonomy_assignements/otu_taxonomy.rdp'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/CombineTaxonomy.py", line 534, in <module>
raise FileNotFoundError(F"{classifier.upper()} file could not be opened.")
FileNotFoundError: RDP file could not be opened.
/biotools/miniconda3/envs/constax2/opt/constax-2.0.9/constax_no_inputs.sh: line 410: Rscript: command not found
Besides, you have --make_plot twice in your script. Do you have vsearch in the same conda environment?
Please post the output of:
conda info --envs
conda activate <your constax environment>
conda list
Hi @Gian77,
Thanks for pointing out the error with --make_plot, but removing the duplication has no effect on the output.
Here are the outputs of the commands that you asked for:
conda info --envs
# conda environments:
#
base * /biotools/miniconda3
abricate /biotools/miniconda3/envs/abricate
aspera /biotools/miniconda3/envs/aspera
assemblers /biotools/miniconda3/envs/assemblers
bowtie2 /biotools/miniconda3/envs/bowtie2
checkm /biotools/miniconda3/envs/checkm
constax2 /biotools/miniconda3/envs/constax2
coverm /biotools/miniconda3/envs/coverm
fastp /biotools/miniconda3/envs/fastp
instrain /biotools/miniconda3/envs/instrain
iqtree /biotools/miniconda3/envs/iqtree
panacota /biotools/miniconda3/envs/panacota
parallel-fastq-dump /biotools/miniconda3/envs/parallel-fastq-dump
prodigal /biotools/miniconda3/envs/prodigal
py2 /biotools/miniconda3/envs/py2
quast /biotools/miniconda3/envs/quast
samtools /biotools/miniconda3/envs/samtools
trimal /biotools/miniconda3/envs/trimal
conda activate constax2
conda list
# packages in environment at /biotools/miniconda3/envs/constax2:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_gnu conda-forge
_r-mutex 1.0.1 anacondar_1 conda-forge
binutils_impl_linux-64 2.35.1 h193b22a_2 conda-forge
binutils_linux-64 2.35 h67ddf6f_30 conda-forge
blast 2.5.0 hc0b0e79_3 bioconda
boost 1.75.0 py39h5472131_0 conda-forge
boost-cpp 1.75.0 hc6e9bd1_0 conda-forge
bwidget 1.9.14 ha770c72_0 conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.17.1 h7f98852_1 conda-forge
ca-certificates 2020.12.5 ha878542_0 conda-forge
cairo 1.16.0 h6cf1ce9_1008 conda-forge
certifi 2020.12.5 py39hf3d152e_1 conda-forge
constax 2.0.9 hdfd78af_0 bioconda
curl 7.76.1 h979ede3_1 conda-forge
fontconfig 2.13.1 hba837de_1005 conda-forge
freetype 2.10.4 h0708190_1 conda-forge
fribidi 1.0.10 h36c2ea0_0 conda-forge
gcc_impl_linux-64 9.3.0 h70c0ae5_19 conda-forge
gcc_linux-64 9.3.0 hf25ea35_30 conda-forge
gettext 0.19.8.1 h0b5b191_1005 conda-forge
gfortran_impl_linux-64 9.3.0 hc4a2995_19 conda-forge
gfortran_linux-64 9.3.0 hdc58fab_30 conda-forge
graphite2 1.3.13 h58526e2_1001 conda-forge
gsl 2.6 he838d99_2 conda-forge
gxx_impl_linux-64 9.3.0 hd87eabc_19 conda-forge
gxx_linux-64 9.3.0 h3fbe746_30 conda-forge
harfbuzz 2.8.0 h83ec7ef_1 conda-forge
icu 68.1 h58526e2_0 conda-forge
jpeg 9d h36c2ea0_0 conda-forge
kernel-headers_linux-64 2.6.32 h77966d4_13 conda-forge
krb5 1.17.2 h926e7f8_0 conda-forge
ld_impl_linux-64 2.35.1 hea4e1c9_2 conda-forge
libblas 3.9.0 8_openblas conda-forge
libcblas 3.9.0 8_openblas conda-forge
libcurl 7.76.1 hc4aaa36_1 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libffi 3.3 h58526e2_2 conda-forge
libgcc-devel_linux-64 9.3.0 h7864c58_19 conda-forge
libgcc-ng 9.3.0 h2828fa1_19 conda-forge
libgfortran-ng 9.3.0 hff62375_19 conda-forge
libgfortran5 9.3.0 hff62375_19 conda-forge
libglib 2.68.1 h3e27bee_0 conda-forge
libgomp 9.3.0 h2828fa1_19 conda-forge
libiconv 1.16 h516909a_0 conda-forge
liblapack 3.9.0 8_openblas conda-forge
libnghttp2 1.43.0 h812cca2_0 conda-forge
libopenblas 0.3.12 pthreads_h4812303_1 conda-forge
libpng 1.6.37 h21135ba_2 conda-forge
libssh2 1.9.0 ha56f1ee_6 conda-forge
libstdcxx-devel_linux-64 9.3.0 hb016644_19 conda-forge
libstdcxx-ng 9.3.0 h6de172a_19 conda-forge
libtiff 4.2.0 hdc55705_1 conda-forge
libuuid 2.32.1 h7f98852_1000 conda-forge
libwebp-base 1.2.0 h7f98852_2 conda-forge
libxcb 1.13 h7f98852_1003 conda-forge
libxml2 2.9.10 h72842e0_4 conda-forge
lz4-c 1.9.3 h9c3ff4c_0 conda-forge
make 4.3 hd18ef5c_1 conda-forge
ncurses 6.2 h58526e2_4 conda-forge
numpy 1.20.2 py39hdbf815f_0 conda-forge
openjdk 8.0.282 h7f98852_0 conda-forge
openssl 1.1.1k h7f98852_0 conda-forge
pandas 1.2.4 py39hde0f152_0 conda-forge
pango 1.48.4 hb8ff022_0 conda-forge
pcre 8.44 he1b5a44_0 conda-forge
pcre2 10.36 h032f7d1_1 conda-forge
pip 21.0.1 pyhd8ed1ab_0 conda-forge
pixman 0.40.0 h36c2ea0_0 conda-forge
pthread-stubs 0.4 h36c2ea0_1001 conda-forge
python 3.9.2 hffdb5ce_0_cpython conda-forge
python-dateutil 2.8.1 py_0 conda-forge
python_abi 3.9 1_cp39 conda-forge
pytz 2021.1 pyhd8ed1ab_0 conda-forge
r-base 4.0.3 h349a78a_8 conda-forge
rdptools 2.0.3 hdfd78af_1 bioconda
readline 8.0 he28a2e2_2 conda-forge
sed 4.8 he412f7d_0 conda-forge
setuptools 49.6.0 py39hf3d152e_3 conda-forge
six 1.15.0 pyh9f0ad1d_0 conda-forge
sqlite 3.35.4 h74cdb3f_0 conda-forge
sysroot_linux-64 2.12 h77966d4_13 conda-forge
tk 8.6.10 h21135ba_1 conda-forge
tktable 2.10 hb7b940f_3 conda-forge
tzdata 2021a he74cb21_0 conda-forge
vsearch 2.17.0 h95f258a_1 bioconda
wheel 0.36.2 pyhd3deb0d_0 conda-forge
xorg-kbproto 1.0.7 h7f98852_1002 conda-forge
xorg-libice 1.0.10 h7f98852_0 conda-forge
xorg-libsm 1.2.3 hd9c2040_1000 conda-forge
xorg-libx11 1.7.0 h7f98852_0 conda-forge
xorg-libxau 1.0.9 h7f98852_0 conda-forge
xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge
xorg-libxext 1.3.4 h7f98852_1 conda-forge
xorg-libxrender 0.9.10 h7f98852_1003 conda-forge
xorg-libxt 1.2.1 h7f98852_2 conda-forge
xorg-renderproto 0.11.1 h7f98852_1002 conda-forge
xorg-xextproto 7.3.0 h7f98852_1002 conda-forge
xorg-xproto 7.0.31 h7f98852_1007 conda-forge
xz 5.2.5 h516909a_1 conda-forge
zlib 1.2.11 h516909a_1010 conda-forge
zstd 1.4.9 ha95c52a_0 conda-forge
Thanks again for your help
Hi @ramiroricardo,
It appears that the environment is not activated when subscripts are called, so I updated the constax_wrapper.py script to activate the conda environment in subprocesses. You can replace your script in /biotools/miniconda3/envs/constax2/opt/constax-2.0.9/ with this new one from https://github.com/liberjul/CONSTAXv2/blob/master/constax_wrapper.py. It should already be symbolically linked to your binary directory and able to be run without additional steps.
If you rerun constax after replacing this script, does it work? Please show the output if not.
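To see whether the activation actually reaches child shells, something like the following can help. It is a minimal sketch under stated assumptions: the helper name env_bin_on_path is hypothetical, CONDA_PREFIX is the variable that `conda activate` sets, and the fallback prefix is simply the environment path shown in the logs above.

```shell
# Hypothetical helper: succeed if <prefix>/bin is one of the entries in PATH.
env_bin_on_path() {
    case ":$PATH:" in
        *":$1/bin:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# CONDA_PREFIX is set by `conda activate`; the fallback is the environment
# path from this thread and is only an assumption for other machines.
prefix=${CONDA_PREFIX:-/biotools/miniconda3/envs/constax2}
if env_bin_on_path "$prefix"; then
    echo "$prefix/bin is on PATH"
else
    echo "$prefix/bin is NOT on PATH"
fi
```

Running the same check from inside a script started by a fresh non-interactive shell shows whether the environment carries over to subprocesses, which is where the "command not found" messages originate.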
Julian
Hi all,
Sorry for the late reply; I only managed to get back to this now. I reinstalled constax with conda and am now using constax version 2.0.13. I have also tried replacing constax_wrapper.py, but regardless of whether I keep the constax_wrapper.py file installed with conda or use the new one, I get the same result. So when I run:
constax --num_threads 10 --mem 32000 --db /database/UNITE/sh_general_release_04.02.2020/sh_general_release_dynamic_04.02.2020.fasta --train --input /database/UNITE/unite_test_query.fasta --isolates /database/UNITE/unite_test_isos.fasta --trainfile /database/UNITE/training_files --tax /database/UNITE/taxonomy_assignements --output /database/UNITE/taxonomy_assignements --blast --make_plot --conf 0.8 --pathfile /database/UNITE/pathfile.txt
I get the following errors:
Welcome to CONSTAX version 2.0.13 build 0 - The CONSensus TAXonomy classifier
This software is distributed under MIT License
© Copyright 2021, Julian A. Liber, Gian M. N. Benucci & Gregory M. Bonito
https://github.com/liberjul/CONSTAXv2
https://constax.readthedocs.io/
Please cite us as:
CONSTAX2: Improved taxonomic classification of environmental DNA markers
Julian Aaron Liber, Gregory Bonito, Gian Maria Niccolò Benucci
bioRxiv 2021.02.15.430803; doi: https://doi.org/10.1101/2021.02.15.430803
Overwritting previous classification...
Overwritting previous taxonomy assignments...
Performing training and overwritting training files...
Pathfile input not found in local directory ...
Pathfile input not found at /biotools/miniconda3/envs/constax2/pkgs/constax-2.0.13-0/opt/constax-2.0.13/pathfile.txt ...
Pathfile input not found at /biotools/miniconda3/envs/constax2/pkgs/constax-2.0.13-placeholder/opt/constax-2.0.13/pathfile.txt ...
Pathfile input found at /biotools/miniconda3/envs/constax2/opt/constax-2.0.13/pathfile.txt ...
All needed executables exist.
SINTAX: vsearch
RDP: classifier
CONSTAX: /biotools/miniconda3/envs/constax2/opt/constax-2.0.13
Memory size: 32000mb
Importing subscripts from /biotools/miniconda3/envs/constax2/opt/constax-2.0.13
____________________________________________________________________
Reformatting database
UNITE format detected
Reference database FASTAs formatted in 0.9438644399999999 seconds...
Training Taxonomy
Adding Full Lineage
Database formatting complete
____________________________________________________________________
__________________________________________________________________________
Training SINTAX Classifier
__________________________________________________________________________
Training BLAST Classifier
__________________________________________________________________________
Training RDP Classifier
Error: Unable to access jarfile classifier
__________________________________________________________________________
Assigning taxonomy to OTU's representative sequences
__________________________________________________________________________
Comparing to Isolates
Combining Taxonomies
bash: activate: No such file or directory
Welcome to CONSTAX version 2.0.13 build 0 - The CONSensus TAXonomy classifier
This software is distributed under MIT License
© Copyright 2021, Julian A. Liber, Gian M. N. Benucci & Gregory M. Bonito
https://github.com/liberjul/CONSTAXv2
https://constax.readthedocs.io/
Please cite us as:
CONSTAX2: Improved taxonomic classification of environmental DNA markers
Julian Aaron Liber, Gregory Bonito, Gian Maria Niccolò Benucci
bioRxiv 2021.02.15.430803; doi: https://doi.org/10.1101/2021.02.15.430803
Overwritting previous classification...
Overwritting previous taxonomy assignments...
Performing training and overwritting training files...
Pathfile input not found in local directory ...
Pathfile input not found at /biotools/miniconda3/envs/constax2/pkgs/constax-2.0.13-0/opt/constax-2.0.13/pathfile.txt ...
Pathfile input not found at /biotools/miniconda3/envs/constax2/pkgs/constax-2.0.13-placeholder/opt/constax-2.0.13/pathfile.txt ...
Pathfile input found at /biotools/miniconda3/envs/constax2/opt/constax-2.0.13/pathfile.txt ...
All needed executables exist.
SINTAX: vsearch
RDP: classifier
CONSTAX: /biotools/miniconda3/envs/constax2/opt/constax-2.0.13
Memory size: 32000mb
Importing subscripts from /biotools/miniconda3/envs/constax2/opt/constax-2.0.13
____________________________________________________________________
Reformatting database
UNITE format detected
Reference database FASTAs formatted in 0.8888717220000001 seconds...
Training Taxonomy
Adding Full Lineage
Database formatting complete
____________________________________________________________________
__________________________________________________________________________
Training SINTAX Classifier
/biotools/miniconda3/envs/constax2/opt/constax-2.0.13/constax_no_inputs.sh: line 252: vsearch: command not found
__________________________________________________________________________
Training BLAST Classifier
/biotools/miniconda3/envs/constax2/opt/constax-2.0.13/constax_no_inputs.sh: line 263: makeblastdb: command not found
__________________________________________________________________________
Training RDP Classifier
Error: Unable to access jarfile classifier
/biotools/miniconda3/envs/constax2/opt/constax-2.0.13/constax_no_inputs.sh: line 327: blastn: command not found
__________________________________________________________________________
Assigning taxonomy to OTU's representative sequences
/biotools/miniconda3/envs/constax2/opt/constax-2.0.13/constax_no_inputs.sh: line 339: vsearch: command not found
sed: can't read /database/UNITE/taxonomy_assignements/otu_taxonomy.sintax: No such file or directory
/biotools/miniconda3/envs/constax2/opt/constax-2.0.13/constax_no_inputs.sh: line 352: blastn: command not found
Error: Unable to access jarfile classifier
__________________________________________________________________________
Comparing to Isolates
/biotools/miniconda3/envs/constax2/opt/constax-2.0.13/constax_no_inputs.sh: line 378: makeblastdb: command not found
/biotools/miniconda3/envs/constax2/opt/constax-2.0.13/constax_no_inputs.sh: line 380: blastn: command not found
rm: cannot remove '/database/UNITE/taxonomy_assignements/unite_test_isos__BLAST.n*': No such file or directory
Combining Taxonomies
Traceback (most recent call last):
File "/biotools/miniconda3/envs/constax2/opt/constax-2.0.13/CombineTaxonomy.py", line 565, in <module>
open(file_name,"r")
FileNotFoundError: [Errno 2] No such file or directory: '/database/UNITE/taxonomy_assignements/otu_taxonomy.rdp'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/biotools/miniconda3/envs/constax2/opt/constax-2.0.13/CombineTaxonomy.py", line 567, in <module>
raise FileNotFoundError(F"{classifier.upper()} file could not be opened.")
FileNotFoundError: RDP file could not be opened.
/biotools/miniconda3/envs/constax2/opt/constax-2.0.13/constax_no_inputs.sh: line 421: Rscript: command not found
If you have any idea on what could be leading to this, I would like to keep trying to solve it. Thanks for your help
Hi all,
I have tried to run the same code in a different virtual machine and now it does seem to run. I am not sure what the problem was in the other VM. However, I am getting an error at the end. I am not sure, but it seems to me that constax is running twice, which might be leading to that final error.
So I ran:
constax \
--num_threads 10 \
--mem 32000 \
--db constax2/UNITE/sh_general_release_04.02.2020/sh_general_release_dynamic_04.02.2020.fasta \
--train \
--input constax2/UNITE/unite_test_query.fasta \
--input constax2/UNITE/unite_test_query.fasta \
--isolates constax2/UNITE/unite_test_isos.fasta \
--trainfile constax2/UNITE/training_files \
--tax constax2/UNITE/taxonomy_assignements \
--output constax2/UNITE/taxonomy_assignements \
--blast \
--make_plot \
--conf 0.8
and got:
Welcome to CONSTAX version 2.0.13 build 0 - The CONSensus TAXonomy classifier
This software is distributed under MIT License
© Copyright 2021, Julian A. Liber, Gian M. N. Benucci & Gregory M. Bonito
https://github.com/liberjul/CONSTAXv2
https://constax.readthedocs.io/
Please cite us as:
CONSTAX2: Improved taxonomic classification of environmental DNA markers
Julian Aaron Liber, Gregory Bonito, Gian Maria Niccolò Benucci
bioRxiv 2021.02.15.430803; doi: https://doi.org/10.1101/2021.02.15.430803
Training, with output to constax2/UNITE/training_files...
Pathfile input not found in local directory ...
Pathfile input not found at /software/miniconda3/envs/constax2/pkgs/constax-2.0.13-0/opt/constax-2.0.13/pathfile.txt ...
Pathfile input not found at /software/miniconda3/envs/constax2/pkgs/constax-2.0.13-placeholder/opt/constax-2.0.13/pathfile.txt ...
Pathfile input found at /software/miniconda3/envs/constax2/opt/constax-2.0.13/pathfile.txt ...
All needed executables exist.
SINTAX: vsearch
RDP: classifier
CONSTAX: /software/miniconda3/envs/constax2/opt/constax-2.0.13
Memory size: 32000mb
Importing subscripts from /software/miniconda3/envs/constax2/opt/constax-2.0.13
____________________________________________________________________
Reformatting database
UNITE format detected
Reference database FASTAs formatted in 0.6641772109999999 seconds...
Training Taxonomy
Adding Full Lineage
Database formatting complete
____________________________________________________________________
__________________________________________________________________________
Training SINTAX Classifier
__________________________________________________________________________
Training BLAST Classifier
Building a new DB, current time: 06/24/2021 16:03:20
New DB name: /home/rramiro/constax2/UNITE/training_files/sh_general_release_dynamic_04.02.2020__BLAST
New DB title: constax2/UNITE/training_files/sh_general_release_dynamic_04.02.2020__RDP_trained.fasta
Sequence type: Nucleotide
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 47741 sequences in 2.98552 seconds.
__________________________________________________________________________
Training RDP Classifier
edu.msu.cme.rdp.classifier.train.NameRankDupException: Error: duplicate taxon name and rank in the taxonomy file.
cylindrium genus 2
cenangiopsis genus 2
brevicollum genus 2
cryptococcus genus 2
aleurina genus 2
at edu.msu.cme.rdp.classifier.train.TreeFactory.creatTaxidMap(TreeFactory.java:126)
at edu.msu.cme.rdp.classifier.train.TreeFactory.<init>(TreeFactory.java:61)
at edu.msu.cme.rdp.classifier.train.ClassifierTraineeMaker.<init>(ClassifierTraineeMaker.java:63)
at edu.msu.cme.rdp.classifier.train.ClassifierTraineeMaker.main(ClassifierTraineeMaker.java:170)
at edu.msu.cme.rdp.classifier.cli.ClassifierMain.main(ClassifierMain.java:77)
RDP training error, redoing with duplicate taxa
Importing subscripts from /software/miniconda3/envs/constax2/opt/constax-2.0.13
____________________________________________________________________
Reformatting database
UNITE format detected
Reference database FASTAs formatted in 0.658936341 seconds...
Training Taxonomy
Adding Full Lineage
Database formatting complete
____________________________________________________________________
RDP training error overcome, continuing with classification after SINTAX is retrained
__________________________________________________________________________
Assigning taxonomy to OTU's representative sequences
__________________________________________________________________________
Comparing to Isolates
Building a new DB, current time: 06/24/2021 16:13:00
New DB name: /home/rramiro/constax2/UNITE/taxonomy_assignements/unite_test_isos__BLAST
New DB title: constax2/UNITE/taxonomy_assignements/isolates_formatted.fasta
Sequence type: Nucleotide
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 10 sequences in 0.00118494 seconds.
Combining Taxonomies
____________________________________________________________________
Reformatting RDP file
Done
Reformatting SINTAX file
Done
Reformatting BLAST file
Done
Reformatting isolate result file
Done
Generating consensus taxonomy & combined taxonomy table
Done
Generating classification counts & summary table
Done
____________________________________________________________________
V1
1 OTU_ID
2 Entoloma_vindobonense|JX454802|SH1569086.08FU|refs|k__Fungi;p__Basidiomycota;c__Agaricomycetes;o__Agaricales;f__Entolomataceae;g__Entoloma;s__Entoloma_vindobonense
3 Entolomataceae_sp|FR682185|SH1569069.08FU|reps|k__Fungi;p__Basidiomycota;c__Agaricomycetes;o__Agaricales;f__Entolomataceae;g__unidentified;s__Entolomataceae_sp
4 Entoloma_pallescens|UDB025007|SH1569094.08FU|reps|k__Fungi;p__Basidiomycota;c__Agaricomycetes;o__Agaricales;f__Entolomataceae;g__Entoloma;s__Entoloma_pallescens
5 Entolomataceae_sp|UDB0729740|SH1569083.08FU|reps|k__Fungi;p__Basidiomycota;c__Agaricomycetes;o__Agaricales;f__Entolomataceae;g__unidentified;s__Entolomataceae_sp
6 Entoloma_byssisedum|UDB015478|SH1569062.08FU|refs|k__Fungi;p__Basidiomycota;c__Agaricomycetes;o__Agaricales;f__Entolomataceae;g__Entoloma;s__Entoloma_byssisedum
V2 V3 V4 V5 V6
1 Kingdom Phylum Class Order Family
2 Fungi_1 Basidiomycota_1 Agaricomycetes_1 Agaricales_1 Entolomataceae_1
3 Fungi_1 Basidiomycota_1 Agaricomycetes_1 Agaricales_1 Entolomataceae_1
4 Fungi_1 Basidiomycota_1 Agaricomycetes_1 Agaricales_1 Entolomataceae_1
5 Fungi_1 Basidiomycota_1 Agaricomycetes_1 Agaricales_1 Entolomataceae_1
6 Fungi_1 Basidiomycota_1 Agaricomycetes_1 Agaricales_1 Entolomataceae_1
V7 V8 V9
1 Genus Species Isolate
2 Entoloma_1 Entoloma vindobonense Entoloma_vindobonense|JX454802
3 Entoloma_1 Entolomataceae_sp|FR682185
4 Entoloma_1 Entoloma pallescens Entoloma_pallescens|UDB025007
5 Entolomataceae_sp|UDB0729740
6 Entoloma_1 Entoloma byssisedum Entoloma_byssisedum|UDB015478
V10 V11
1 Isolate_percent_id Isolate_query_cover
2 100.000 100
3 100.000 100
4 100.000 100
5 100.000 100
6 100.000 100
user system elapsed
0.003 0.000 0.003
bash: activate: No such file or directory
Welcome to CONSTAX version 2.0.13 build 0 - The CONSensus TAXonomy classifier
This software is distributed under MIT License
© Copyright 2021, Julian A. Liber, Gian M. N. Benucci & Gregory M. Bonito
https://github.com/liberjul/CONSTAXv2
https://constax.readthedocs.io/
Please cite us as:
CONSTAX2: Improved taxonomic classification of environmental DNA markers
Julian Aaron Liber, Gregory Bonito, Gian Maria Niccolò Benucci
bioRxiv 2021.02.15.430803; doi: https://doi.org/10.1101/2021.02.15.430803
Overwritting previous classification...
Overwritting previous taxonomy assignments...
Performing training and overwritting training files...
Pathfile input not found in local directory ...
Pathfile input not found at /software/miniconda3/envs/constax2/pkgs/constax-2.0.13-0/opt/constax-2.0.13/pathfile.txt ...
Pathfile input not found at /software/miniconda3/envs/constax2/pkgs/constax-2.0.13-placeholder/opt/constax-2.0.13/pathfile.txt ...
Pathfile input found at /software/miniconda3/envs/constax2/opt/constax-2.0.13/pathfile.txt ...
All needed executables exist.
SINTAX: vsearch
RDP: classifier
CONSTAX: /software/miniconda3/envs/constax2/opt/constax-2.0.13
Memory size: 32000mb
Importing subscripts from /software/miniconda3/envs/constax2/opt/constax-2.0.13
____________________________________________________________________
Reformatting database
UNITE format detected
Reference database FASTAs formatted in 0.667302839 seconds...
Training Taxonomy
Adding Full Lineage
Database formatting complete
____________________________________________________________________
__________________________________________________________________________
Training SINTAX Classifier
vsearch v2.17.1_linux_x86_64, 115.3GB RAM, 20 cores
https://github.com/torognes/vsearch
Reading file constax2/UNITE/training_files/sh_general_release_dynamic_04.02.2020__UTAX.fasta 100%
26124492 nt in 47741 seqs, min 141, max 3526, avg 547
Masking 100%
Counting k-mers 100%
Creating k-mer index 100%
Writing UDB file 100%
__________________________________________________________________________
Training BLAST Classifier
Building a new DB, current time: 06/24/2021 16:13:17
New DB name: /home/rramiro/constax2/UNITE/training_files/sh_general_release_dynamic_04.02.2020__BLAST
New DB title: constax2/UNITE/training_files/sh_general_release_dynamic_04.02.2020__RDP_trained.fasta
Sequence type: Nucleotide
Deleted existing Nucleotide BLAST database named /home/rramiro/constax2/UNITE/training_files/sh_general_release_dynamic_04.02.2020__BLAST
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 47741 sequences in 3.02434 seconds.
__________________________________________________________________________
Training RDP Classifier
edu.msu.cme.rdp.classifier.train.NameRankDupException: Error: duplicate taxon name and rank in the taxonomy file.
cylindrium genus 2
cenangiopsis genus 2
brevicollum genus 2
cryptococcus genus 2
aleurina genus 2
at edu.msu.cme.rdp.classifier.train.TreeFactory.creatTaxidMap(TreeFactory.java:126)
at edu.msu.cme.rdp.classifier.train.TreeFactory.<init>(TreeFactory.java:61)
at edu.msu.cme.rdp.classifier.train.ClassifierTraineeMaker.<init>(ClassifierTraineeMaker.java:63)
at edu.msu.cme.rdp.classifier.train.ClassifierTraineeMaker.main(ClassifierTraineeMaker.java:170)
at edu.msu.cme.rdp.classifier.cli.ClassifierMain.main(ClassifierMain.java:77)
RDP training error, redoing with duplicate taxa
Importing subscripts from /software/miniconda3/envs/constax2/opt/constax-2.0.13
____________________________________________________________________
Reformatting database
UNITE format detected
Reference database FASTAs formatted in 0.654153012 seconds...
Training Taxonomy
Adding Full Lineage
Database formatting complete
____________________________________________________________________
RDP training error overcome, continuing with classification after SINTAX is retrained
vsearch v2.17.1_linux_x86_64, 115.3GB RAM, 20 cores
https://github.com/torognes/vsearch
Reading file constax2/UNITE/training_files/sh_general_release_dynamic_04.02.2020__UTAX.fasta 100%
26124492 nt in 47741 seqs, min 141, max 3526, avg 547
Masking 100%
Counting k-mers 100%
Creating k-mer index 100%
Writing UDB file 100%
__________________________________________________________________________
Assigning taxonomy to OTU's representative sequences
vsearch v2.17.1_linux_x86_64, 115.3GB RAM, 20 cores
https://github.com/torognes/vsearch
Reading UDB file constax2/UNITE/training_files/sintax.db 100%
Reorganizing data in memory 100%
Creating bitmaps 100%
Parsing abundances 100%
26124492 nt in 47741 seqs, min 141, max 3526, avg 547
Classifying sequences 100%
Classified 10 of 10 sequences (100.00%)
__________________________________________________________________________
Comparing to Isolates
Building a new DB, current time: 06/24/2021 16:22:46
New DB name: /home/rramiro/constax2/UNITE/taxonomy_assignements/unite_test_isos__BLAST
New DB title: constax2/UNITE/taxonomy_assignements/isolates_formatted.fasta
Sequence type: Nucleotide
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 10 sequences in 0.0012219 seconds.
Combining Taxonomies
____________________________________________________________________
Reformatting RDP file
Done
Reformatting SINTAX file
Done
Reformatting BLAST file
Done
Reformatting isolate result file
Done
Generating consensus taxonomy & combined taxonomy table
Done
Generating classification counts & summary table
Done
____________________________________________________________________
Loading required package: ggplot2
V1
1 OTU_ID
2 Entoloma_vindobonense|JX454802|SH1569086.08FU|refs|k__Fungi;p__Basidiomycota;c__Agaricomycetes;o__Agaricales;f__Entolomataceae;g__Entoloma;s__Entoloma_vindobonense
3 Entolomataceae_sp|FR682185|SH1569069.08FU|reps|k__Fungi;p__Basidiomycota;c__Agaricomycetes;o__Agaricales;f__Entolomataceae;g__unidentified;s__Entolomataceae_sp
4 Entoloma_pallescens|UDB025007|SH1569094.08FU|reps|k__Fungi;p__Basidiomycota;c__Agaricomycetes;o__Agaricales;f__Entolomataceae;g__Entoloma;s__Entoloma_pallescens
5 Entolomataceae_sp|UDB0729740|SH1569083.08FU|reps|k__Fungi;p__Basidiomycota;c__Agaricomycetes;o__Agaricales;f__Entolomataceae;g__unidentified;s__Entolomataceae_sp
6 Entoloma_byssisedum|UDB015478|SH1569062.08FU|refs|k__Fungi;p__Basidiomycota;c__Agaricomycetes;o__Agaricales;f__Entolomataceae;g__Entoloma;s__Entoloma_byssisedum
V2 V3 V4 V5 V6
1 Kingdom Phylum Class Order Family
2 Fungi_1 Basidiomycota_1 Agaricomycetes_1 Agaricales_1 Entolomataceae_1
3 Fungi_1 Basidiomycota_1 Agaricomycetes_1 Agaricales_1 Entolomataceae_1
4 Fungi_1 Basidiomycota_1 Agaricomycetes_1 Agaricales_1 Entolomataceae_1
5 Fungi_1 Basidiomycota_1 Agaricomycetes_1 Agaricales_1 Entolomataceae_1
6 Fungi_1 Basidiomycota_1 Agaricomycetes_1 Agaricales_1 Entolomataceae_1
V7 V8 V9
1 Genus Species Isolate
2 Entoloma_1 Entoloma vindobonense Entoloma_vindobonense|JX454802
3 Entoloma_1 Entolomataceae_sp|FR682185
4 Entoloma_1 Entoloma pallescens Entoloma_pallescens|UDB025007
5 Entolomataceae_sp|UDB0729740
6 Entoloma_1 Entoloma byssisedum Entoloma_byssisedum|UDB015478
V10 V11
1 Isolate_percent_id Isolate_query_cover
2 100.000 100
3 100.000 100
4 100.000 100
5 100.000 100
6 100.000 100
user system elapsed
0.002 0.000 0.003
Error in `$<-.data.frame`(`*tmp*`, Classifier, value = c("RDP", "BLAST", :
replacement has 28 rows, data has 11
Calls: $<- -> $<-.data.frame
Execution halted
Also, in the tutorial, you show a consensus_taxonomy.txt file, which I don't see in my results. Was this replaced by the constax_taxonomy.txt output?
Best, Ramiro
Hi Ramiro,
I am currently working on a fix for some of the pathfile issues which seem to be present. Also, I need to update the tutorial to reflect that we changed consensus_taxonomy.txt to constax_taxonomy.txt.
It probably ran twice because an error was detected, but this error was overcome in the script. I will work to make the double run not occur in that case.
Thank you again for the feedback. I will post another update shortly so you can run yours on a local machine.
Julian
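For anyone who wants to see where the NameRankDupException in the log comes from, here is a minimal Python sketch. It assumes the exception means that the same genus name appears under more than one parent lineage in the reference, which is an interpretation of the log and not something confirmed in this thread. The function name `duplicated_names` and the placeholder lineages (`Genusx`, `FamilyA`, `FamilyB`) are made up for illustration; only the `k__...;s__...` lineage layout is taken from the output above.

```python
from collections import defaultdict


def duplicated_names(lineages, rank_prefix="g__"):
    """Return names at one rank that occur under more than one parent lineage.

    Hypothetical helper: not part of CONSTAX or the RDP classifier.
    """
    parents = defaultdict(set)
    for lineage in lineages:
        ranks = lineage.strip().split(";")
        for i, field in enumerate(ranks):
            if field.startswith(rank_prefix):
                name = field[len(rank_prefix):].lower()
                # "unidentified" is a placeholder, not a real taxon name
                if name != "unidentified":
                    parents[name].add(";".join(ranks[:i]))
    return sorted(name for name, seen in parents.items() if len(seen) > 1)


# The first lineage is copied from the log; the Genusx lineages are invented.
example = [
    "k__Fungi;p__Basidiomycota;c__Agaricomycetes;o__Agaricales;f__Entolomataceae;g__Entoloma;s__Entoloma_pallescens",
    "k__Fungi;p__PhylumA;c__ClassA;o__OrderA;f__FamilyA;g__Genusx;s__Genusx_sp",
    "k__Fungi;p__PhylumB;c__ClassB;o__OrderB;f__FamilyB;g__Genusx;s__Genusx_sp",
]
print(duplicated_names(example))
```

Running this over the lineage part of the reference FASTA headers should list candidates comparable to the names the RDP trainer printed, assuming the interpretation above holds.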
It appears that running twice was not caused by the RDP duplicate taxa error, but instead something else. Could you upload the contents of the log file found in your working directory? It should be named log_constax2_<year>-<month>-<day>_<hr>-<min>-<sec>.txt
I'll also look into the error in the plotting script.
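If several runs have left logs behind, a small Python sketch like the following can pick out the most recent one. It relies on the file name pattern given above, and on the assumption that the timestamp fields are zero-padded so that names sort chronologically; `latest_constax_log` is a hypothetical helper name, not something shipped with CONSTAX.

```python
from pathlib import Path


def latest_constax_log(directory="."):
    """Return the newest log_constax2_*.txt in `directory`, or None if absent.

    Assumes zero-padded timestamps, so lexicographic order is time order.
    """
    logs = sorted(Path(directory).glob("log_constax2_*.txt"))
    return logs[-1] if logs else None


if __name__ == "__main__":
    log = latest_constax_log()
    print(log if log else "No CONSTAX log found in the working directory")
```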
Hi Julian,
I was just testing, and the --make_plot option appears to be what causes the double run. If I run constax without it, it runs once with no errors.
I am attaching two log files, from runs with and without this option. The later one is from the run without the option.
log_constax2_2021-06-24_17-27-05.txt log_constax2_2021-06-24_17-28-57.txt
Thanks for sending those. I fixed the Rscript (new one below), and will push a new version once I get these path issues worked out.
###############################################
# Taxonomy Assignment Comparison #
# Gian MN Benucci #
# benucci[at]msu.edu #
###############################################

# Load ggplot2, installing it first if it is not available
if (!require(ggplot2)) {
  install.packages("ggplot2")
  library(ggplot2)
}

# Command-line arguments: output directory, BLAST flag, database format
args <- commandArgs(trailingOnly=TRUE)
output_dir <- args[1]
blast <- as.logical(args[2])
format <- args[3]

comb_tax <- read.table(paste(output_dir, "combined_taxonomy.txt", sep=""), header=TRUE, row.names=1, sep="\t")
head(comb_tax)

# Treat empty cells as unassigned, then count unassigned OTUs per column
system.time(comb_tax[comb_tax==''|comb_tax==' '] <- NA)
sapply(comb_tax, function(x) sum(is.na(x))) -> unassigned_comb_tax
comb_tax_df <- as.data.frame(unassigned_comb_tax)

# Label each column with its classifier and rank.
# Assigned = |unassigned - total OTUs|, i.e. the number of classified OTUs.
# Factor levels must match the labels used in rep(), otherwise they become NA.
if (format == "UNITE") {
  # UNITE format: seven ranks
  if (blast) {
    comb_tax_df$Classifier <- rep(c("RDP", "BLAST", "SINTAX", "CONSTAX"), 7)
    comb_tax_df$Rank <- row.names(comb_tax_df)
    comb_tax_df$Assigned <- sqrt((comb_tax_df$unassigned_comb_tax - nrow(comb_tax))^2)
    comb_tax_df
    comb_tax_df$Classifier <- factor(comb_tax_df$Classifier, levels = c("RDP", "BLAST", "SINTAX", "CONSTAX"))
  } else {
    comb_tax_df$Classifier <- rep(c("RDP", "SINTAX", "UTAX", "CONSTAX"), 7)
    comb_tax_df$Rank <- row.names(comb_tax_df)
    comb_tax_df$Assigned <- sqrt((comb_tax_df$unassigned_comb_tax - nrow(comb_tax))^2)
    comb_tax_df
    comb_tax_df$Classifier <- factor(comb_tax_df$Classifier, levels = c("RDP", "UTAX", "SINTAX", "CONSTAX"))
  }
} else {
  # Other formats: number of ranks derived from the table
  rank_count <- (dim(comb_tax)[1]-1)/3
  if (blast) {
    comb_tax_df$Classifier <- rep(c("RDP", "BLAST", "SINTAX", "CONSENSUS"), rank_count)
    comb_tax_df$Rank <- row.names(comb_tax_df)
    comb_tax_df$Assigned <- sqrt((comb_tax_df$unassigned_comb_tax - nrow(comb_tax))^2)
    comb_tax_df
    comb_tax_df$Classifier <- factor(comb_tax_df$Classifier, levels = c("RDP", "BLAST", "SINTAX", "CONSENSUS"))
  } else {
    comb_tax_df$Classifier <- rep(c("RDP", "SINTAX", "UTAX", "CONSENSUS"), rank_count)
    comb_tax_df$Rank <- row.names(comb_tax_df)
    comb_tax_df$Assigned <- sqrt((comb_tax_df$unassigned_comb_tax - nrow(comb_tax))^2)
    comb_tax_df
    comb_tax_df$Classifier <- factor(comb_tax_df$Classifier, levels = c("RDP", "UTAX", "SINTAX", "CONSENSUS"))
  }
}

# Stacked bar plot of classified OTUs per rank, written to a PDF
pdf(paste(output_dir, "TaxonomicAssignmentComparison_plot.pdf", sep=""))
ggplot(comb_tax_df, aes(x = Rank, y = Assigned, fill = Classifier)) +
  geom_bar(stat = "identity") +
  scale_x_discrete(limits=comb_tax_df$Rank) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1),
        panel.grid=element_blank(),
        panel.background=element_blank()) +
  #scale_fill_manual(values=mycols) +
  ggtitle("Taxonomy Assignments Comparison") +
  labs(x="Taxonomic Ranks", y="Number of classified OTUs") +
  theme(axis.text.x = element_text(vjust=0.5, size=8)) +
  theme(axis.text.y = element_text(hjust=0.5, size=8)) +
  theme(plot.title = element_text(size = 15, face = "bold", hjust = 0.5))
dev.off()
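For context on the halted run: "replacement has 28 rows, data has 11" is what R reports when a label vector of 4 classifiers x 7 ranks = 28 entries is assigned to a data frame with 11 rows. The Python sketch below mirrors that length check so the mismatch is reported before any labels are assigned. It is only an illustration of the arithmetic, not part of the plotting script, and `classifier_labels` is a hypothetical name.

```python
def classifier_labels(n_columns, classifiers, n_ranks):
    """Build one label per column, failing early if the count does not match.

    Mirrors rep(c(...), n_ranks) in R: the classifier list repeated n_ranks times.
    """
    labels = list(classifiers) * n_ranks
    if len(labels) != n_columns:
        raise ValueError(
            f"expected {len(labels)} columns "
            f"({len(classifiers)} classifiers x {n_ranks} ranks), found {n_columns}"
        )
    return labels


if __name__ == "__main__":
    names = ["RDP", "BLAST", "SINTAX", "CONSTAX"]
    print(len(classifier_labels(28, names, 7)))  # matching table
    try:
        classifier_labels(11, names, 7)  # the situation in the log above
    except ValueError as err:
        print(err)
```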
Hi,
I have a similar issue, but my log file has slightly different error messages, namely:
/opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/constax_no_inputs.sh: line 127: blastn: command not found
Traceback (most recent call last):
File "/opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/check_input_names.py", line 8, in
rm: : No such file or directory
Traceback (most recent call last):
File "/opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/CombineTaxonomy.py", line 15, in
It is odd because outside of constax it is possible to run all of the mentioned modules.
This is the command I used: constax --num_threads 8 --db sh_general_release_dynamic_10.05.2021.fasta --trainfile training_files/ --input otus.fasta --tax taxonomy_assignements/ --output taxonomy_assignements/ --conf 0.8 --blast
And this is what the Terminal prints out:
Overwritting previous classification...
Overwritting previous taxonomy assignments...
Classifying without training...
SINTAX executable does not match the executable used to generate the training files, if SINTAX error occurs, change your executable or use -t flag.
Using the user-supplied pathfile at /opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/pathfile.txt
All needed executables exist.
SINTAX: vsearch
RDP: classifier
CONSTAX: /opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0
Memory size: 32000mb
Assigning taxonomy to OTU's representative sequences Input FASTA:
Combining Taxonomies
In the end, the BLAST output files in the taxonomy_assignements folder are empty, and the rdp_train.out file says: The operation couldn’t be completed. Unable to locate a Java Runtime.
So I guess the issue relates to my Apple M1 chip? Is there any solution to this?
Hi @mtva0001, Thanks for reaching out with this issue. There's something wrong with how paths are being interpreted, but I don't currently have a fix for this. One possible but imperfect solution is to try installing constax in the base environment, but I understand if you don't want to do this. I'll try to get back to you soon with some potential fixes.
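In the meantime, it may help to see which of the external tools the shell can resolve, once with the environment activated and once without. The following is only a minimal sketch: check_tools is a hypothetical helper, not part of CONSTAX, and the tool names in the usage comment are taken from the error messages in this thread.

```shell
#!/usr/bin/env bash
# check_tools is a hypothetical helper, not part of CONSTAX.
# For each tool name given, report whether the shell can resolve it on PATH.
check_tools() {
  local missing=0
  local tool
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      printf '%s\tfound\t%s\n' "$tool" "$(command -v "$tool")"
    else
      printf '%s\tMISSING\n' "$tool"
      missing=$((missing + 1))
    fi
  done
  # Exit status is non-zero if at least one tool was not found.
  [ "$missing" -eq 0 ]
}

# Example usage (run inside and outside the activated environment):
# check_tools vsearch classifier makeblastdb blastn Rscript java
```

Note that on macOS a java entry can be found on PATH even when no runtime is installed, so running java -version is a more reliable check for that one.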
Hi @mtva0001,
I may have a fix, which involves changes to the constax_wrapper.py script. You can directly overwrite the script located at /opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/constax_wrapper.py with the newest pushed one, downloaded from https://raw.githubusercontent.com/liberjul/CONSTAXv2/master/constax_wrapper.py. Let me know if this works!
Thanks a lot for your quick help! I did as you suggested. The output is different this time, but it is still not running properly:
Command I ran: constax --num_threads 8 --db sh_general_release_dynamic_10.05.2021.fasta --trainfile ./training_files/ --input otus.fasta --tax ./taxonomy_assignements/ --output ./taxonomy_assignements/ --conf 0.8 --blast --make_plot
The log file:
usage: grep [-abcDEFGHhIiJLlmnOoqRSsUVvwxZ] [-A num] [-B num] [-C[num]] [-e pattern] [-f file] [--binary-files=value] [--color=when] [--context[=num]] [--directories=action] [--label] [--line-buffered] [--null] [pattern] [file ...]
vsearch v2.16.0_macos_x86_64, 16.0GB RAM, 8 cores https://github.com/torognes/vsearch
Reading file ./training_files//sh_general_release_dynamic_10.05.2021__UTAX.fasta 100%
29627932 nt in 58440 seqs, min 140, max 4921, avg 507
Masking 100%
Counting k-mers 100%
Creating k-mer index 100%
Writing UDB file 100%
/opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/constax_no_inputs.sh: line 267: makeblastdb: command not found
/opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/constax_no_inputs.sh: line 331: blastn: command not found
Traceback (most recent call last):
File "/opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/check_input_names.py", line 8, in
Reading UDB file ./training_files//sintax.db 100% Reorganizing data in memory 100% Creating bitmaps 100% Parsing abundances 100% 29627932 nt in 58440 seqs, min 140, max 4921, avg 507
Fatal error: Unable to open file for reading ()
Traceback (most recent call last):
File "/opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/split_inputs.py", line 20, in
rm: : No such file or directory
Traceback (most recent call last):
File "/opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/CombineTaxonomy.py", line 15, in
Hi @mtva0001, I pushed new versions of the constax_wrapper.py and constax_no_inputs.sh scripts. At least one of your errors could be traced to OSX-incompatible grep -P commands, which have been replaced. Try downloading and replacing these scripts in your /opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/ directory and let me know how it goes.
Also, it would be helpful to see the packages installed both inside and outside of your CONSTAX environment. You can print these with conda list, and upload them as files because they may be very long.
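For context on the grep -P point: the -P (Perl regex) option is a GNU grep extension, and the grep shipped with macOS prints its usage message instead, as in the log above. The sketch below shows one portable way to do this kind of extraction with sed. The sample line and the pattern are illustrative assumptions, not the actual expressions used in constax_no_inputs.sh.

```shell
#!/usr/bin/env bash
# Sample line in the style of the CONSTAX console output (illustrative only).
line='Memory size: 32000mb'

# A GNU-only form such as  grep -oP '[0-9]+(?=mb)'  fails with BSD/macOS grep.
# Portable alternative: sed with a POSIX basic regular expression.
mem_mb="$(printf '%s\n' "$line" | sed -n 's/.*Memory size: \([0-9][0-9]*\)mb.*/\1/p')"
echo "$mem_mb"
```

The sed form works with both GNU and BSD sed because it relies only on basic regular expressions and a capture group.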
Hi! Thanks again for your quick response! This time I got this:
vsearch v2.16.0_macos_x86_64, 16.0GB RAM, 8 cores https://github.com/torognes/vsearch
Reading file ./training_files//sh_general_release_dynamic_10.05.2021__UTAX.fasta 100%
29627932 nt in 58440 seqs, min 140, max 4921, avg 507
Masking 100%
Counting k-mers 100%
Creating k-mer index 100%
Writing UDB file 100%
/opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/constax_no_inputs.sh: line 267: makeblastdb: command not found
/opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/constax_no_inputs.sh: line 331: blastn: command not found
Traceback (most recent call last):
File "/opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/check_input_names.py", line 8, in
Reading UDB file ./training_files//sintax.db 100% Reorganizing data in memory 100% Creating bitmaps 100% Parsing abundances 100% 29627932 nt in 58440 seqs, min 140, max 4921, avg 507
Fatal error: Unable to open file for reading ()
Traceback (most recent call last):
File "/opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/split_inputs.py", line 20, in
rm: : No such file or directory
Traceback (most recent call last):
File "/opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/CombineTaxonomy.py", line 15, in
Conda lists: conda_list.txt
Hello @mtva0001,
I made an additional edit to the constax_wrapper.py script to hopefully fix the module load errors. I am still not sure why blastn, java, and other non-Python commands fail, but will hopefully figure it out soon. I also updated the ComparisonBars.R script to fix the package installation error. Update the scripts using the same link as above: https://github.com/liberjul/CONSTAXv2/issues/3#issuecomment-950051122
Hi,
Sorry for the late reply. I just tried it but it gives me the same error:
/opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/constax_no_inputs.sh: line 256: vsearch: command not found
/opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/constax_no_inputs.sh: line 267: makeblastdb: command not found
/opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/constax_no_inputs.sh: line 331: blastn: command not found
Traceback (most recent call last):
File "/opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/check_input_names.py", line 8, in
rm: : No such file or directory
Traceback (most recent call last):
File "/opt/anaconda3/envs/CONSTAX/opt/constax-2.0.15-0/CombineTaxonomy.py", line 15, in
Dear all,
I have a similar issue. I was running:
constax \
--num_threads 12 \
--mem 32000 \
--db /home/groups/fukamit/ytwu/Wu_CA_oak_pilot/02_Analysis/itsxpress_dada2_FungalTraits/sh_general_release_dynamic_29.11.2022.fasta \
--train \
--input /home/groups/fukamit/ytwu/Wu_CA_oak_pilot/02_Analysis/itsxpress_dada2_FungalTraits/ASV_nochim_fun.fasta \
--isolates /home/groups/fukamit/ytwu/Wu_CA_oak_pilot/02_Analysis/itsxpress_dada2_FungalTraits/sh_general_release_dynamic_29.11.2022.fasta \
--trainfile training_files/ \
--tax taxonomy_assignements/ \
--output taxonomy_assignements/ \
--conf 0.8 \
--blast \
--pathfile pathfile.txt
And here is the error from the log file:
Traceback (most recent call last):
File "/home/groups/fukamit/ytwu/software/miniconda3/envs/constax/opt/constax-2.0.18-0/CombineTaxonomy.py", line 576, in
I would greatly appreciate some help! Thank you in advance.
This is what my FASTA file looks like. Is the problem caused by how I name the sequences? Should I change all of them to OTU_xx?
Hi @YingtongAamandaWu,
I don't believe the sequence headers are the issue; instead, the RDP classification output may not be consistent with the expected format. Can you upload taxonomy_assignements/otu_taxonomy.rdp?
@liberjul
Thanks for the timely response. On my side, otu_taxonomy.rdp is an empty file. I am sending the whole taxonomy_assignements folder and the log file, so that you can check the details. Thank you again!
BTW, rdp_train.out is also an empty file from my side.
taxonomy_assignements.zip
log_constax2_2022-12-30_13-01-03.txt
Hi @YingtongAamandaWu,
It appears that one of the files produced by RDP when training was not present at the time of classification. You will need to retrain the classifier, using -t/--train. If you trained the classifiers earlier, it is possible that the RDP training failed, which is usually due to insufficient memory. I have not yet trained on the newest release, so I am unsure of the memory requirement, but I would estimate that 64 GB would be sufficient.
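After retraining, a quick check for empty output files can catch a silently failed RDP step before the taxonomies are combined. This is a minimal sketch: report_empty is a hypothetical helper, not part of CONSTAX, and the path to rdp_train.out in the usage comment is a placeholder because its location depends on your run.

```shell
#!/usr/bin/env bash
# report_empty is a hypothetical helper, not part of CONSTAX.
# Prints every given path that is missing or zero bytes long,
# and returns 1 if any such path was found.
report_empty() {
  local status=0
  local f
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      printf 'empty or missing: %s\n' "$f"
      status=1
    fi
  done
  return "$status"
}

# Example usage with the file names mentioned in this thread
# (replace path/to with wherever rdp_train.out is written in your run):
# report_empty taxonomy_assignements/otu_taxonomy.rdp path/to/rdp_train.out
```

If either file is reported, rerun with -t/--train and a larger --mem value before classifying again.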
Yes, that was exactly why. I used 8GB at first, and then I used 128GB and it ran. Thank you for helping!
Dear all,
I have been trying to test CONSTAX, but I have run into a problem that I think might be related to the pathfile, and I have not been able to figure out a solution. I installed constax in a conda environment using the commands given in the instructions, and I have constax v2.0.9 installed on an Ubuntu server (v18.04.5 LTS).
I have tried to run constax both with and without indicating the pathfile that is located in the conda environment (the result is the same). My code looks like:
and I get the following output:
Perhaps there is some simple mistake that I am making?
Thanks for any help.