Open ramnageena11 opened 1 year ago
Hi All, Pls comment and suggest.
Thanks RNS
What is the output of command -v dustmasker?
Pls find attached the screenshot of the status: [image: image.png]
Please suggest: shall I kill the command?
Thanks and regards,
Ram
Ram Nageena Singh, Ph.D. (Microbiology)
On Tue, Aug 2, 2022 at 7:49 AM fanninpm @.***> wrote:
What is the output of command -v dustmasker?
The attachment was scrubbed. Please log in to GitHub to attach it. Alternatively, you can copy and paste the text output from the terminal (the trick is to add the SHIFT key when copying/pasting from a terminal application).
Hi, I have made a query thread on Github page (#242). Pls see that.
Thanks rgds Ram Ram Nageena Singh, Ph.D (Microbiology)
On Tue, Aug 2, 2022 at 11:33 AM fanninpm @.***> wrote:
The attachment was scrubbed. Please log in to GitHub to attach it. Alternatively, you can copy and paste the text output from the terminal (the trick is to add the SHIFT key when copying/pasting from a terminal application).
Hi, please see the terminal status below. It has now been running for more than 15 days.
Progress : [#######---------------------------------] 19% 5462/27880environment: line 28: dustmasker: command not found
Progress : [#######---------------------------------] 19% 5463/27880environment: line 28: dustmasker: command not found
Progress : [#######---------------------------------] 19% 5464/27880environment: line 28: dustmasker: command not found
Progress : [#######---------------------------------] 19% 5465/27880environment: line 28: dustmasker: command not found
Progress : [#######---------------------------------] 19% 5466/27880environment: line 28: dustmasker: command not found
Progress : [#######---------------------------------] 19% 5467/27880environment: line 28: dustmasker: command not found
Progress : [#######---------------------------------] 19% 5468/27880environment: line 28: dustmasker: command not found
Progress : [#######---------------------------------] 19% 5469/27880environment: line 28: dustmasker: command not found
Progress : [#######---------------------------------] 19% 5470/27880environment: line 28: dustmasker: command not found
Progress : [#######---------------------------------] 19% 5471/27880environment: line 28: dustmasker: command not found
Progress : [#######---------------------------------] 19% 5472/27880environment: line 28: dustmasker: command not found
Feel free to kill the process and add dustmasker.
Hi, the following command is running:
centrifuge-download -o library -m -d "archaea,bacteria,viral" refseq > seqid2taxid.map
Is there any other script to add dustmasker?
Thanks RNS
How did you install centrifuge?
Hi, I did installation using:
conda install -c bioconda centrifuge
Collecting package metadata (current_repodata.json): done
Solving environment: done
environment location: /home/majorram/anaconda3/envs/diversity
added / updated specs:
The following packages will be downloaded:
package | build
---------------------------|-----------------
centrifuge-1.0.4_beta |py36pl526he941832_2 3.9 MB bioconda
------------------------------------------------------------
Total: 3.9 MB
The following NEW packages will be INSTALLED:
centrifuge bioconda/linux-64::centrifuge-1.0.4_beta-py36pl526he941832_2
perl conda-forge/linux-64::perl-5.26.2-h36c2ea0_1008
Proceed ([y]/n)? y
Downloading and Extracting Packages centrifuge-1.0.4_bet | 3.9 MB | ################################################################################################################################################################ | 100% Preparing transaction: done Verifying transaction: done Executing transaction: done
If you want to make a database from scratch, then you also need BLAST, Jellyfish, and MUMmer.
# after activating your conda virtual environment
conda install -c bioconda blast
conda install -c bioconda kmer-jellyfish
conda install -c bioconda mummer
However, if you want to use a pre-built database, then you don't need those three pieces of software.
(By the way, if you don't mind working with a database a few years out of date, Ben Langmead has a GitHub Pages website with links to pre-built Centrifuge databases.)
Hi, in Conda I have blast in the base environment and mummer in a separate environment. I will install these in the current environment.
I would prefer an up-to-date database. Thanks for the suggestion.
RNS
shall i run this script centrifuge-download -o library -m -d "archaea,bacteria,viral" refseq > seqid2taxid.map
after installing all three softwares?
What was the issue with "dustmasker: command not found"?
thanks RNS
What was the issue with "dustmasker: command not found"?
In order to build a database from scratch, the dustmasker tool is necessary. The database builder couldn't find the dustmasker command in $PATH, so it printed that warning to the screen.
You may have noticed that "dustmasker" was not in the names of the three packages I mentioned. The dustmasker command is found in the blast package.
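A pre-flight check along these lines could catch a missing tool before a multi-day download starts. This is only a sketch: missing_tools is a hypothetical helper name of my own, and the list of tools is an assumption based on the commands mentioned in this thread.

```shell
# Sketch: print every command from the argument list that is not on $PATH.
# missing_tools is a hypothetical helper, not part of centrifuge.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" > /dev/null 2>&1 || echo "$tool"
    done
}

# Tools mentioned in this thread; adjust the list to what you actually run.
missing_tools centrifuge-download dustmasker | while read -r tool; do
    echo "Not found in PATH: $tool"
done
```

If this prints nothing, both commands were found in the active environment.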
shall i run this script
centrifuge-download -o library -m -d "archaea,bacteria,viral" refseq > seqid2taxid.map
after installing all three softwares?
You can. However, it may be slightly more convenient to use the Makefile. If make is not accessible from your environment, you can install it from conda-forge:
conda install -c conda-forge make
To see what the Makefile can do, you can invoke it without setting any of its options (note that the -C flag tells make where the Makefile is in your conda environment):
make -C "$(dirname $(dirname $(command -v centrifuge)))"/share/centrifuge/indices
You might be looking for something like this, which is similar to the p_compressed+h+v database:
make -C "$(dirname $(dirname $(command -v centrifuge)))"/share/centrifuge/indices \
THREADS=0 IDX_NAME='p_compressed+v' ANY_LEVEL_GENOMES='viral' COMPLETE_GENOMES_COMPRESSED='archaea bacteria'
(Make sure to set THREADS to the number of threads you're working with.)
I am new to the environment. thanks for explaining it.
I have "make" in my environment.
make makeconv makembindex make-ssl-cert
makeblastdb make-first-existing-target makeprofiledb mako-render
Do I need to specify dirname= and command= in make -C "$(dirname $(dirname $(command -v centrifuge)))"/share/centrifuge/indices?
Thanks RNS
dirname is a command that is part of the GNU coreutils. If you have a question about what it does, try running man dirname.
With the power of command substitution, I use the dirname command several times to get the location of the conda environment. You can try it yourself:
command -v centrifuge
dirname $(command -v centrifuge)
dirname $(dirname $(command -v centrifuge))
I used this to help the make command find the appropriate Makefile. If that -C flag wasn't specified, and if there isn't a Makefile in the current working directory, make lets you know that it can't do anything:
$ make
make: *** No targets specified and no makefile found. Stop.
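To make the nested dirname calls concrete, here is a small sketch that walks the same steps on a fixed path. The path is purely hypothetical (your environment will live somewhere else), and dirname only manipulates the string, so the file does not need to exist.

```shell
# Hypothetical location of the centrifuge executable in a conda environment.
exe=/home/user/anaconda3/envs/myenv/bin/centrifuge

bin_dir=$(dirname "$exe")        # strips the file name
env_dir=$(dirname "$bin_dir")    # strips the bin directory

echo "$bin_dir"   # /home/user/anaconda3/envs/myenv/bin
echo "$env_dir"   # /home/user/anaconda3/envs/myenv

# The directory that make -C is pointed at in the commands above:
echo "$env_dir/share/centrifuge/indices"
```

Replacing the fixed path with $(command -v centrifuge) gives the form used in the make commands in this thread.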
Thanks for explaining. I appreciate it.
I have run the script
make -C "$(dirname $(dirname $(command -v centrifuge)))"/share/centrifuge/indices \
THREADS=0 IDX_NAME='p_compressed+v' ANY_LEVEL_GENOMES='viral' COMPLETE_GENOMES_COMPRESSED='archaea bacteria'
with 20 threads.
Good luck. It may still take a long time.
This error is coming:
Error downloading https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/839/865/GCF_000839865.1_ViralProj14134 /GCF_000839865.1_ViralProj14134 _genomic.fna.gz!
Error downloading https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/867/225/GCF_000867225.2_ViralMultiSegProj16738 /GCF_000867225.2_ViralMultiSegProj16738 _genomic.fna.gz!
basename: extra operand ‘_genomic.fna.gz’
Try 'basename --help' for more information.
basename: extra operand ‘_genomic.fna.gz
People have encountered this problem in the past (see #201).
Here's my attempt at fixing it (that may or may not work):
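# Switch to the directory that contains centrifuge-download.
pushd "$(dirname $(command -v centrifuge-download))"
# Download the diff with whichever of curl or wget is available.
if command -v curl &> /dev/null; then
    curl https://github.com/DaehwanKimLab/centrifuge/compare/v1.0.4...master.diff > patch.diff
elif command -v wget &> /dev/null; then
    wget -O patch.diff https://github.com/DaehwanKimLab/centrifuge/compare/v1.0.4...master.diff
fi
# Apply the patch only if the download produced a file.
if [[ -f patch.diff ]]; then
    patch -p0 <patch.diff
else
    echo "didn't download patch!"
fi
# Return to the previous directory.
popd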
A few notes:
pushd and popd are shell built-in commands that are a bit like cd but also manipulate the directory stack. (You can use the dirs command to see what's currently in the directory stack.)
GitHub lets you append .patch or .diff to certain URLs for Git's plaintext views. Here I'm using it on the page that compares the master branch to the v1.0.4 release.
I don't know whether you have curl or wget (or neither), so I used Bash's built-in control flow "commands" to prepare for every eventuality. If you are using another shell (such as zsh), feel free to adapt the control flow for that purpose. (Confused about bash? I'd recommend taking time to read through some of man bash. Be aware that Bash's manpage is really long, so you may want to use the / key to search for certain keywords.)
Also, you may want to use make -C "$(dirname $(dirname $(command -v centrifuge)))"/share/centrifuge/indices clean to clean up dirty directories.
Shall I wait for the ongoing script to stop, or kill it? Thanks
I'd recommend killing it, then cleaning up what it generated so far.
ok Thanks
will do it, and proceed as you suggested,
RNS
Hi, I have wget but not curl.
I ran the whole script but got another error: pushd "$(dirname $(command -v centrifuge-download))" if command -v curl &> /dev/null; then curl https://github.com/DaehwanKimLab/centrifuge/compare/v1.0.4...master.diff > patch.diff elif command -v wget &> /dev/null; then wget -O patch.diff http://github.com/DaehwanKimLab/centrifuge/compare/v1.0.4...master.diff fi if [[ -f patch.diff ]]; then patch -p0 <patch.diff else echo "didn't download patch!" fi popd bash: syntax error near unexpected token `then'
Did the patch download? I'm trying to isolate where my attempt went wrong.
No, nothing happened. Is the script below a single script, or 3 or 4 separate scripts?
pushd "$(dirname $(command -v centrifuge-download))" if command -v curl &> /dev/null; then curl https://github.com/DaehwanKimLab/centrifuge/compare/v1.0.4...master.diff > patch.diff elif command -v wget &> /dev/null; then wget -O patch.diff https://github.com/DaehwanKimLab/centrifuge/compare/v1.0.4...master.diff fi if [[ -f patch.diff ]]; then patch -p0 <patch.diff else echo "didn't download patch!" fi popd
What's the output of dirs?
I did not understand your question.
I ran the script as a single command and got the error of
bash: syntax error near unexpected token `then'
Run dirs. What is printed to the screen? (This is so I can determine the current working directory and the directory stack.)
this is output
(diversity) majorram@majorram-gilbert:~$ dirs
~
I think I know what went wrong. When you copy/paste into your terminal, somehow the newlines are lost.
Command 1: Let's switch to the directory that contains centrifuge-download.
pushd "$(dirname $(command -v centrifuge-download))"
Command 2 (simplified from last time): Let's download the patch from GitHub.
wget -O patch.diff https://github.com/DaehwanKimLab/centrifuge/compare/v1.0.4...master.diff
Command 3 (simplified from last time): Let's apply the patch that we just downloaded.
patch -p0 <patch.diff
Command 4: Let's get back to where you were before.
popd
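To see why lost newlines produce exactly that "unexpected token `then'" error, here is a minimal sketch that asks Bash to parse (not run) two versions of a tiny script. The script text is made up for illustration; only the newline differs between the two versions.

```shell
# The same two commands, first on separate lines, then with the newline
# replaced by a space (as happens when a paste loses its line breaks).
good='echo start
if true; then echo yes; fi'
bad='echo start if true; then echo yes; fi'

# bash -n checks the syntax without executing anything.
if bash -n -c "$good" > /dev/null 2>&1; then good_result=ok; else good_result=error; fi
if bash -n -c "$bad" > /dev/null 2>&1; then bad_result=ok; else bad_result=error; fi

echo "with newlines: $good_result"
echo "newlines lost: $bad_result"
```

In the second version Bash reads "if" and "true" as extra arguments to echo, so the "then" that follows appears where no "if" statement is open. Pasting the commands one at a time, as above, sidesteps the problem.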
okay Thanks RNS
Output for Command1:
$ pushd "$(dirname $(command -v centrifuge-download))"
~/anaconda3/envs/diversity/bin ~
Output for Command 2:
wget -O patch.diff https://github.com/DaehwanKimLab/centrifuge/compare/v1.0.4...master.diff
--2022-08-04 13:05:03-- https://github.com/DaehwanKimLab/centrifuge/compare/v1.0.4...master.diff
Resolving github.com (github.com)... 140.82.113.4
Connecting to github.com (github.com)|140.82.113.4|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1951 (1.9K) [text/plain]
Saving to: ‘patch.diff’
patch.diff 100%[===============================================================================================================>] 1.91K --.-KB/s in 0s
2022-08-04 13:05:03 (25.7 MB/s) - ‘patch.diff’ saved [1951/1951]
File to patch:
I was afraid of that. Type centrifuge-download at that prompt.
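A likely reason for the "File to patch:" prompt (my assumption; I have not inspected this particular patch.diff) is that diffs produced by Git name files as a/... and b/..., and patch -p0 keeps those prefixes, so it cannot find the file on its own. The sketch below shows -p1 stripping that leading component. It works in a throwaway directory, and greeting.txt is a made-up file used only for the demonstration.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'hello\n' > greeting.txt          # file to be patched
printf 'hello, world\n' > greeting.new   # desired content

# Build a unified diff whose headers carry Git-style a/ and b/ prefixes.
# (diff exits with status 1 when the files differ, which is expected here.)
diff -u -L a/greeting.txt -L b/greeting.txt greeting.txt greeting.new > change.diff || true

# -p1 strips the leading a/ or b/ component, so patch looks for
# greeting.txt in the current directory and does not need to prompt.
patch -p1 < change.diff
cat greeting.txt
```

Typing the file name at the prompt, as suggested above, reaches the same result for a single file.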
After typing centrifuge-download:
File to patch: centrifuge-download
patching file centrifuge-download
Hunk #1 succeeded at 363 (offset 1 line).
finished
Then you can use popd to get back to where you were before.
done
what should i do next?
Try that whole make command again.
ok
Hi, Pls see this:
make -C "$(dirname $(dirname $(command -v centrifuge)))"/share/centrifuge/indices THREADS=24 IDX_NAME='p_compressed+v' ANY_LEVEL_GENOMES='viral' COMPLETE_GENOMES_COMPRESSED='archaea bacteria fungi'
make: Entering directory '/home/majorram/anaconda3/envs/diversity/share/centrifuge/indices'
mkdir -p reference-sequences
[[ -d tmp_p_compressed+v ]] && rm -rf tmp_p_compressed+v; mkdir -p tmp_p_compressed+v
Downloading and dust-masking viral
centrifuge-download -o tmp_p_compressed+v -m -a "Any" -d "viral" -P 24 refseq > \
tmp_p_compressed+v/all-viral-any_level.map
Downloading ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/viral/assembly_summary.txt ...
Downloading 11699 viral genomes at assembly level Any ... (will take a while)
dustmasker: error while loading shared libraries: libssl.so.1.0.0: cannot open shared object file: No such file or directory
dustmasker: error while loading shared libraries: libssl.so.1.0.0: cannot open shared object file: No such file or directory
Progress : [----------------------------------------] 0% 2/11699dustmasker: error while loading shared libraries: libssl.so.1.0.0: cannot open shared object file: No such file or directory
dustmasker: error while loading shared libraries: libssl.so.1.0.0: cannot open shared object file: No such file or directory
dustmasker: error while loading shared libraries: libssl.so.1.0.0: cannot open shared object file: No such file or directory
dustmasker: error while loading shared libraries: libssl.so.1.0.0: cannot open shared object file: No such file or directory
dustmasker: error while loading shared libraries: libssl.so.1.0.0: cannot open shared object file: No such file or directory
Progress : [----------------------------------------] 0% 7/11699dustmasker: error while loading shared libraries: libssl.so.1.0.0: cannot open shared object file: No such file or directory
dustmasker: error while loading shared libraries: libssl.so.1.0.0: cannot open shared object file: No such file or directory
make -C "$(dirname $(dirname $(command -v centrifuge)))"/share/centrifuge/indices THREADS=24 IDX_NAME='p_compressed+v' ANY_LEVEL_GENOMES='viral' COMPLETE_GENOMES_COMPRESSED='archaea bacteria fungi'
Does the error have anything to do with "archaea bacteria fungi"? I have also added fungi here.
What happens when you kill the previous invocation, and you only make a viral database?
make -f "$(dirname $(dirname $(command -v centrifuge)))"/share/centrifuge/indices/Makefile THREADS=0 v
I have a hunch as to why you might be getting the error with libssl.so.1.0.0.
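One way to see which shared libraries fail to resolve is to run ldd on the executable and look for "not found" entries. This is a sketch under assumptions: ldd is Linux-specific, unresolved_libs is a hypothetical helper name, and the check on dustmasker only runs if that command is on $PATH in the active environment.

```shell
# Sketch: list shared libraries that the dynamic loader cannot resolve
# for the executable given as the first argument.
# unresolved_libs is a hypothetical helper; it relies on ldd (Linux only).
unresolved_libs() {
    ldd "$1" 2> /dev/null | grep 'not found'
}

# Apply it to dustmasker if that command is available.
if dustmasker_path=$(command -v dustmasker); then
    unresolved_libs "$dustmasker_path" || echo "no unresolved libraries reported"
else
    echo "dustmasker is not on PATH"
fi
```

A line mentioning libssl.so.1.0.0 in the output would match the error message shown above.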
I find that the simplest way to solve this kind of problem is by re-making your Conda environment from scratch using a YAML file. Here is an example for that YAML file:
name: rename-me-with-whatever-you-want
channels:
- conda-forge
- defaults
- bioconda
dependencies:
- centrifuge
- blast
- kmer-jellyfish
- mummer
- make
(Please note that the order of channels matters. conda-forge needs to be specified first in order to avoid specific cryptic error messages.)
Installing mamba may also help with some dependency resolution issues, as mamba has a faster and more robust dependency resolver than conda.
CAUTION: after this, you will have to redo those patching steps I guided you through earlier.
What happens when you kill the previous invocation, and you only make a viral database?
make -f "$(dirname $(dirname $(command -v centrifuge)))"/share/centrifuge/indices/Makefile THREADS=0 v
Please see the output:
make -f "$(dirname $(dirname $(command -v centrifuge)))"/share/centrifuge/indices/Makefile THREADS=20 v
Making: v: v
make -f /home/majorram/anaconda3/envs/diversity/share/centrifuge/indices/Makefile IDX_NAME=v
make[1]: Entering directory '/home/majorram'
mkdir -p reference-sequences
[[ -d tmp_v ]] && rm -rf tmp_v; mkdir -p tmp_v
Downloading and dust-masking viral
centrifuge-download -o tmp_v -m -a "Any" -d "viral" -P 20 refseq > \
tmp_v/all-viral-any_level.map
Downloading ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/viral/assembly_summary.txt ...
Downloading 11699 viral genomes at assembly level Any ... (will take a while)
dustmasker: error while loading shared libraries: libssl.so.1.0.0: cannot open shared object file: No such file or directory
Progress : [----------------------------------------] 0% 1/11699dustmasker: error while loading shared libraries: libssl.so.1.0.0: cannot open shared object file: No such file or directory
Progress : [----------------------------------------] 0% 2/11699dustmasker: error while loading shared libraries: libssl.so.1.0.0: cannot open shared object file: No such file or directory
Progress : [----------------------------------------] 0% 3/11699dustmasker: error while loading shared libraries: libssl.so.1.0.0: cannot open shared object file: No such file or directory
dustmasker: error while loading shared libraries: libssl.so.1.0.0: cannot open shared object file: No such file or directory
Progress : [----------------------------------------] 0% 4/11699dustmasker: error while loading shared libraries:
Hi, I executed the command below to download the Centrifuge database:
centrifuge-download -o library -m -d "archaea,bacteria,viral" refseq > seqid2taxid.map
The command has been running for the last 7 days. The current status is:
Progress : [##-----------------------] 5% 1525/27880environment: line 28: dustmasker: command not found
Progress : [##-----------------------] 5% 1526/27880environment: line 28: dustmasker: command not found
Progress : [##-----------------------] 5% 1527/27880environment: line 28: dustmasker: command not found
Progress : [##-----------------------] 5% 1528/27880
Please suggest: do I need to kill the command, or is it fine?
Thanks RNS