@cement-head
Hello.
There are two issues.
The second parameter should be the name of the database, not the name of the directory. In this example, it should be /home/cbfgws6/MiFish/mifishdb/mito-all.fa, not /home/cbfgws6/MiFish/mifishdb/. The README seems confusing on this point, and I have modified it.
The database mito-all.fa is a collection of all fish mitochondrial sequences. However, this MiFish pipeline requires a database of amplicon sequences, so mito-all.fa is not suitable in this situation.
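The first issue (passing a directory instead of a database name) can be caught before running the pipeline. The following is a minimal sketch, not part of MiFish: the function name `check_db_argument` is hypothetical, and it assumes a single-volume nucleotide database, for which makeblastdb writes index files ending in .nhr, .nin and .nsq next to the database name.

```python
from pathlib import Path

# Index files makeblastdb writes for a single-volume nucleotide database.
# Assumption: multi-volume databases (with a .nal alias file) are not handled here.
NUCL_INDEX_SUFFIXES = (".nhr", ".nin", ".nsq")


def check_db_argument(db_path):
    """Return a list of problems with a BLAST database argument (empty if none found)."""
    path = Path(db_path)
    if path.is_dir():
        # This is the mistake from the thread: a directory was passed.
        return [f"{path} is a directory; pass the database name inside it"]
    # BLAST treats the argument as a name prefix, so look for <name>.nhr etc.
    missing = [s for s in NUCL_INDEX_SUFFIXES if not Path(str(path) + s).exists()]
    if missing:
        return [f"missing index files for {path}: {', '.join(missing)}"]
    return []
```

For example, `check_db_argument("/home/cbfgws6/MiFish/mifishdb/")` would report the directory problem, while the same call with `mito-all.fa` appended would only complain if makeblastdb had not been run on that file.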
There are two ways to resolve this:
Use ./test/mifishdbv3.83.fa in this repository (outdated).
Or build an amplicon database with CRABS to use instead of mito-all.fa (steps 1~6, using MitoFish as the original source), then use the awk command to convert the output to FASTA format:
$ awk '{print ">gb|" $1 "|" $9 "\n" $10}' output.tsv > your.db.fa
$ makeblastdb -dbtype nucl -in your.db.fa
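For anyone who prefers to do the conversion in Python, here is a rough equivalent of the awk one-liner above. It is only a sketch: the function names are made up for illustration, and it assumes what the awk command implies, namely that field 1 is the accession, field 9 the species name and field 10 the sequence, with fields separated by whitespace as in awk's default mode.

```python
def tsv_to_fasta(lines):
    """Yield FASTA records built from fields 1, 9 and 10 of each table line."""
    for line in lines:
        # Like awk's default, split on runs of spaces or tabs.
        fields = line.split()
        if len(fields) < 10:
            # Assumption: skip short or blank lines (awk would print empty fields instead).
            continue
        yield f">gb|{fields[0]}|{fields[8]}\n{fields[9]}\n"


def convert(tsv_path, fasta_path):
    """Write the FASTA version of tsv_path to fasta_path."""
    with open(tsv_path) as src, open(fasta_path, "w") as dst:
        dst.writelines(tsv_to_fasta(src))
```

Calling `convert("output.tsv", "your.db.fa")` should produce the same file as the awk command for well-formed rows; makeblastdb still has to be run on the result afterwards.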
Okay, so just to clarify:
(1) Install CRABS.
(2) Download the MitoFish DB using CRABS (Step 1.4).
(3) Download the NCBI Taxonomy database (Step 1.5).
(4) Use db_import to import the MitoFish DB into CRABS (Step 2).
(5) To extract the amplicon sequences, should I use Step 4.1 or 4.2, or both?
(6) Assign taxonomy (Step 5).
(7) Dereplicate the database.
Then, use the commands above to convert the .tsv file to a .fa file, and build the database using makeblastdb.
Have I got that right?
Yes, overall that's right except for:
(4) is not necessary; it is used for in-house generated or curated data.
(5) Both Step 4.1 and 4.2 are recommended.
In case anyone else needs an updated MitoFish DB, here's one made October 1st, 2023. mitofish-db-October2023.tar.gz
Well, that didn't work:
BLAST Database error: Error: Not a valid version 4 database.
Traceback (most recent call last):
  File "/home/cbfgws6/miniconda3/envs/MiFish/bin/mifish", line 33, in <module>
    sys.exit(load_entry_point('mifish', 'console_scripts', 'mifish')())
  File "/home/cbfgws6/MiFish/mifish/cmd/mifish.py", line 76, in main
    pipeline.runMiFish(data_dir=args.seq_dir, data_dir_other_groups=data_dir_other_groups, \
  File "/home/cbfgws6/MiFish/mifish/core/pipeline.py", line 247, in runMiFish
    for blast_record in NCBIXML.parse(handle):
  File "/home/cbfgws6/miniconda3/envs/MiFish/lib/python3.9/site-packages/Bio/Blast/NCBIXML.py", line 799, in parse
    raise ValueError("Your XML file was empty")
ValueError: Your XML file was empty
Never mind: there were WAY too many versions of blastn on my machine, which caused a version conflict.
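A quick way to spot this kind of conflict is to list every blastn on the PATH and compare their versions. The "Not a valid version 4 database" message is commonly reported when an older blastn reads a database built by a newer makeblastdb, so that is a plausible reading of the error above, though the thread does not confirm it. The sketch below is illustrative only; both function names are hypothetical, and it relies on the `-version` flag that BLAST+ programs accept.

```python
import os
import subprocess


def find_executables(name, search_path=None):
    """Return every executable called `name` on the search path, in lookup order."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    found, seen = [], set()
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        real = os.path.realpath(candidate)  # collapse symlinks to the same binary
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK) and real not in seen:
            seen.add(real)
            found.append(candidate)
    return found


def print_versions(name="blastn"):
    """Print the path and reported version of each copy of `name` on PATH."""
    for exe in find_executables(name):
        result = subprocess.run([exe, "-version"], capture_output=True, text=True)
        print(exe, "->", result.stdout.strip())
```

Running `print_versions()` and then `print_versions("makeblastdb")` inside the MiFish conda environment shows which copy is picked up first and whether the two tools come from the same BLAST+ release.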
Where does the data included in the MitoFish database come from? A detailed description would be appreciated.
I downloaded the entire database from the site: http://mitofish.aori.u-tokyo.ac.jp/species/detail/download/?filename=download%2F/complete_partial_mitogenomes.zip
Then I used this command
I then attempted to run the pipeline and got this error:
What am I doing wrong?