billzt / MiFish

This is the command-line version of the MiFish pipeline. It can also be used with any other eDNA metabarcoding primers.
https://mitofish.aori.u-tokyo.ac.jp/mifish/
GNU General Public License v3.0
Where does the data included in the Mitofish database come from? #12

Closed: zhangjl-work closed this issue 6 months ago

zhangjl-work commented 6 months ago

Where does the data included in the MitoFish database come from? A detailed description would be appreciated.

billzt commented 6 months ago

Hello @zhangjl-work, it comes from the GenBank database (https://ncbi.nlm.nih.gov/nuccore).

zhangjl-work commented 6 months ago

Thanks. I have another question: does the MitoFish data include the fish 12S data in NCBI?

billzt commented 6 months ago

Do you mean the data at https://mitofish.aori.u-tokyo.ac.jp/download/ ?

zhangjl-work commented 6 months ago

I mean: is the data at https://mitofish.aori.u-tokyo.ac.jp/download/ the same as the data I download with the command below? What is the difference between the data obtained by the two methods?

    docker run --rm -it \
      -v /Users/zjl/Desktop/MiFish/test/verify/ncbi/:/data \
      --workdir="/data" \
      quay.io/swordfish/crabs:0.1.4 \
      crabs db_download \
      --source ncbi \
      --database nucleotide \
      --query '12S[All Fields] AND ("1"[SLEN] : "50000"[SLEN])'

billzt commented 6 months ago

The query '12S[All Fields] AND ("1"[SLEN] : "50000"[SLEN])' returns 12S rRNA sequences from all species, both fish and non-fish, whereas the MitoFish data includes all sequences (e.g. 12S, 16S, COX1, and so on) from fish species.
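To make the difference concrete: the query above carries no taxon restriction, so it matches records from any organism. The following is a minimal sketch, not part of MiFish or CRABS, of how one might narrow such an Entrez query to fish lineages with the `[Organism]` field tag. The helper name `fish_12s_query` is hypothetical, the taxon list is the one suggested later in this thread, and the resulting query is not claimed to reproduce the contents of MitoFish.

```python
# Hypothetical helper: narrow an NCBI Entrez query to fish lineages.
# The taxon list is an assumption taken from this thread, not from MitoFish.
FISH_TAXA = [
    "Chondrichthyes", "Dipnomorpha", "Actinopterygii",
    "Myxini", "Hyperoartia", "Coelacanthimorpha",
]

BASE_QUERY = '12S[All Fields] AND ("1"[SLEN] : "50000"[SLEN])'


def fish_12s_query(base=BASE_QUERY, taxa=FISH_TAXA):
    """Return `base` AND-ed with an OR-group of [Organism] terms."""
    organism_terms = " OR ".join(f"{taxon}[Organism]" for taxon in taxa)
    return f"({base}) AND ({organism_terms})"


print(fish_12s_query())
```

The printed string can be pasted into the NCBI Nucleotide search box to compare hit counts with the unrestricted query.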

zhangjl-work commented 6 months ago

Thank you for your prompt reply. I have another question: the sequences in MitoFish include all of the fish 12S sequences in NCBI, right?

My command is:

    docker run --rm -it \
      -v /Users/zjl/Desktop/MiFish/test/verify/ncbi/:/data \
      --workdir="/data" \
      quay.io/swordfish/crabs:0.1.4 \
      crabs db_download \
      --source ncbi \
      --database nucleotide \
      --query '12S[All Fields] AND ("1"[SLEN] : "50000"[SLEN])'

If I add a --species parameter to download only the fish 12S sequences, what should I put after it? Is it a list of fish species names, or something else? Thank you very much for your guidance!

billzt commented 6 months ago

> Thank you for your prompt reply. I have another question: the sequences in MitoFish include all of the fish 12S sequences in NCBI, right?

Yes. The sequences in MitoFish include all of the fish 12S rRNA sequences in NCBI.

> If I add a --species parameter to download only the fish 12S sequences, what should I put after it? Is it a list of fish species names, or something else?

I guess it should be --species 'Chondrichthyes + Dipnomorpha + Actinopterygii + Myxini + Hyperoartia + Coelacanthimorpha', but I haven't tried the CRABS software. Please report at https://github.com/gjeunen/reference_database_creator if you encounter any other issues with CRABS.
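As an illustration of the suggestion above, here is a small sketch that assembles the ' + '-separated taxon string and prints the resulting command line without running it. The function name `crabs_download_args` is hypothetical, and the `--species` value format is only the guess made in this thread; it has not been checked against the CRABS documentation.

```python
import shlex

# Fish lineages as guessed in this thread (unverified against CRABS).
FISH_TAXA = [
    "Chondrichthyes", "Dipnomorpha", "Actinopterygii",
    "Myxini", "Hyperoartia", "Coelacanthimorpha",
]


def crabs_download_args(taxa, query):
    """Build the argument list; the --species format is an assumption."""
    species = " + ".join(taxa)
    return [
        "crabs", "db_download",
        "--source", "ncbi",
        "--database", "nucleotide",
        "--query", query,
        "--species", species,
    ]


args = crabs_download_args(
    FISH_TAXA, '12S[All Fields] AND ("1"[SLEN] : "50000"[SLEN])'
)
# shlex.join quotes each argument so the line is safe to paste into a shell.
print(shlex.join(args))
```

Building the arguments as a list and quoting them with `shlex.join` avoids the stray backslash and hyphen typos that are easy to introduce when a long command is edited by hand.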