lmc297 / BTyper3

In silico taxonomic classification of Bacillus cereus group genomes using whole-genome sequencing data
GNU General Public License v3.0
Feature Request: store downloaded db files #11

Open chrisgulvik opened 1 year ago

chrisgulvik commented 1 year ago

To enable follow-up work on the exact files used for comparison (for example, the genomes used for ANI), could a feature be added to specify an output path where the downloaded files are stored?

chrisgulvik commented 1 year ago

Here is a hackish workaround that accomplishes this for the type strains for the time being.

git clone git@github.com:lmc297/BTyper3.git
mkdir -p BTyper3/seq_ani_db/typestrains
# Keep columns 1 and 3 of the type strain table, tab-separated
awk '{print $1 "\t" $3}' \
 BTyper3/seq_ani_db/typestrains/typestrains.tsv \
 > BTyper3/archive/scripts/seq_ani_db/typestrains/typestrains.txt
# Prefix the header line with '#'
sed -i '1 s/./#&/' BTyper3/archive/scripts/seq_ani_db/typestrains/typestrains.txt

cd BTyper3/archive/scripts
./build_btyper3_ani_db.py -db typestrains-only
# Downloaded files end up with a doubled extension; rename *.gz.gz to *.gz
for f in *.gz.gz; do
  [ -f "$f" ] && mv -v "$f" "${f%.gz.gz}.gz"
done
ls B_*.fna.gz | wc -l
    # RESULTS:  n=29 files

althonos commented 1 year ago

The data files are installed with the rest of the Python code, so you should be able to find the path with the following command:

$ python -c 'import btyper3; print(btyper3.__path__[0])'
/home/althonos/.local/lib/python3.11/site-packages/btyper3
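
Building on that, here is a minimal sketch of how the installed files could be copied to a directory of your choosing. It is not part of BTyper3: the helper names (`package_dir`, `list_data_files`, `copy_data_files`) are hypothetical, and the default suffixes are an assumption based on the `.fna.gz` and `.tsv` files mentioned above. It locates the package directory the same way as the one-liner, but through `importlib.util.find_spec` so the package itself is not imported.

```python
import importlib.util
import shutil
from pathlib import Path


def package_dir(package_name):
    """Return the directory an installed package lives in."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        raise ModuleNotFoundError(f"{package_name!r} is not an installed package")
    return Path(list(spec.submodule_search_locations)[0])


def list_data_files(package_name, suffixes=(".gz", ".tsv")):
    """List files under the package directory whose final suffix matches.

    The default suffixes are an assumption; adjust them to the files you need.
    """
    root = package_dir(package_name)
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix in suffixes
    )


def copy_data_files(package_name, outdir, suffixes=(".gz", ".tsv")):
    """Copy matching files into outdir, keeping their relative layout."""
    root = package_dir(package_name)
    outdir = Path(outdir)
    copied = []
    for src in list_data_files(package_name, suffixes):
        dest = outdir / src.relative_to(root)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also preserves timestamps
        copied.append(dest)
    return copied
```

With BTyper3 installed, calling `copy_data_files("btyper3", "btyper3_db_copy")` should leave a copy of whatever matching files exist under the package directory; which files are actually there depends on the installed version and on which databases have been downloaded.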