faircloth-lab / phyluce

software for UCE (and general) phylogenomics
http://phyluce.readthedocs.org/

Error with match contigs to probe: using scafSeq files as input #206

Closed by nonnohasegawa 4 years ago

nonnohasegawa commented 4 years ago

Hello!

I'm new to bioinformatics and have recently been working my way through phyluce. I'm trying to use phyluce_assembly_match_contigs_to_probes, which requires the --contigs input. I thought I could use the scafSeq files produced by MitoFinder (https://github.com/RemiAllio/MitoFinder).

However, I keep getting the same error saying that it cannot create the database. I'm wondering whether you or anyone else has tried using .scafSeq files for UCE mining with phyluce. Any ideas would help!

P.S. I already tried renaming the input files and still get the same error.

script:

IN_DIR="/flash/BourguignonU/Nonno/Phyluce/scafseq"
PROBE="/flash/BourguignonU/Nonno/Phyluce/uce-loci/termite-master.fasta"
phyluce_assembly_match_contigs_to_probes \
        --contigs "$IN_DIR" \
        --probes "$PROBE" \
        --output uce-search-results \
        --log-path log

error:

2020-11-06 16:04:13,217 - phyluce_assembly_match_contigs_to_probes - CRITICAL - Database already exists
2020-11-06 16:04:13,217 - phyluce_assembly_match_contigs_to_probes - CRITICAL - Cannot create database

sample names: A1449-link-metaspades.scafSeq (only the number changes between samples)

brantfaircloth commented 4 years ago

The files produced by MitoFinder should be just fine. The issue may relate to your sample names: sqlite (the database) does not like dashes (-) in file names, or really any characters other than letters or underscores. Additionally, the assembly files output by MitoFinder should be renamed to <name>.contigs.fasta.
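
For anyone hitting the same error, the renaming described above could be sketched roughly as follows. This is only an illustration, not part of phyluce: the function name `rename_scafseq` and the output directory are hypothetical, and it assumes the inputs end in `.scafSeq` as in the sample name given in the question. It copies rather than moves, so the original MitoFinder output is left untouched.

```shell
# Hypothetical helper (not part of phyluce): copy every *.scafSeq file from
# one directory into another, replacing dashes with underscores and using
# the <name>.contigs.fasta naming suggested above.
# The body runs in a subshell, so its variables do not leak into the caller.
rename_scafseq() (
    in_dir="$1"
    out_dir="$2"
    mkdir -p "$out_dir"
    for f in "$in_dir"/*.scafSeq; do
        [ -e "$f" ] || continue                      # glob matched nothing
        base=$(basename "$f" .scafSeq)               # e.g. A1449-link-metaspades
        name=$(printf '%s' "$base" | tr '-' '_')     # e.g. A1449_link_metaspades
        cp "$f" "$out_dir/$name.contigs.fasta"
    done
)
```

It might then be called as `rename_scafseq "$IN_DIR" contigs_renamed` (the directory name `contigs_renamed` is an assumption), with `--contigs` pointed at the new directory instead of the original one.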

nonnohasegawa commented 4 years ago

I got it working by renaming the files that way! Thank you so much.