blobtoolkit / blobtoolkit

Interactive quality assessment of genome assemblies
http://blobtoolkit.genomehubs.org
MIT License

Add taxonomy from version 1 in version 2? #27

Open nicolereynolds1 opened 3 years ago

nicolereynolds1 commented 3 years ago

Hi,

I have tried the add taxonomy step multiple different ways and it doesn't seem to work, even though no errors are reported.

blobtools add --hits Coemansia_sp._RSA_376.ncbi.blastn.out --hits Coemansia_sp._RSA_376.diamond.blastx.out --taxrule bestsumorder --taxdump /bigdata/stajichlab/shared/projects/ZyGoLife/Kickxello_Coemansia/taxdump /bigdata/stajichlab/shared/projects/ZyGoLife/Kickxello_Coemansia/genomes/Coemansia_sp._RSA_376

How can I tell why the command doesn't work properly? I saw in issue 11 that there should be multiple json taxonomy files output, but I don't get any of those.

I am trying to use taxonomy output files generated with BlobTools version 1, so I am not sure if the two versions are compatible.

Thanks

rjchallis commented 3 years ago

Hi

You should be able to add the hits files, but you will need to make sure the --hits-cols parameter matches the columns in your blast output. For the blastn output, you likely have all the required fields, but for blastx, the blobtools1 instructions don't include the taxon IDs directly, so you would need to run the blobtools1 taxify subcommand to get that information.

The default is equivalent to:

blobtools add --hits ... \
    --hits-cols 1=qseqid,2=staxids,3=bitscore,5=sseqid,10=sstart,11=send,14=evalue \
    ...

You should still be able to import files without sstart and send, but you will need to make sure the column order is the same in both files for the import to work.
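One way to see whether a hits file matches the column spec before running the import is to check a few rows by hand. Below is a minimal Python sketch of such a check. It is not part of blobtools: the helper names (`parse_hits_cols`, `check_row`, `check_file`) are hypothetical, and it assumes the hits file is tab-separated, that staxids holds numeric taxon IDs (optionally separated by semicolons), and that bitscore is a number.

```python
# Hypothetical helper, not part of blobtools: checks that rows of a
# tab-separated hits file fit a --hits-cols style column spec.

DEFAULT_SPEC = "1=qseqid,2=staxids,3=bitscore,5=sseqid,10=sstart,11=send,14=evalue"


def parse_hits_cols(spec):
    """Parse '1=qseqid,2=staxids' into {'qseqid': 1, 'staxids': 2} (1-based)."""
    cols = {}
    for item in spec.split(","):
        index, name = item.split("=", 1)
        cols[name.strip()] = int(index)
    return cols


def check_row(line, cols):
    """Return a list of problems found in one tab-separated hits line."""
    fields = line.rstrip("\n").split("\t")
    problems = []
    for name, index in cols.items():
        if index > len(fields):
            problems.append(
                f"{name}: expected column {index}, row has only {len(fields)}"
            )
            continue
        value = fields[index - 1]
        # Assumption: taxon IDs are integers, possibly joined with ';'.
        if name == "staxids" and not all(p.isdigit() for p in value.split(";")):
            problems.append(f"staxids: column {index} holds {value!r}, not a taxon ID")
        elif name == "bitscore":
            try:
                float(value)
            except ValueError:
                problems.append(f"bitscore: column {index} holds {value!r}, not a number")
    return problems


def check_file(path, spec=DEFAULT_SPEC, limit=5):
    """Print problems for the first `limit` rows of a hits file."""
    cols = parse_hits_cols(spec)
    with open(path) as handle:
        for number, line in enumerate(handle, start=1):
            if number > limit:
                break
            for problem in check_row(line, cols):
                print(f"{path}:{number}: {problem}")
```

Calling `check_file` on each hits file with the spec you intend to pass to --hits-cols prints nothing when the first rows fit the spec. A row that lacks taxon IDs, for example, would be reported as a staxids problem.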

Hope this helps