pinin4fjords closed 7 years ago
Hi Jon,
I believe there is. There are actually two ways to deal with this:
a) You can make a file of all contig_ids that have not been blasted (see below) and provide that file with the --catcolour option to blobtools plot or covplot:
contig_2,not_blasted
contig_10,not_blasted
contig_15,not_blasted
...
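A minimal sketch of how such a file could be generated, assuming you have the assembly FASTA and a list of the contig_ids that were actually blasted. The function names and the file names in the usage note are hypothetical and not part of blobtools; only the `contig_id,not_blasted` line format is taken from the example above.

```python
def fasta_ids(lines):
    """Yield the sequence ID (first word after '>') of each FASTA header line."""
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                yield fields[0]


def catcolour_lines(all_ids, blasted_ids, label="not_blasted"):
    """Return 'contig_id,label' lines for every contig that was not blasted.

    The label 'not_blasted' follows the example above; any category name
    could be substituted.
    """
    blasted = set(blasted_ids)
    return [f"{cid},{label}" for cid in all_ids if cid not in blasted]
```

For example, with hypothetical files `assembly.fna` and `blasted_ids.txt` (one contig_id per line), you could read the IDs with `fasta_ids(open("assembly.fna"))`, pass them to `catcolour_lines` together with the stripped lines of `blasted_ids.txt`, and write the result, one entry per line, to the file you hand to --catcolour.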
b) Just generate a 'hits' file with the contig_ids you did not blast. As the TaxID you can specify '0', which is 'root' in the NCBI taxonomy (and will result in all taxonomic ranks being set to 'undef'), followed by a score. E.g.:
contig_2\t0\t1000
contig_10\t0\t1000
contig_15\t0\t1000
...
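A small sketch of building those placeholder lines, assuming the `\t` in the example stands for a literal tab character and that the columns are contig_id, TaxID and score as shown. The function name is hypothetical; the TaxID '0' and the score 1000 are simply the values from the example above.

```python
def placeholder_hits(contig_ids, taxid=0, score=1000):
    """Return tab-separated 'contig_id<TAB>taxid<TAB>score' lines.

    taxid=0 and score=1000 mirror the example in this thread; they are
    placeholders for contigs that were never blasted.
    """
    return [f"{cid}\t{taxid}\t{score}" for cid in contig_ids]


def write_lines(path, lines):
    """Write one entry per line to the given path."""
    with open(path, "w") as handle:
        for line in lines:
            handle.write(line + "\n")
```

Calling `write_lines("not_blasted.hits.tsv", placeholder_hits(ids))` (the file name is made up) would give a file that can be supplied as the additional 'hits' file.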
You can then provide that file as an additional 'hits' file, and when running blobtools view -i blobDB --hits -r all you can recognise those contigs by their being 'undef' at all ranks, or by the fact that they only got hits from that 'hits' file. However, if you want to plot them you still have to supply a catcolour file, since otherwise they will be binned with other contigs that are annotated as 'undef' because of the NCBI taxonomy (that happens if a taxon has no taxonomy at a given rank).
Let me know if this helps.
cheers,
dom
Right now we only care about the plots (blob and read coverage), so it'll be option a), I think. Thanks for the quick response.
Jon
Hi,
For resource conservation we don't BLAST short contigs from our de novo assemblies. The consequence when using BAM files with reads mapped against a reference that does contain those contigs is Blob plots with grey clouds containing both contigs that weren't BLASTed and contigs that were BLASTed but produced no hits (and uninformative bars in the ReadCovPlot).
My proposal to deal with this is an additional parameter to specify the contigs we actually BLASTed, producing plots with separate BLASTed and non-BLASTed no-hits categories. I'm happy to have a go at coding that if necessary. But is there a better way?
Thanks,
Jon