DRL / blobtools

Modular command-line solution for visualisation, quality control and taxonomic partitioning of genome datasets
GNU General Public License v3.0

No Hits #18

Closed bibilujan closed 8 years ago

bibilujan commented 8 years ago

Hi,

I am interested in running this tool to possibly visualize and pull out some symbionts from my sample. I was able to use blobtools to get the graph and table with the view and blobplot commands, but all my entries come out as "no hits". Would you be able to help me troubleshoot this issue? My assembly is of a 260MB plant, closely related to A. thaliana, so there should be lots of hits that are properly annotated.

Here is the summary stats:

C.microcarpa-plot-4.C.microcarpa.BlobDB.json.span.phylum.p7.100.bestsum - spades

Group   colour   count   visible (%)  span         visible(%)  n50      GC    GC (std)  cov_mean  cov_std  read map  read map (%)
all     None     49,622  100.0%       203,388,998  100.0%      107,031  0.41  0.1       25.2      172.8    0         0.0%
no-hit  #d3d3d3  49,622  100.0%       203,388,998  100.0%      107,031  0.41  0.1       25.2      172.8    0         0.0%

And a few lines from the "view" output (the last is a hit to A. thaliana):

NODE_1_length_1111123_cov_10.2151_ID_114688196 6429 gi|727511611|ref|XM_010433800.1| 97.47 3795 44 29 708277 712041 3773 1 0.0
NODE_1_length_1111123_cov_10.2151_ID_114688196 5991 gi|727522685|ref|XM_010439039.1| 99.57 3286 14 0 992226 995511 6183903 0.0
NODE_1_length_1111123_cov_10.2151_ID_114688196 5485 gi|727547328|ref|XM_010448413.1| 95.46 3483 80 44 708279 711744 3425 4 0.0
NODE_1_length_1111123_cov_10.2151_ID_114688196 4427 gi|7270623|emb|AL161590.2| 87.72 3958 280 121 708239 712136 166312 170123 0.0

DRL commented 8 years ago

Hi bibilujan,

your blast file has to be in the format '6 qseqid staxids bitscore ...', while your blast results seem to be in the format '6 qseqid staxids subject ...'

I recommend re-running the blast with:

-outfmt '6 qseqid staxids bitscore std sscinames sskingdoms stitle'

that should fix the issue
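
A quick way to see whether an existing file follows that layout is sketched below. It is only a heuristic, not part of blobtools: it assumes tab-separated output, that column 2 holds one or more numeric taxids separated by ';', and that column 3 parses as a number. The function name `check_blast_layout` and the example lines are made up for illustration.

```python
def check_blast_layout(lines):
    """Return (line_number, problem) pairs for lines that do not look like
    '6 qseqid staxids bitscore ...' output. Heuristic only."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) < 3:
            problems.append((number, "fewer than 3 tab-separated columns"))
            continue
        taxids, bitscore = fields[1], fields[2]
        # staxids can list several taxids separated by ';'
        if not all(part.isdigit() for part in taxids.split(";")):
            problems.append((number, f"column 2 is not a taxid: {taxids!r}"))
        try:
            float(bitscore)
        except ValueError:
            problems.append((number, f"column 3 is not a bitscore: {bitscore!r}"))
    return problems


if __name__ == "__main__":
    # Hypothetical lines: the first has a subject ID where the bitscore should be.
    example = [
        "contig_1\t6429\tgi|727511611|ref|XM_010433800.1|\t97.47",
        "contig_1\t3702\t6429\tcontig_1\tgi|727511611|ref|XM_010433800.1|\t97.47",
    ]
    for number, problem in check_blast_layout(example):
        print(f"line {number}: {problem}")
```

Note that a bitscore sitting in column 2 can still look like a valid taxid, so a clean result here does not prove the taxids are real.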

cheers,

dom

bibilujan commented 8 years ago

Hi Dom,

The results you see I actually got after running the blast in the way you are specifying, here is the command:

blastn -task megablast -query ../Assembly_C.microcarpa/Old_Assemblies/C.microcarpa-m13c-filtered-200.fasta -db nt-ref-db/nt -outfmt '6 qseqid staxids bitscore std sscinames sskingdoms stitle' -max_target_seqs 25 -culling_limit 2 -num_threads 16 -evalue 1e-25 -out C.microcarpa13.vs.nt.25cul1.1e25.megablast.out

I deleted the third row from my blast results and re-ran blobplot, and I did get hits this time. However, I'm not sure why this is happening or whether I can trust these results. What do you think?

Thank you for taking the time to look at this,

Beatriz

[Image: blobplot (c microcarpa-plot c microcarpa blobdb json span phylum p7 100 bestsum blobs spades)]

Attachment: C.microcarpa-plot.C.microcarpa.BlobDB.json.span.phylum.p7.100.bestsum.stats.txt

DRL commented 8 years ago

Hi Beatriz,

nice blobplot! :)

several things:

Regarding the plot:

hope this helps,

dom

bibilujan commented 8 years ago

Here you can find the blast results; this is before removing the third row:

https://www.dropbox.com/s/l9iwvc3i9wdjulb/C.microcarpa_blast_results.tar?dl=0

Is it common to have so much proteobacteria? I was hoping to find some fungi in there. I did re-download the most recent nt database and taxonomy, because I thought that was why I was not getting any hits initially.

Thanks,

Beatriz

bibilujan commented 8 years ago

I had placed a damaged file on dropbox, but I have replaced it with the correct one now.

https://www.dropbox.com/s/l9iwvc3i9wdjulb/C.microcarpa_blast_results.tar?dl=0

Thanks,

Beatriz

DRL commented 8 years ago

Hi Beatriz,

the problem with the blast file is that the second column does not carry the taxid:

This could be because you are missing the taxonomy database from NCBI, and therefore BLAST cannot assign a taxid to the hits. Another possibility is that the taxdb database is not in the path defined by the BLASTDB environment variable.
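
As a rough check of the second possibility, the sketch below looks for the two taxdb files (`taxdb.btd` and `taxdb.bti`, as shipped in NCBI's taxdb archive) in each directory listed in BLASTDB. It assumes BLASTDB is a list of directories separated by the platform's path separator. BLAST may also search other locations, such as the current directory, which this sketch does not cover. The function name `find_taxdb` is made up for illustration.

```python
import os

# File names found in NCBI's taxdb archive.
TAXDB_FILES = ("taxdb.btd", "taxdb.bti")


def find_taxdb(blastdb_value):
    """Return the first directory in a BLASTDB-style path list that holds
    both taxdb files, or None if no directory does."""
    for directory in blastdb_value.split(os.pathsep):
        if not directory:
            continue  # skip empty entries, e.g. when BLASTDB is unset
        if all(os.path.isfile(os.path.join(directory, name)) for name in TAXDB_FILES):
            return directory
    return None


if __name__ == "__main__":
    location = find_taxdb(os.environ.get("BLASTDB", ""))
    if location is None:
        print("taxdb files not found in any BLASTDB directory")
    else:
        print(f"taxdb files found in {location}")
```

If the files turn up nowhere, downloading the taxdb archive from NCBI into the same directory as the nt database and re-running the blast would be the next thing to try.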

cheers,

dom