sourmash-bio / sourmash

Quickly search, compare, and analyze genomic and metagenomic data sets.
http://sourmash.readthedocs.io/en/latest/

Cannot use sbt.json file (sourmash search) #542

Closed EileenNeelie closed 4 years ago

ctb commented 6 years ago

Hi eileen, did you figure it out?? :) let us know if we can help.

EileenNeelie commented 6 years ago

Hi Titus, yes I solved the problem. :-) Thank you! You really do some great work! Best, Eileen

ctb commented 5 years ago

On Fri, Sep 14, 2018 at 08:15:05PM +0000, EileenNeelie wrote:

Hi Titus, yes I solved the problem. :-) Thank you! You really do some great work! Best, Eileen

great, glad to hear it!

EileenNeelie commented 5 years ago

Hi Titus,

I have another problem and I am not really sure what to do. First, I should give you a bit more background:

We did some stable-isotope probing (SIP) experiments and labeled bacteria capable of degrading a specific substrate. Then we did metagenomic sequencing and amplicon sequencing (16S rRNA) to identify which bacteria are involved in the degradation and to look into possible pathways. I assembled and binned the metagenomes, checked their quality, and then annotated a few of the metagenome-assembled genomes (MAGs). One of them seemed really interesting for our study, so I wanted to use sourmash to rapidly compare this MAG against some metagenomes, just to get an idea of whether it is present before doing mapping to get abundance. I chose 3 metagenomes and, in addition, the sig files from the Tara Oceans (Delmont); however, I did not get any hits at all.

To be on the safe side, I thought I would also run it against closely related genomes (Desulfuromonas and Thermincola), since this should give me some "hits" and could be used as a "control". Now my problem: even when using 'sourmash search' against the "control" genomes, I do not get any similarity matches between my MAG and those genomes, and I think I should get some.

Do you have an idea what went wrong or what could be the problem?

To give you an idea of what I did, I attached a file summarizing the steps.

I also attached the bin of interest (contigs and sig file), in case there is a problem with that.

Thank you!

Best wishes,

Eileen

Dr. Eileen Kröber Postdoctoral Researcher

Research Area 1 “Landscape Functioning“

WG "Microbial Biogeochemistry" Leibniz Centre for Agricultural Landscape Research (ZALF) Eberswalder Str. 84 15374 Müncheberg

Germany

Tel.: +49 / 33432 / 82-4085 Fax: +49 / 33432 / 82-343

e-mail: Eileen.Kroeber@zalf.de

Scientific Director: Prof. Dr. Frank Ewert Administrative Director: Cornelia Rosenberg Court at which the institute is registered as VR 35 35 FF: Amtsgericht Frankfurt/Oder VAT-ID: DE811417184

"C. Titus Brown" notifications@github.com wrote on 14 September 2018 at 14:50:

Hi eileen, did you figure it out?? :) let us know if we can help.


--> Make signature from 13C_Heavy1_Bin4 (MAG)
sourmash compute --scaled 10000 -k 31 /Users/eileen/Desktop/sourmash/data/13C_Heavy_1_Bin4.fa -o /Users/eileen/Desktop/sourmash/sourmash/13C_Heavy1_Bin4.sig
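
For readers unfamiliar with `--scaled`, here is a minimal Python sketch of the idea behind a scaled signature, assuming the usual description that only k-mer hashes below 2^64 / scaled are kept. The hash function (`blake2b`), the helper names, and the random test sequence are illustrative assumptions, not sourmash's implementation, and reverse complements are ignored for brevity.

```python
import hashlib
import random

def kmers(seq, k):
    """Yield every k-mer of seq (forward strand only, for brevity)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]

def hash64(kmer):
    """Map a k-mer to a 64-bit integer (stand-in hash, NOT the one sourmash uses)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")

def scaled_sketch(seq, k, scaled):
    """Keep only the hashes that fall below 2**64 // scaled."""
    max_hash = 2**64 // scaled
    return {h for h in map(hash64, kmers(seq, k)) if h < max_hash}

# Hypothetical input: a random 200 kbp sequence standing in for a genome bin.
rng = random.Random(0)
genome = "".join(rng.choice("ACGT") for _ in range(200_000))

sketch = scaled_sketch(genome, k=31, scaled=1000)
print(len(sketch))  # roughly (number of distinct k-mers) / scaled
```

With about 200,000 k-mers and scaled=1000, the sketch should hold on the order of 200 hashes; at scaled=10000 it would hold on the order of 20, so a small bin leaves few hashes to match on.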

--> Make signature from metagenomic reads (Marine metagenome 1)
sourmash compute --scaled 10000 /Users/eileen/Desktop/sourmash/data/Metagenomes/1/3300025895.a.fna.gz -o /Users/eileen/Desktop/sourmash/sourmash/marine_metagenome_1.sig -k 31

--> Evaluate containment, that is, what fraction of the read content (Marine metagenome 1) is contained in the ref genome (MAG - Bin4).
sourmash search -k 31 /Users/eileen/Desktop/sourmash/sourmash/marine_metagenome_1.sig /Users/eileen/Desktop/sourmash/sourmash/13C_Heavy1_Bin4.sig --containment

-->Result: 0 matches: similarity match
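
The difference between similarity and containment, and why the order of the two signatures matters, can be sketched with plain Python sets. The function names, the `query`/`subject` roles, and the toy hash sets below are assumptions made for illustration; this is not sourmash code.

```python
def jaccard(a, b):
    """|A & B| / |A | B|: symmetric, penalizes size differences."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def containment(query, subject):
    """Fraction of the query's hashes also found in the subject: asymmetric."""
    return len(query & subject) / len(query) if query else 0.0

# Hypothetical sketches: a small genome bin and a much larger metagenome
# that happen to share 50 hashes.
mag = set(range(0, 100))
metagenome = set(range(50, 5050))

print(containment(mag, metagenome))   # MAG as query
print(containment(metagenome, mag))   # metagenome as query
print(jaccard(mag, metagenome))
```

In this toy case half of the MAG's hashes are found in the metagenome, while only 1% of the metagenome's hashes are found in the MAG, and the Jaccard similarity is lower still. A search that reports only matches above a cutoff (sourmash search exposes one as `--threshold`) could therefore report a match in one direction and none in the other.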


--> Try reverse:
sourmash search -k 31 /Users/eileen/Desktop/sourmash/sourmash/13C_Heavy1_Bin4.sig /Users/eileen/Desktop/sourmash/sourmash/marine_metagenome_1.sig --containment

-->Result: 0 matches: similarity match


--> Make signature from metagenomic reads (Marine metagenome 2) see above

--> Evaluate containment, that is, what fraction of the read content (Marine metagenome 2) is contained in the ref genome (MAG - Bin4).
sourmash search -k 31 /Users/eileen/Desktop/sourmash/sourmash/marine_metagenome_2.sig /Users/eileen/Desktop/sourmash/sourmash/13C_Heavy1_Bin4.sig --containment

-->Result: 0 matches: similarity match


--> Try reverse:
sourmash search -k 31 /Users/eileen/Desktop/sourmash/sourmash/13C_Heavy1_Bin4.sig /Users/eileen/Desktop/sourmash/sourmash/marine_metagenome_2.sig --containment

-->Result: 0 matches: similarity match


--> Make signature from metagenomic reads (Marine metagenome 3) see above

--> Evaluate containment, that is, what fraction of the read content (Marine metagenome 3) is contained in the ref genome (MAG - Bin4).
sourmash search -k 31 /Users/eileen/Desktop/sourmash/sourmash/marineSedimentMetagenome_3.sig /Users/eileen/Desktop/sourmash/sourmash/13C_Heavy1_Bin4.sig --containment

-->Result: 0 matches: similarity match


--> Run all again with --scaled 1000 instead of 10000, but again no hits.
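
How signatures built at different `--scaled` values relate to each other can be sketched as follows, assuming the cutoff for a given scaled value is 2^64 / scaled. The helper names and the random integers standing in for k-mer hashes are illustrative assumptions, not sourmash internals.

```python
import random

def max_hash_for(scaled):
    """Largest hash value (exclusive) retained at a given scaled value."""
    return 2**64 // scaled

def downsample(hashes, scaled):
    """Keep only the hashes that would survive at the given scaled value."""
    cutoff = max_hash_for(scaled)
    return {h for h in hashes if h < cutoff}

# Hypothetical data: 300,000 random 64-bit values standing in for k-mer hashes.
rng = random.Random(42)
all_hashes = {rng.getrandbits(64) for _ in range(300_000)}

fine = downsample(all_hashes, 1000)     # about 1 in 1,000 hashes kept
coarse = downsample(all_hashes, 10000)  # about 1 in 10,000 hashes kept

print(len(fine), len(coarse))
print(coarse <= fine)  # the coarse sketch is a subset of the fine one
```

Under this assumption a lower scaled value gives roughly ten times more hashes, and a fine sketch can always be reduced to a coarser one, but not the reverse. Two signatures are only comparable at the coarser of their two scaled values.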

--> Run against Desulfuromonas acetoxidans and Thermincola potens genomes as a control:

0 matches: similarity match
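
As a rough aid for judging what a control comparison can be expected to show, the following sketch estimates how many k-mers two genomes share under a deliberately simplified model in which substitutions hit each base independently, so that a k-mer survives intact with probability identity^k. The model and the function name are assumptions for illustration; real genomes also differ by indels and rearrangements, and this is not offered as a diagnosis of the result above.

```python
def expected_shared_kmer_fraction(identity, k):
    """Probability that all k bases of a k-mer are unchanged,
    assuming independent substitutions at per-base identity `identity`."""
    return identity ** k

for identity in (0.99, 0.95, 0.90, 0.80):
    frac = expected_shared_kmer_fraction(identity, 31)
    print(f"identity={identity:.2f}  k=31  expected shared fraction={frac:.4f}")
```

Under this model the shared fraction at k=31 falls off steeply as nucleotide identity drops, and a smaller k decays more slowly, so the choice of k affects how distant a relative can still produce a match.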


-->Try to run against genome bins from Tara Oceans (Delmont et al 2017)

-->Make indexed db from Tara ocean bins

--> Search
sourmash search /Users/eileen/Desktop/sourmash/sourmash/13C_Heavy_1_Bin4_scaled2000.sig /Users/eileen/Desktop/sourmash/data/DelmontTaraOceanDb.sbt.json -n 20

-->Results 0 matches: similarity match


ctb commented 5 years ago

hi eileen! the attachments didn't make it through to me - could you send them to me at titus@idyll.org? thank you!

EileenNeelie commented 5 years ago

Hi Titus, did you have a chance to look into this?

ctb commented 4 years ago

Closing for now; I ignored it for too long :(.