patrickwest / EukRep

Classification of Eukaryotic and Prokaryotic sequences from metagenomic datasets
MIT License

[Question] How does it handle organisms that have undergone secondary symbiosis? (e.g. diatoms) #5

Open jolespin opened 4 years ago

jolespin commented 4 years ago

I just found out about your tool via a conversation on the DAS_Tool GitHub. I'm looking forward to using it on my dataset, an ocean metagenome that contains several cyanobacteria, diatoms, and dinoflagellates.

How well does it handle complex symbiotic events where bacterial genes have been incorporated into the host genome?

I tested it on a well-established diatom, Phaeodactylum tricornutum, and the results make sense. The chloroplast was returned in the prokarya output, and so were some "bottom drawer" scaffolds that did not fit into any of the chromosomes. I guess this raises two questions:

(1) In a metagenome, would you expect this to sort contigs containing eukaryotic chloroplasts into the prokarya file?
(2) Do you think these bd_xxxx supercontigs could actually be bacterial in origin, i.e. potential contamination in the assembly?

(µ_env) Joshs-MacBook-Pro:Downloaded_Music mu$ EukRep -i /Users/mu/Downloads/Phaeodactylum_tricornutum.ASM15095v2.dna.toplevel.fa -o ~/Desktop/euk.o --prokarya ~/Desktop/prok.o
(µ_env) Joshs-MacBook-Pro:Downloaded_Music mu$ grep ">" ~/Desktop/prok.o
>chloroplast dna:chromosome chromosome:ASM15095v2:chloroplast:1:117369:1 REF
>bd_44x36 dna:supercontig supercontig:ASM15095v2_bd:bd_44x36:1:4734:1 REF
>bd_49x36 dna:supercontig supercontig:ASM15095v2_bd:bd_49x36:1:3009:1 REF
>bd_50x36 dna:supercontig supercontig:ASM15095v2_bd:bd_50x36:1:2663:1 REF
>bd_3x5 dna:supercontig supercontig:ASM15095v2_bd:bd_3x5:1:2590:1 REF
>bd_37x35 dna:supercontig supercontig:ASM15095v2_bd:bd_37x35:1:2489:1 REF
>bd_51x36 dna:supercontig supercontig:ASM15095v2_bd:bd_51x36:1:2113:1 REF
>bd_52x36 dna:supercontig supercontig:ASM15095v2_bd:bd_52x36:1:2040:1 REF
>bd_53x36 dna:supercontig supercontig:ASM15095v2_bd:bd_53x36:1:2027:1 REF
>bd_16x14 dna:supercontig supercontig:ASM15095v2_bd:bd_16x14:1:1992:1 REF
>bd_54x36 dna:supercontig supercontig:ASM15095v2_bd:bd_54x36:1:1351:1 REF
>bd_55x36 dna:supercontig supercontig:ASM15095v2_bd:bd_55x36:1:808:1 REF
>bd_36x35 dna:supercontig supercontig:ASM15095v2_bd:bd_36x35:1:554:1 REF
>bd_7x3 dna:supercontig supercontig:ASM15095v2_bd:bd_7x3:1:450:1 REF
>bd_6x3 dna:supercontig supercontig:ASM15095v2_bd:bd_6x3:1:450:1 REF
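
To put the list above in context, it can help to see how long each flagged sequence is. The Ensembl-style headers already carry that information in their location field (`type:assembly:name:start:end:strand`), so a minimal sketch like the following can tabulate it. The function name `header_lengths` and the regular expression are illustrative assumptions, not part of EukRep, and the sketch only works for headers that follow this Ensembl layout.

```python
import re
from typing import Iterable, List, Tuple

# Ensembl-style location field: type:assembly:name:start:end:strand
# (assumed layout, based on the headers shown in this thread)
LOCATION_RE = re.compile(
    r"^(?P<kind>[^:\s]+):(?P<assembly>[^:\s]+):(?P<name>[^:\s]+)"
    r":(?P<start>\d+):(?P<end>\d+):(?P<strand>-?\d+)$"
)


def header_lengths(headers: Iterable[str]) -> List[Tuple[str, str, int]]:
    """Return (sequence id, sequence type, length) per header, longest first."""
    rows = []
    for header in headers:
        fields = header.lstrip(">").split()
        if not fields:
            continue
        seq_id = fields[0]
        for field in fields[1:]:
            match = LOCATION_RE.match(field)
            if match:
                # Coordinates are 1-based and inclusive.
                length = int(match["end"]) - int(match["start"]) + 1
                rows.append((seq_id, match["kind"], length))
                break
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    # Three headers copied from the grep output above.
    example = [
        ">bd_7x3 dna:supercontig supercontig:ASM15095v2_bd:bd_7x3:1:450:1 REF",
        ">chloroplast dna:chromosome chromosome:ASM15095v2:chloroplast:1:117369:1 REF",
        ">bd_44x36 dna:supercontig supercontig:ASM15095v2_bd:bd_44x36:1:4734:1 REF",
    ]
    for seq_id, kind, length in header_lengths(example):
        print(f"{seq_id}\t{kind}\t{length}")
```

Feeding it the full output of `grep ">" ~/Desktop/prok.o` would list the chloroplast first, followed by the bd_ supercontigs in descending length.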

jolespin commented 4 years ago

Also, I just ran it on human:

(µ_env) Joshs-MacBook-Pro:Downloaded_Music mu$ EukRep -i /Users/mu/Downloads/GCA_000001405.28_GRCh38.p13_genomic.fna -o euk.txt --prokarya prok.txt
(µ_env) Joshs-MacBook-Pro:Downloaded_Music mu$ grep ">" prok.txt
>KI270715.1 Homo sapiens chromosome 2 unlocalized genomic contig, GRCh38 reference primary assembly
>KI270716.1 Homo sapiens chromosome 2 unlocalized genomic contig, GRCh38 reference primary assembly
>GL000225.1 Homo sapiens chromosome 14 unlocalized genomic contig, GRCh38 reference primary assembly
>KI270730.1 Homo sapiens chromosome 17 unlocalized genomic contig, GRCh38 reference primary assembly
>KI270302.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270304.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270303.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270305.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270310.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270316.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270315.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270312.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270311.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270412.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270411.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270414.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270419.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270418.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270420.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270424.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270417.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270422.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270423.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270425.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270429.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270466.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270465.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270438.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270510.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270509.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270518.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270508.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270516.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270529.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270528.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270530.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270539.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270544.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270548.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270583.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270587.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270580.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270330.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270329.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270334.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270333.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270335.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270338.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270340.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270336.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270337.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270363.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270364.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270378.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270379.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270389.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270390.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270387.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270395.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270396.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270388.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270394.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270386.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270391.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270383.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270393.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270384.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270392.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270381.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270385.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270376.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270374.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270372.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270373.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270375.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270371.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270741.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KI270756.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>GL000216.2 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly
>KZ208904.1 Homo sapiens chromosome 1 genomic contig HSCHR1_8_CTG3, GRC reference assembly NOVEL PATCH for GRCh38
>ML143350.1 Homo sapiens chromosome 5 genomic contig HG1395_PATCH, GRC reference assembly FIX PATCH for GRCh38
>KN196481.1 Homo sapiens chromosome 11 genomic contig HG2217_PATCH, GRC reference assembly FIX PATCH for GRCh38
>KZ208917.1 Homo sapiens chromosome 12 genomic contig HG2047_PATCH, GRC reference assembly FIX PATCH for GRCh38
>KZ208919.1 Homo sapiens chromosome 14 genomic contig HSCHR14_8_CTG1, GRC reference assembly NOVEL PATCH for GRCh38
>ML143374.1 Homo sapiens chromosome 17 genomic contig HG2087_PATCH, GRC reference assembly FIX PATCH for GRCh38
>KI270854.1 Homo sapiens chromosome 16 genomic contig, GRCh38 reference assembly alternate locus group ALT_REF_LOCI_1
>KI270856.1 Homo sapiens chromosome 16 genomic contig, GRCh38 reference assembly alternate locus group ALT_REF_LOCI_1
>GL383557.1 Homo sapiens chromosome 16 genomic contig, GRCh38 reference assembly alternate locus group ALT_REF_LOCI_1
>KI270861.1 Homo sapiens chromosome 17 genomic contig, GRCh38 reference assembly alternate locus group ALT_REF_LOCI_1
>KI270866.1 Homo sapiens chromosome 19 genomic contig, GRCh38 reference assembly alternate locus group ALT_REF_LOCI_1
>GL383581.2 Homo sapiens chromosome 21 genomic contig, GRCh38 reference assembly alternate locus group ALT_REF_LOCI_1
>GL949750.2 Homo sapiens chromosome 19 genomic contig, GRCh38 reference assembly alternate locus group ALT_REF_LOCI_5
>J01415.2 Homo sapiens mitochondrion, complete genome
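
A list this long is easier to read once the headers are grouped by what kind of sequence they describe. The sketch below tallies headers by keywords in the NCBI description text (unplaced, unlocalized, patch, alternate locus, mitochondrion). The category names and the helper functions `categorize` and `tally` are assumptions made for illustration; they rely only on the wording visible in the headers above and say nothing about how EukRep itself makes its calls.

```python
from collections import Counter
from typing import Iterable


def categorize(header: str) -> str:
    """Assign a FASTA header to a coarse category using its description text."""
    text = header.lower()
    # Order matters: check the most specific keywords first.
    if "mitochondrion" in text:
        return "mitochondrion"
    if "patch" in text:
        return "patch"
    if "alternate locus" in text:
        return "alternate locus"
    if "unlocalized" in text:
        return "unlocalized"
    if "unplaced" in text:
        return "unplaced"
    return "other"


def tally(headers: Iterable[str]) -> Counter:
    """Count headers per category, ignoring lines that are not FASTA headers."""
    return Counter(categorize(h) for h in headers if h.startswith(">"))


if __name__ == "__main__":
    # A few headers copied from the grep output above.
    example = [
        ">KI270715.1 Homo sapiens chromosome 2 unlocalized genomic contig, GRCh38 reference primary assembly",
        ">KI270302.1 Homo sapiens unplaced genomic contig, GRCh38 reference primary assembly",
        ">KZ208904.1 Homo sapiens chromosome 1 genomic contig HSCHR1_8_CTG3, GRC reference assembly NOVEL PATCH for GRCh38",
        ">KI270854.1 Homo sapiens chromosome 16 genomic contig, GRCh38 reference assembly alternate locus group ALT_REF_LOCI_1",
        ">J01415.2 Homo sapiens mitochondrion, complete genome",
    ]
    for category, count in tally(example).most_common():
        print(f"{category}\t{count}")
```

To run it on the real output, the header lines from `grep ">" prok.txt` could be passed to `tally` in place of the example list.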