Closed LalitNM closed 2 years ago
Hi @LalitNM, thank you for trying Magic-BLAST and I am sorry you ran into problems. Do you use the same BLAST database in both runs? It is unexpected that you get a continuous alignment in your second run. Are you able to share the cDNA and genomic sequences? Is there a way that I can have a look at them? It would make it easier for me to narrow down the problem.
Thank you for your response. Yes, I used same BLAST database in both runs. cDNA and genomic sequences are here -
cDNA:
>Zm00001eb178650_T001
GAGGTGGCACGCCGCACGTGCCACGGCCTCTTGTCCTGCGGTCGAGCGAAATCGATCTACCCCAGGTGCGGCGGGCGCGG
CTTTATTCTCAAGATTTTTTCTGTGTTCTTTAGAGAGCAGCCATGGACGAGTGGATTCGCCAAACCGAGGTGTGGGTGCA
CCAAACCGACAGCTGGATCCGGCAGCAGCCCCCGGAGCAGATCTATGTAGCGGCCGCGGTGGTTGCCGTCACAATCCTGT
TGCTTATCGTAGCATCTTGTCTGAAATCGTCTAGACCAAACACCATAGTTCTATCTGGGCTTAGTGGCAGTGGCAAAACC
ACTATTTACTACCAGCTTCGGGATGGATCATCACATCAAGGAACTGTTACTTCAATGGAAGAAAACAGTGATACTTTTGT
GCTGCATTCGGAGCAAGAAAGGAAAGACAAAGTAAAACCTGTGCATATTATTGATGTTCCTGGTCATGCTAGGCTCAAAC
CCAAACTTGATGAAGTCCTGCCTAAAGCTGCTGCGGTTGTCTTTGTTGTTGATGCTCAAGATTTCTTGTCTAGCATGCAA
GCTGCTGCAGAGTACCTTTATGACATTCTGACAAAGGCAACTGTAGTGAAGAAAAAGGTCCCTGTGCTTATATTCTGCAA
CAAGACAGATAAAGTTACCGCCCACTCGAAGGAATTCATCAAGAAGCAGTTGGAGAAAGAATTGAACAAGCTTAGGGAAT
CTAGGAACGCCGTATCGTCAGCCGACATCTCTGATGAAGTCCAGCTTGGGGTCCCTGGAGAGGCATTCAACTTCAGCCAG
TGCCAGAACAAGGTGGCTGTTGCGGAGGGTGCTGGTTTGACAGGTGATGTTTCGGCCGTGGAACAATTCATCCGTGAACA
TGTGAAGGCATAGGTTGTTCTTATTGTTTTTGGGGCTCCAAATGGCTATCCTGATGAGATAAGATGTACATTGCGTCAAT
GTGGTTGTTACCTTTCATGCATGTACACCTTCATCCGGTGCTCTGAAGTTGCTTAGAGACAGTGAAGCACAGTCATATTT
TGAACTGGTTCAGCTGACCTTGTTATAGAGGCATTGCATGTTGCACTGCTTATGCCTGACAGTCATCGGATATGAACAGA
GCATGTTGAAACTGACTGGCTTTGGTTGTTTTCCACTTTTGCTCCATAAGTTTTACTAGCAAAAATGCCTGTGCTTTGCA
ACGGGTTTCAAATAGATGATTTCATGACAATAAATTTCTCAGTGAAGCAAAACAAACAGAAAATGTCATGCTGACAGATT
TTAATTTGTTCTCATGT
>Zm00001eb178650_T002
ATACTTTCCCACTGCCGCAGTTCGCCACTCTCGCCATCGGGGACGAGGGACTGGGAGCCAGGATACTTCTGCCGCGCGGC
GACAGAGCAGCCATGGACGAGTGGATTCGCCAAACCGAGGTGTGGGTGCACCAAACCGACAGCTGGATCCGGCAGCAGCC
CCCGGAGCAGATCTATGTAGCGGCCGCGGTGGTTGCCGTCACAATCCTGTTGCTTATCGTAGCATCTTGTCTGAAATCGT
CTAGACCAAACACCATAGTTCTATCTGGGCTTAGTGGCAGTGGCAAAACCACTATTTACTACCAGCTTCGGGATGGATCA
TCACATCAAGGAACTGTTACTTCAATGGAAGAAAACAGTGATACTTTTGTGCTGCATTCGGAGCAAGAAAGGAAAGACAA
AGTAAAACCTGTGCATATTATTGATGTTCCTGGTCATGCTAGGCTCAAACCCAAACTTGATGAAGTCCTGCCTAAAGCTG
CTGCGGTTGTCTTTGTTGTTGATGCTCAAGATTTCTTGTCTAGCATGCAAGCTGCTGCAGAGTACCTTTATGACATTCTG
ACAAAGGCAACTGTAGTGAAGAAAAAGGTCCCTGTGCTTATATTCTGCAACAAGACAGATAAAGTTACCGCCCACTCGAA
GGAATTCATCAAGAAGCAGTTGGAGAAAGAATTGAACAAGCTTAGGGAATCTAGGAACGCCGTATCGTCAGCCGACATCT
CTGATGAAGTCCAGCTTGGGGTCCCTGGAGAGGCATTCAACTTCAGCCAGTGCCAGAACAAGGTGGCTGTTGCGGAGGGT
GCTGGTTTGACAGGTGATGTTTCGGCCGTGGAACAATTCATCCGTGAACATGTGAAGGCATAGGTTGTTCTTATTGTTTT
TGGGGCTCCAAATGGCTATCCTGATGAGATAAGATGTACATTGCGTCAATGTGGTTGTTACCTTTCATGCATGTACACCT
TCATCCGGTGCTCTGAAGTTGCTTAGAGACAGTGAAGCACAGTCATATTTTGAACTGGTTCAGCTGACCTTGTTATAGAG
GCATTGCATGTTGCACTGCTTATGCCTGACAGTCATCGGATATGAACAGAGCATGTTGAAACTGACTGGCTTTGGTTGTT
TTCCACTTTTGCTCCATAAGTTTTACTAGCAAAAATGCCTGTGCTTTGCAACGGGTTTCAAATAGATGATTTCATGACAA
TAAATTTCTCAGTGAAGCAAAACAAACAGAAAATGTCATGCTGACAGATTTTAATTTGTTCTCATGT
>Zm00001eb178650_T003
TCCCACTGCCGCAGTTCGCCACTCTCGCCATCGGGGACGAGGGACTGGGAGCCAGGATACTTCTGCCGCGCGGCGACCGG
TCGAGCGAAATCGATCTACCCCAGGTGCGGCGGGCGCGGCTTTATTCTCAAGATTTTTTCTGTGTTCTTTAGAGAGCAGC
CATGGACGAGTGGATTCGCCAAACCGAGGTGTGGGTGCACCAAACCGACAGCTGGATCCGGCAGCAGCCCCCGGAGCAGA
TCTATGTAGCGGCCGCGGTGGTTGCCGTCACAATCCTGTTGCTTATCGTAGCATCTTGTCTGAAATCGTCTAGACCAAAC
ACCATAGTTCTATCTGGGCTTAGTGGCAGTGGCAAAACCACTATTTACTACCAGCTTCGGGATGGATCATCACATCAAGG
AACTGTTACTTCAATGGAAGAAAACAGTGATACTTTTGTGCTGCATTCGGAGCAAGAAAGGAAAGACAAAGTAAAACCTG
TGCATATTATTGATGTTCCTGGTCATGCTAGGCTCAAACCCAAACTTGATGAAGTCCTGCCTAAAGCTGCTGCGGTTGTC
TTTGTTGTTGATGCTCAAGATTTCTTGTCTAGCATGCAAGCTGCTGCAGAGTACCTTTATGACATTCTGACAAAGGCAAC
TGTAGTGAAGAAAAAGGTCCCTGTGCTTATATTCTGCAACAAGACAGATAAAGTTACCGCCCACTCGAAGGAATTCATCA
AGAAGCAGTTGGAGAAAGAATTGAACAAGCTTAGGGAATCTAGGAACGCCGTATCGTCAGCCGACATCTCTGATGAAGTC
CAGCTTGGGGTCCCTGGAGAGGCATTCAACTTCAGCCAGTGCCAGAACAAGGTGGCTGTTGCGGAGGGTGCTGGTTTGAC
AGGTGATGTTTCGGCCGTGGAACAATTCATCCGTGAACATGTGAAGGCATAGGTTGTTCTTATTGTTTTTGGGGCTCCAA
ATGGCTATCCTGATGAGATAAGATGTACATTGCGTCAATGTGGTTGTTACCTTTCATGCATGTACACCTTCATCCGGTGC
TCTGAAGTTGCTTAGAGACAGTGAAGCACAGTCATATTTTGAACTGGTTCAGCTGACCTTGTTATAGAGGCATTGCATGT
TGCACTGCTTATGCCTGACAGTCATCGGATATGAACAGAGCATGTTGAAACTGACTGGCTTTGGTTGTTTTCCACTTTTG
CTCCATAAGTTTTACTAGCAAAAATGCCTGTGCTTTGCAACGGGTTTCAAATAGATGATTTCATGACAATAAATTTCTCA
GTGAAGCAAAACAAACAGAAAATGTCATGCTGACAGATTTTAATTTGTTCTCATGT
Genomic Sequence:
>Zm00001eb178650
CATGAGAACAAATTAAAATCTGTCAGCATGACATTTTCTGTTTGTTTTGCTTCACTGAGAAATTTATTGTCATGAAATCA
TCTATTTGAAACCCGTTGCAAAGCACAGGCATTTTTGCTAGTAAAACTTATGGAGCAAAAGTGGAAAACAACCAAAGCCA
GTCAGTTTCAACATGCTCTGTTCATATCCGATGACTGTCAGGCATAAGCAGTGCAACATGCAATGCCTCTATAACAAGGT
CAGCTGAACCAGTTCAAAATATGACTGTGCTTCACTGTCTCTAAGCAACTTCAGAGCACCGGATGAAGGTGTACATGCAT
GAAAGGTAACAACCACATTGACGCAATGTACATCTTATCTCATCAGGATAGCCATTTGGAGCCCCAAAAACAATAAGAAC
AACCTATGCCTTCACATGTTCACGGATGAATTGTTCCACGGCCGAAACATCACCTGTCAAACCAGCACCCTCCGCAACAG
CCACCTTGTTCTGGCACTGGCTGAAGTTGAATGCCTCTCCAGGGACCCCAAGCTGGACTTCATCAGAGATGTCGGCTGAC
GATACGGCGTTCCTAGATTCCCTAAGCTTGTTCCTACAACATGGTCAAAAACAGAGGGGCACAAGAAACCTTGTGAATAT
GATGTTTCACACAGTTGGACTAGCAAACATGTTTTCTGCATGTTGCCTTAAAGGTGGTGGTTATGGGTTACTATAATGGG
ACCATGGATGTGCGTTGGATTCAAAAAAAAGGGGGATAAACGAGAATTACAACTACTCCCAAACCAAAATAAATTATATT
TATTGGTGCAGCATAATATCTGGAAATATTCCAAATGCACATCAGATCTAGATAGAAATCACATGACCACAGCATTTATG
CAGCATGCACTACTGAGTTGAGTATTTCTAATTCTAATTATGCAAGTACGATAAATTTTGGGTCTTGGAGAGCTGTCCTT
ACAATTCTTTCTCCAACTGCTTCTTGATGAATTCCTTCGAGTGGGCGGTAACTTTATCTGTCTTGTTGCAGAATATAAGC
ACAGGGACCTTTTTCTTCACTACAGTTGCCTTTGTCAGAATGTCATAAAGGTACCTGCAAAAAGATGATAATATTGTTAT
GAAATGTTAACAACGTAAGAAATGATAAAACAGAGTAATGCTCTACAATGTTGGGAATAGTGGAGAAGGTTCCTATTCAC
ACAAATGATAATATAGGAAATTATTCAAACAGAATTGTTTCTGCAATCTCGGAATATGAAAAGGGCTCCTAATTACACTG
TTAACTTAAACTTAGCACAAAGGAAAGTGAAAGTAGATGAAATAGAAATAGACACCAAATGCATTTGGTAAAAATTAAGG
AAGGGGAGGGGCAAAATTTACTCTGCAGCAGCTTGCATGCTAGACAAGAAATCTTGAGCATCAACAACAAAGACAACCGC
AGCAGCTTTAGGCAGGACTTCATCAAGTTTGGGTTTGAGCCTAGCATGACCAGGAACATCAATAATATGCACAGGTTTTA
CTTTGTCTTTCTGCAGGGAAGGACAACAGGAAGGTTTAGCAGAAATATCGCAGTCAGGATATGCATGTTTTTTCATAAGA
AACAAAACTATCTACATTACTTTTCTCTCAATATGTTGGAAAAAGAATGAGCTCTAGAAGTTTGCATACCCTTTCTTGCT
CCGAATGCAGCACAAAAGTATCACTGTTTTCTTCCATTGAAGTAACAGTTCCTTGATGTGATGATCCATCCCGAAGCTGA
CAAGCACAAAAGAATTAGGAGAGCAACAAAAGATAGTACAAGATAACCCACCTACTTGTAAGAATTGGTATTGATAAAAA
TGCCAAATGATCTAATAGTAGACAAAAGGGGGGAAATATTCTGGTTAGTCAACTAAAACAAATCAACTAGTTTATATGGA
AAAAGCGAAGTCAATCACTCAATCTGAATAAGAAAGATATGTGTAAGCATGATAACAAGCAGAACTGATATCCGTCTAGA
ATGCTCATCCCACACCCTAAATAGAATCGAAAAATGAGGTTCAGATATTATTTTTGCAAACTAGATGCTACAACCTTGTC
CTTAACTTGTAAAAGCACAGACCAAAAAAGTTGGAAGTTACCTGGTAGTAAATAGTGGTTTTGCCACTGCCACTAAGCCC
AGATAGAACTATGGTGTTTGGTCTAGACGATTTCAGACAAGATGCTGCAGAAAAAAAACACAAAAAGGAATTGCTTATCA
TTTACCACTCTAATATAATGAATTGTATCATAGAGCAGACATCTTAACAAGCAACACATCGGCACCCCATGTATTGAATC
ATAAGGTATCTCCATTAGGTGGAGCCAGGACCTAGGATATCCAGGCTCCAGCCCTGGTTTTAGGCCTAAAATTACATGGA
AGACAACTTAAGAAACTGCACAATATTGAAACTAAAATGAATTGCAGCATGTTTTAACTAGCTCAGCCCTGGGCTCTAAC
CCAATCCTGGCTCCGCCCCTGGGTATCTATGAATTTAAAATAGAAAGGAAAACAAACACCGTCTTCCCTTCACAATACTA
AACATACCATCATCGGATACATCATGCTCAAGAACTCGAAAATGTTATATCATCTCAGAGCTGCAACAATTTACAATGCA
CAATAACTAATAATGTAATAAAAACATAGGCCTAATCTTGGGTTATTCCATCAATAGCACACCCATCATTACTACCCAGT
TGGCAATTCCGAAACTTGTAACACAATGACACATAAGAGTCTGAATGAACCAAAGCTGAAAAATTCCAGATCCACAACTA
GTTTATTCTCCAGTGATGAACCAACCAACCCCTCTGATGCTTTTAGGTGACAATGCCCACCTTGTGGATGTTCGCCACAC
AACATTCAAAGACTCTCAAGACCATGGGCACTCGATACAGAGCAAAAGGATCTTAGGTTTCGCACCGCACCTACGATAAG
CAACAGGATTGTGACGGCAACCACCGCGGCCGCTACATAGATCTGCTCCGGGGGCTGCTGCCGGATCCAGCTGTCGGTTT
GGTGCACCCACACCTCGGTTTGGCGAATCCACTCGTCCATGGCTGCTCTCTAAAGAACACAGAAAAAATCTTGAGAATAA
AGCCGCGCCCGCCGCACCTGGGGTAGATCGATTTCGCTCGACCGCTGCATAATAAGAAACCAAATCAAGCAACCGATGGA
CCGGGTAAGTACCGCAACCTGCAAAATGCAGTCCCCCCCCCCCCCCCCACACACACACACACACACACCACCACCAAGCA
AGGTGGGGTCTCCTACCACGGATACCAAAATTAGATCTAATACGAGGTACACGATAAGCAGAGGGTGGCGGGGAGAGCAA
GAGGAGGAGGAGGGCTACGTCGCCGCGCGGCAGAAGTATCCTGGCTCCCAGTCCCTCGTCCCCGATGGCGAGAGTGGCGA
ACTGCGGCAGTGGGAAAGTATAGCTGCTAGGGAGATGGACGGACCAGGACAAGAGGCCGTGGCACGTGCGGCGTGCCACC
TC
Thank you for your sequences. Magic-BLAST produces spliced alignments for all exons with your cDNA sequences. In SAM and BAM files all axons are shown in a single line. You can verify that these are spliced alignments by looking at the CIGAR string (6th column in the SAM file), for example: 1S593M369N132M287N149M139N87M365N83M746N214M214N77M
, where "M" represents exons and "N" represents introns.
I am not too familiar with bam2bed
, but according to documentation in https://bedops.readthedocs.io/en/latest/content/reference/file-management/conversion/bam2bed.html you should use --split
option for RNA-seq alignments. It looks like bam2bed
without this option creates a single BED entry for the whole spliced alignment. This is why you see a single alignment in your graph. bam2bed --split
creates a BED entry for each exon. Please, try this:
magicblast -query cdna.fa -db blastdb | samtools view -bS | samtools sort -O BAM | bam2bed --split > aln-mb.bed
Thank you for helping, though I could not produce desired .bed
file, but I solved my problem by visualising the .sam
file directly. Seems that the bam2bed
command does not work properly, It leaves the CIGAR string unaffected in .bed
format, which should not be the case. Considering that issue is with bam2bed
command, I am closing this issue here. Here is the final screenshot of the allignment with .sam
file.
Hi, I tried magic-blast to align a toy cDNA to a toy genome, I could either produce alignment of whole sequence (in blue) or the first exon only (in red).
Commands used to produce results
I want to know is there a way I can produce the expected alignment in a .bed file using magic-blast?
PS: I investigated respective .sam files as well, but could not find the information of all exons and introns there.