Closed xingjianleng closed 2 years ago
@xingjianleng
You might combine the description into a single paragraph, e.g.,
Project Descriprion: The goal of this project is to develop efficient graph-based sequence alignment algorithms. Current sequence alignment algorithms don't scale well with long sequences, even when these sequences are closely related. We plan to investigate whether the de Bruijn graph model enables an efficient way to align closely related sequences. More specifically, we first build a de Bruijn graph from input sequences, covert the de Bruijn graph into the partial order graph, and finally output the sequence alignment. We will investigate how to decycle the de Bruijn graph and collapse its bubbles into local partial order graphs in a divide-and-conquer way.
Comments:
I endorse Yu Lin's remarks. (But you can still include it in the final report.)
@biolinyu @GavinHuttley Thanks for the comments!
Description:
Goals:
To provide a pairwise sequence alignment algorithm with lower time and memory complexity.
Problem to solve:
Current sequence alignment algorithms don't scale well when the input sequences are increasingly longer. One of the most popular sequence alignment software "MUSCLE" leads to Segmentation Fault when aligning two sequences with 30,000 lengths. So, I would like to experiment with whether there is an efficient way to align two closely related sequences.
Possible approaches of solutions:
Generally, we are converting the sequences into a de Bruijn graph then map to a partial order graph to do the alignment. We can first construct the de Bruijn graph for two input sequences. Then, we could use some decycling techniques to ensure the de Bruijn graph doesn't contain any circles. For the linear part of the de Bruijn graph, we can add these nodes to a partial order graph. For bubbles in the graph, we can call partial order alignment to get the solution to sub-problems and add them to the partial order graph.
Five keywords description:
Bioinformatics; Pairwise sequence alignment; de Bruijn graph, Partial order graph; Graph theory;
Milestones:
Semester 1:
Week 1 - 4: Literature review Week 5 - 7: Fast prototype implementation Week 8 - 9: Code refactoring & Adaptation to corner cases Week 10 - 12: Report introduction draft
Semester 2:
Week 1 - 3: Rewriting the algorithm with a more efficient programming language Week 4 - 8: Finishing the rest part of the report & preparing for presentation Week 9 - 11: Polishing the report and presentation