Closed jinnaiyuu closed 9 years ago
1 . read literature about MSA and its implementation on A* and DP.
components of MSA state representation: the sequence and a gap so far. heuristic: based on pairwise sequence alignment presented in Korf et al. 2005 edge cost: PAM 250 is a score martix for aminos. It corresponds to a edge cost in A* search. expand: Branches will be one of the aminos for k-strings or a gap. hash for closed list: take polynomial for now. not trivial to make it easy. hash for distribution: XORing the entire sequence. structured zobrist: XORing every three amino of the sequence.
state representation: n dimensional vector expand: each integer +1 or +0. therefore branching factor = 2^n - 1. (all 0 shouldn't happen)
This gonna be a bit tricky.
Input filetype: tfa, as it has the simplest structure to parse.
TODO: 1 . Implementation of heuristic.
2 . Zobrist Hash.
TODO: 1 . Implementation of heuristic. Done.
2 . Zobrist Hash. Done. Also implemented abstraction.
3 . Weighted A* for pruning.
TODO: 3 . Weighted A* for pruning. Done. May need to optimize the weight.
TODO: 4 . Optimize overall performance.
TODO: 5 . Recalculate node send ratio. 6 . Run A* search on instances used in Kobayashi et al.
TODO: 7 . Implement Hyperplane work distribution
Implement multiple sequence alignment for HDA* SZ.
milestones 1 . read literature about MSA and its implementation on A* and DP. 2 . Implement pairwise sequence alignment with blind heuristic. 3 . Implement heuristic for pairwise sequence alignment. 4 . Expand the domain to MSA. 5 . Implement hash distribution for MSA and run on HDA*. 6 . Build structured zobrist for MSA. 7 . Implement hyperplane work distribution for MSA.