Closed qkrclrl701 closed 5 years ago
About your first question, this was also pointed out during the lecture, so the professor fixed it. Guess it was catta, gagtat, tagga -> cattagga, gagtat
Seems like the scribble hasn't been updated.
lecture-scribble.txt
is not at all meant to be read :) Yes, you're right that the example is wrong. I'll update it some time in the future.
Recall that "work" is the total amount of computations and "span" is the length of the longest dependency chain. Even in parallel overlap function, the work does not reduce to O(log min(|s|, |t|)). The parallelism is suggested by the fact that the span is not O(|s|*|t|), but O(log |s| + log |t|).
@jeehoonkang
For example, if we have two strings of length 2 and length 100, say s = ab and t = abab.....ab
Then, we have to check if 2 pairs of suffix/prefix match. (b, a) and (ab, ab). Each checking can be done in parallel, using reduce of logical and. We will check O(min(|s|, |t|)) number of suffix/prefix match, and their length is bounded by O(min(|s|, |t|)) , so it costs O(log min(|s|, |t|)) span. And then, we have to find the maximum length of matching suffix/prefix, and it can be done using reduce of maximum. There are O(min(|s|, |t|)) pairs, so maximum costs O(log min(|s|, |t|)) span.
Could you please explain in detail, why is has O(log |s| + log |t|) span? As I remember, the algorithm is the same as what is done in the class, and I don't understand why it has such span..
Yeah, as you said, a more precise analysis gives you O(log min(|s|, |t|)) span.
In the lecture-scribble.txt, there's an example of solve SS problem using greedy algorithm. I think the line catta, gagtat, tagga -> catta, gagtatagga Is wrong. We can have ovelap of 2 by merging tagga, gagtat in order to get catta, taggagtat. I think it is obvious, but the textbook have a same process, so I'm confused.
When we calculate max overlap(s, t), textbook says it costs O(|s||t|) work and O(log|s|+log|t|) span. It says for non-parallel case, we calculate the overlap by brute force, then it is right, but O(min{|s|, |t|}^2) is more accurate. For the parallel case, by parallelly checking if n-length preffix of s and n-length suffix of t by reduce of logical and. Then, I think it is also costs O(log(min{|s|, |t|})). Is my explanation wrong, or the textbook uses weaker but simple upper bound to calculate overall cost easily?