walkccc / CLRS

📚 Solutions to Introduction to Algorithms Third Edition
https://walkccc.me/CLRS
MIT License
4.65k stars 1.26k forks source link

Problem 8-7 #367

Open Xeuva opened 3 years ago

Xeuva commented 3 years ago

I come up a solution.

a. let A_sorted be the array which A is sorted to using algorithm X, because

Let A[p] be the smallest value in A that algorithm X puts into the wrong location, and let A[q] be the value that algorithm X moves to the location into which A[p] should have gone.

let i be the index of the right location where A[p] should go into.
A_sorted[i] should be A[p],
but X failed at A[p], thus
A_sorted[i] = A[q] != A[p]
And because A[p] is the smallest value where X failed,
so all A_sorted[1...i-1] is in right location, and all elements is smaller than A[p], thus
A_sorted[i] = A[q] > A[p].

b. Sorting B is the same as sorting A.
Let i, j be indices, and without loss of generality i < j, A[i] > A[j], there are 3 cases when compare-exchange happened.
if A[i] <= A[p], and A[j] <= A[p], B[i] = B[j] = 0, swap A is like swap B, because B do not change
if A[i] > A[p], and A[j] > A[p], B[i] = B[j] = 1, swap A is like swap B, because B do not change
if A[i] > A[p], and A[j] <= A[p], B[i] = 1, B[j] = 0. Swap is needed in A and B at same indices, swap A is like swap B.

c. Whether algorithm used in odd steps is oblivious, it produces same result. Even steps is independent with values in rectangular array. So overall we can treat columnsort as oblivious.

d. Let us denote D be the maximum difference of number of 0s between columns.

  1. After step3, because rectangular array is only consist of 0s(at top) or 1s(at bottom), The number of dirty rows between all 0s rows and all 1s rows is equal to D. For example, in all columns, c[i] has minimum number of 0s n_i, and c[j] has maximum number of 0s n_j. The number of dirty rows is D = n_j - n_i.
  2. After step 1, in all columns, 0s goes up and 1s goes down. In every column, we can group elements of length s without overlapping, and then each group is then became row in order in step 2. In every column, there is one and only one change from 0 to 1 in some group. When a group of all 0s or all 1s become row in step 2, the difference of number of 0s between columns do not change. When a group of mixed 0s and 1s become row, the difference may increase by one. Before step2, there are s columns, each column only have one group where has mixed 0s and 1s, thus after step2, contributes at most s to the difference D. So D is at most s.

e. After step 4, every row(from left to right) become a group in column(from up to down). If row is all 0s or all 1s, then it become a clean area in column-major order after step 4. If row is mixed 0s and 1s, it become dirty area after step 4. From question d, we know only there are at most s dirty rows in sequence, each row has s elements, at most totally s^2 elements, which then become a dirty area of at most s^2 elements after step 4.

f. Steps 5–8 is the same as sorting overlapping grouping all elements in column-major order of length r by step r/2. That is, first sort column 1(length r), and then bottom of column 1 and top of column 2 in column-major order (length r), and then column 2, and so on. After step 4, only at most s^2 elements in sequence is not sorted. Because r >= 2s^2, the dirty sequence is belong to one of the groups described as above. Steps 5-8 sorts all these groups, and thus sorts the dirty sequence.

g. Divide rows into two upper and lower parts. Upper part has r' rows, s can divide r', r'= floor(r/s) s. Lower part has r%s rows, meaning at most s-1 rows. After steps 1–3, the dirty rows is from two parts. Using the same analysis in question d, upper part contributes at most s rows, lower part contributes at most s-1 rows, totally 2s-1 rows. Each dirty row has s elements. Using same anaylsis in question e and f, we need to sort (2s-1)s elements in steps 5-8, which implies that r >= 2(2s-1)*s.

h. Padding all 0s or all 1s (more generally padding -inf or +inf when sorting any numbers) in the bottom before step 1.