This is about the sorting problem of the student's height. The number of students is small enough to fit in an annual photo. How cute is that! What a small problem it is! Any sorting algorithm will work fine.
Furthermore, the data set is already "almost" sorted because the students are asked to stand in a single line in non-decreasing order by height. This is the case where the bubble sort, known to be notorious for slow performance, can exceptionally work well. As you know, the lower bound of bubble sort is O(n) for the already sorted inputs and does not require auxiliary space which is great too. But, the upper bound is O(n**2). So, what comes to my mind is a counter sort. It performs linearly and asymptotically superior to others with the controlled data set.
Assumption
I can assume the inputs are positive integers based on the function definition and examples. If it were a real interview, I would reconfirm with the interviewer before making a decision to use counting sort because it is primarily effective only for small, positive integers and not suitable for decimal values with high precision.
The other things that I have to care about is the possibility of an outlier. Unfortunately, the counting sort is susceptible to performance issues when outliers are present in the data. However, the constraint showed that the number is simplified with integers with the controlled range, with no such outlier to become 100,000 or 1,000,000. Given constraints show that there is no outlier to increase the auxiliary space requirement for count sort.
Constraints:1 <= heights.length <= 100, n =100 (number of elements)
1 <= heights[i] <= 100, k=100 (range of input value)
Comparative sorting vs Counting sort:
To compare ordinary comparative sorting algorithms such as quick sort or merge sort with O(n log n), n log n = 100 * 6.644 ≈ 664.4. On the other hand, the upper bound of counting sort is k = 100, n = 100, which is k + n = 200.
Approach
The key idea of counter sort is to use an array.
Create an array. The index will be the value, and its value in the index is the count.
Based on that, generate expected array
Compare heights and expected
Solution
class Solution:
def heightChecker(self, heights: List[int]) -> int:
if not heights:
return 0
# O(k)
counter = [0]*(max(heights)+1)
# O(n)
for height in heights:
counter[height] += 1
# O(n+k)
expected = []
for i in range(len(counter)): # for i in range(1,101):
expected.extend([i]*counter[i])
# O(n)
cnt_wrong_order = 0
for i in range(len(heights)):
if heights[i] != expected[i]:
cnt_wrong_order += 1
return cnt_wrong_order
Performance
In this case, memory usage is not critical though, there is room for improving space usage. Using an array is an exemplary concept in a textbook to explain counting sort. However, it requires k length of space to pre-build a counter array. Depending on the data distribution, the space can remain unused after initialization or even can be sparse.
In case of space saving is concerned and it is ok not to be a stable sort - the initial order is maintained within the sorted order, there are two places to decrease the space.
Use a dictionary instead of an array
we can replace it with a dictionary. Using built-in python function, collections.Counter was also in consideration, but soon after I decided to not to use it. I wanted to track the minimum and maximum heights while I count the elements.
Remove the expected array
According to the problem description, we are allowed to use expected array. However, if it is a real interview, I would ask if this is really needed for other purpose or suggest not to use it. In a way that we count misplaced elements while iterating through the counter from the min_height to the max_height + 1.
from collections import defaultdict
class Solution:
def heightChecker(self, heights: List[int]) -> int:
if not heights:
return 0
counter = defaultdict(int) # key: height, value: cnt
min_height, max_height = 100, 0
for height in heights:
counter[height] += 1
if height > max_height:
max_height = height
elif height < min_height:
min_height = height
idx = 0
cnt_mismatch = 0
for i in range(min_height, max_height+1): # narrowed the scope of the range(1,101)
height, cnt = i, counter[i]
for _ in range(cnt):
if heights[idx] != height:
cnt_mismatch += 1
idx += 1
return cnt_mismatch
The same technique is applied to prevent `Memory Limit Exceeded` for another problem 900. RLE Iterator.
**See also:**
https://github.com/goungoun/leetcode/issues/37 https://github.com/goungoun/leetcode/blob/main/Array/rle_iterator.py
## Coding Style
If I have to write code for work, not for practice, I would not implement a sorting algorithm by myself spending time for debugging and testing. We have a built-in function for `sorted`.
```python3 []
class Solution:
def heightChecker(self, heights: List[int]) -> int:
expected = sorted(heights, reverse=False)
return sum(1 for i in range(len(heights)) if heights[i] != expected[i])
Even, one line of return code works so fine!
class Solution:
def heightChecker(self, heights: List[int]) -> int:
return sum(1 for x, y in zip(heights, sorted(heights)) if x!=y)
Complexity
Bubble Sort:
Time complexity: O(n) if sorted. Worst: O(n**2)
Space complexity: expected list O(n) = O(n)
Counting Sort:
Time complexity: O(n+k)
Space complexity: O(k), expected list O(n) is optional
Tim Sort (built-in):
For arrays of size 63 or less, Tim sort uses insertion sort.
Time complexity: worst O(n*logn), alreay sorted O(n)
Intuition
This is about the sorting problem of the student's height. The number of students is small enough to fit in an annual photo. How cute is that! What a small problem it is! Any sorting algorithm will work fine.
Furthermore, the data set is already "almost" sorted because the students are asked to stand in a single line in non-decreasing order by height. This is the case where the bubble sort, known to be notorious for slow performance, can exceptionally work well. As you know, the lower bound of bubble sort is O(n) for the already sorted inputs and does not require auxiliary space which is great too. But, the upper bound is O(n**2). So, what comes to my mind is a counter sort. It performs linearly and asymptotically superior to others with the controlled data set.
Assumption
I can assume the inputs are positive integers based on the function definition and examples. If it were a real interview, I would reconfirm with the interviewer before making a decision to use counting sort because it is primarily effective only for small, positive integers and not suitable for decimal values with high precision.
Example: heights = [1,1,4,2,1,3] heights = [5,1,2,3,4]
The other things that I have to care about is the possibility of an outlier. Unfortunately, the counting sort is susceptible to performance issues when outliers are present in the data. However, the constraint showed that the number is simplified with integers with the controlled range, with no such outlier to become 100,000 or 1,000,000. Given constraints show that there is no outlier to increase the auxiliary space requirement for count sort.
Constraints:
1 <= heights.length <= 100
, n =100 (number of elements)1 <= heights[i] <= 100
, k=100 (range of input value)Comparative sorting vs Counting sort: To compare ordinary comparative sorting algorithms such as quick sort or merge sort with O(n log n), n log n = 100 * 6.644 ≈ 664.4. On the other hand, the upper bound of counting sort is k = 100, n = 100, which is k + n = 200.
Approach
The key idea of counter sort is to use an array.
expected
arrayheights
andexpected
Solution
Performance
In this case, memory usage is not critical though, there is room for improving space usage. Using an array is an exemplary concept in a textbook to explain counting sort. However, it requires k length of space to pre-build a
counter
array. Depending on the data distribution, the space can remain unused after initialization or even can be sparse.In case of space saving is concerned and it is ok not to be a stable sort - the initial order is maintained within the sorted order, there are two places to decrease the space.
Use a dictionary instead of an array we can replace it with a dictionary. Using built-in python function,
collections.Counter
was also in consideration, but soon after I decided to not to use it. I wanted to track the minimum and maximum heights while I count the elements.Remove the expected array According to the problem description, we are allowed to use
expected
array. However, if it is a real interview, I would ask if this is really needed for other purpose or suggest not to use it. In a way that we count misplaced elements while iterating through the counter from the min_height to the max_height + 1.class Solution: def heightChecker(self, heights: List[int]) -> int: if not heights: return 0
Even, one line of return code works so fine!
Complexity
Bubble Sort:
Counting Sort:
Tim Sort (built-in): For arrays of size 63 or less, Tim sort uses insertion sort.
https://en.wikipedia.org/wiki/Timsort#cite_note-python_timsort-13
Conclusion
The problem is suitable for practicing different types of other algorithms. I've made all them for fun!
See also: https://github.com/goungoun/leetcode/blob/main/Sort/count_sort.py https://github.com/goungoun/leetcode/blob/main/Sort/bubble_sort.py https://github.com/goungoun/leetcode/blob/main/Sort/heap_sort.py https://github.com/goungoun/leetcode/blob/main/Sort/quick_sort.py https://github.com/goungoun/leetcode/blob/main/Sort/merge_sort.py