emory-courses / dsa-java

Data Structures and Algorithms in Java
https://emory.gitbook.io/dsa-java/
42 stars 55 forks source link

Knuth Shell Sort Gaps #54

Closed lcunild closed 4 years ago

lcunild commented 4 years ago

Comparison-based Sort

Why is n/3 the largest gap size to be used for the Knuth sequence, where n is the size of the list to be sorted? Why would it not be the largest number in the Knuth sequence which is smaller than n?

lujiaying commented 4 years ago

This is a very good question.

The choice of gap sequence in Shell Sort is critical and difficult. Different gap sequences lead to different performance when dealing with real-world arrays that need to be sorted.

In Wikipedia,

With respect to the average number of comparisons, Ciura's sequence[15] has the best known performance; gaps from 701 were not determined but the sequence can be further extended according to the recursive formula {\displaystyle h{k}=\lfloor 2.25h{k-1}\rfloor }hk = \lfloor 2.25 h{k-1} \rfloor.

[15] Ciura, Marcin (2001). "Best Increments for the Average Case of Shellsort" (PDF). In Freiwalds, Rusins (ed.). Proceedings of the 13th International Symposium on Fundamentals of Computation Theory. London: Springer-Verlag. pp. 106–117. ISBN 978-3-540-42487-1.

Back to Knuth's Sequence, when it was developed by Donald Knuth, he formulated it as that way (I am not sure if it is from some theoretical proof or not). Here we just follow his formula since this is one of the most popular sequences in Algorithm textbooks.

jdchoi77 commented 4 years ago

@lcunild we have discussed about this in the class; it's due to the size of each gap group where at least 3 keys to be included.