j3-fortran / fortran_proposals

Proposals for the Fortran Standard Committee
178 stars 15 forks source link

Sorting #101

Open jacobwilliams opened 4 years ago

jacobwilliams commented 4 years ago

I wonder if sorting should be built into the language? This is such a fundamental thing that currently take a staggering amount of code to make it work with all intrinsic types and user-defined types (see for example here). I think C++ has sorting built in to its standard library.

Features:

I think it goes without saying that any well-designed future generic/templating feature should also allow one to apply a sorting algorithm (either one from a third-party library or maybe even the hypothetical intrinsic one) to a user-defined type with a minimum of code.

certik commented 4 years ago

@jacobwilliams indeed. We had to implement our own (https://github.com/certik/fortran-utils/blob/b43bd24cd421509a5bc6d3b9c3eeae8ce856ed88/src/sorting.f90) and it's not very efficient. This would be a great addition.

rweed commented 4 years ago

agree 100% about the need for sorting routines. Since there are numerous examples of quicksort etc. in the literature, implementing this in a standard library should be a no-brainer. I would point to the code in Robin Vowels excellent book "Algorithms and Data Structures in F and Fortran", Dick Hanson and Tim Hopkins book "Numerical Computing with Modern Fortran", and Ajen Markus's FLIBS implementation of quicksort. Also, if Robin Vowels is following this and the other issues on this site, I would love an updated version of your book focused on Modern Fortran implementations of the algorithms and data structures with additional coverage of things like octrees etc. I would buy it yesterday.

certik commented 4 years ago

Let's implement good sorting routines in stdlib (#104). If we make stdlib successful, that would get us 90% there. There might be some language features that would be helpful, but we can have that discussion later.

leonfoks commented 4 years ago

I have a decent introspection sort in Coretran. Right now it's interfaced and operates on contiguous memory primitive arrays, real32, real64, int32, int64. I have argsort too. I went further than a quicksort to add robustness for edge cases. Here's the description from my module.

!! Uses an introspective sort on a set of numbers. See this "http://www.geeksforgeeks.org/know- your-sorting-algorithm-set-2-introsort-cs-sorting-weapon/" for more information !! !! To begin, a quicksort with a median of three pivot is used until the size of the array is less than 16. At this point, an insertion sort is used to reduce cache overhead and tail recursion. !! Unfortunately, a quicksort is not ideal for sorted/almost sorted arrays or arrays with duplicate values. Therefore if the number of iterations exceededs a threshold, a heapsort is used instead. !! This provides a robust sorting algorithm that is still very fast for almost sorted arrays. !! !! In this implementation, the quicksort and heapsort are unstable sorts. A stable merge sort is therefore provided as an alternative but it has an order(N) memory overhead. !! !! Often, the numbers wish to be maintained in their given order, so with an O(N) memory overhead we can sort an integer array instead by calling argsort()

The stable merge sort doesn't really matter for numbers, will be important if used for arbitrary types though.

leonfoks commented 4 years ago

Also related, maybe it should be a separate issue.

Should searching algorithms be part of the stdlib? Binary search or interval search? I use a binary search for a lot of functions in Coretran (again interfaced for the different primitives), including KdTree, the dynamic arrays, inserting into a sorted list etc.

certik commented 4 years ago

I think searching algorithm would also fit into stdlib.