Open EmperorArthurIX opened 2 years ago
With binary search and jump search in mind we can also have a SortedOneDimentionalArray
as data structure which would take OneDimentionalArray
as input and would dynamically choose the sorting algorithm based on how far the array is already sorted which would come from its Hamming Distance (wheather it would be insertionSort
which works best for nearly sorted arrays and quickSort
if the array is not sorted).
Interesting idea. I think the idea of SortedOneDimensionalArray
should be discusses in a separate issue. Is there any prior work (like Wikipedia article, papers, or lecture notes available online) which can provide clarity and examples?
I have moved the conversation to Gitter once we reach a consensus I will create a separate issue.
Gitter
It's inactive. Folks use Discord these days a lot. Github issue would be the best place to discuss as it is accessible by anyone in the open source community.
And may be we should provide a function that should take an array as input and suggest one of the pydatastructs APIs as output. The time complexity should be O(n) because anything more won't be worth it. The constant in O(n) should also be very small (we can benchmark for that).
The above mentioned declarations are suggested, where the parameter
unsorted_input
can be used to explicitly learn from the user whether or not the inputarray
is sorted beforehand. IfTrue
, we must first sort the array within the search function, then search.
I think that if a searching algorithm require input array to be sorted then it should assume that. Otherwise we won't be able to guarantee lower time complexities of algorithms like binary search, exponential search. The overhead of sorting an array is non-trivial, in-fact even checking whether an array is sorted or not will take o(n) time which is large as compared to O(logn) promised by binary search. So, let's keep things simple and encapsulated for now. I would suggest something like this, search(input_array, algorithm)
where algorithm
would be a string (non-optional argument), binary_search
, linear_search
, jump_search
, exponential_search
, etc.
The above mentioned declarations are suggested, where the parameter
unsorted_input
can be used to explicitly learn from the user whether or not the inputarray
is sorted beforehand. IfTrue
, we must first sort the array within the search function, then search.I think that if a searching algorithm require input array to be sorted then it should assume that. Otherwise we won't be able to guarantee lower time complexities of algorithms like binary search, exponential search. The overhead of sorting an array is non-trivial, in-fact even checking whether an array is sorted or not will take o(n) time which is large as compared to O(logn) promised by binary search. So, let's keep things simple and encapsulated for now. I would suggest something like this,
search(input_array, algorithm)
wherealgorithm
would be a string (non-optional argument),binary_search
,linear_search
,jump_search
,exponential_search
, etc.
Noted, will think of solutions from that perspective while planning and coding.
The input should not necessarily be an array. A function should work too. An array is conceptually also a function (of indices). See this https://en.wikipedia.org/wiki/Ternary_search for example. Allowing inputs as function would make API more practical.
Shall I work on this one?
Sure go ahead.
@skullVoid has already opened a PR. Seems like he/she haven't linked it here yet.
Hey! I can solve the issue mentioned above. Please assign the work to me. I'll write the code of all searching algo with their complexity defined.
@varsha080 Sure go ahead. We don't assign issues. Please read, https://github.com/codezonediitj/pydatastructs/wiki/Issue-Policy
Hello! I would like to work on this issue under GSoC 2023. I can write the code of the searching algorithms in an efficient manner. Shall I work on this?
Is this issue still open? Can I work on this?
Yes. Please go ahead.
Just to clarify what the problem statement, We need to implement linear search algorithm, binary search algorithm, jump search algorithm and create a common search function where its arguments are array, key, unsorted_input, algorithm where unsorted_input is a parameter necessary for binary search and jump search, and optional for linear search. If unsorted_input is true we need to return sorted array with index of key in the sorted array. In linear search unsorted_input can be given to do the same. If key not found in array, it returns -1 in position and a sorted array if unsorted_input=True. Am I correct?
Please note https://github.com/codezonediitj/pydatastructs/issues/446#issuecomment-986186631 and go ahead with your work. Feel free to take decisions on the basis of your judgement. Thanks.
I would be happy to contribute to the project as I find my skills a good fit for the issue . As a GSSOC 23 member I humbly ask to allow me to contribute to the issue.
Description of the problem
The codebase is very extensive and well-designed; however, it does not seem to include searching algorithms for linear data structures as far as I could observe. I believe addition of searching algorithms would be a good improvement to the project. Certain algorithms I have thought can be included are:
The above mentioned declarations are suggested, where the parameter
unsorted_input
can be used to explicitly learn from the user whether or not the inputarray
is sorted beforehand. IfTrue
, we must first sort the array within the search function, then search.Comments
With certain modifications, we may even allow to return the sorted array along with search result, but I have no idea on how that will be done at the moment.
PS: I am currently working on the Radix Sort issue that is already open, and after that reaches a satisfactory stage, I wish to go on with this one. Discussions and suggestions are welcome.