Morwenn / cpp-sort

Sorting algorithms & related tools for C++14
MIT License
617 stars 57 forks source link

Improve Documentation for Users #203

Closed marc1uk closed 1 year ago

marc1uk commented 2 years ago

As everyone who's looking for the 'fastest sorter' knows, there's no universal answer, and the only way to find out what works best for any given situation is to try things and benchmark them. This project is a great resource for that! It provides access to a huge array of sorters, and the universal interfaces make it very easy to wrap up a range of implementations into a common interface, to assist with batch testing.

However. While very well documented compared to most github projects, there's a clear focus here on developers wishing to add their new algorithm into the framework. For the less expert layperson simply wishing to run the existing algorithms, a few examples would go a long way (especially thanks to the common interfaces). But those examples are unfortunately missing. I've been struggling for a while now to figure out how to actually use the indirect_adapter to sort a c-style array of objects. I can provide a projector method, or a lamdba, or a functor, or whatever is required.... but i don't know what's required. The documentation describes what it is, and how it works, but not how to use it. The only snippet of code is

template<typename Sorter>
class indirect_adapter;

which isn't relevant for me as i'm not using c++17, and ... well it just declares a template typename that doesn't get used, and a class. I don't even know what this is supposed to be telling me. Going from the example of the hybrid adapter on the homepage I tried (among many other things)

    cppsort::indirect_adapter< cppsort::pdq_sorter > sorter;
    MyObject* objects = new MyObjects[20000];
    ... fill array ...
    sorter(objects, objects+20000, std::greater<>{}, &MyObject::Get_Time);

but I just get a wall of template errors. The same happens with

   sorter(objects, objects+20000, [](const MyObject& obj){ return obj.Get_Time();} );

and

   sorter(objects, objects+20000, [](const MyObject& obj_a, const MyObject& obj_b){ return obj_a.Get_Time() < obj_b.Get_Time(); } );

Some of these sorters handle containers rather than ranges, for which it seems that using a sorter facade is required (which is again, not especially clear from the documentation), so i've also tried putting one of those into the adapter, but again no luck. I'm clearly shooting in the dark here.

Providing even a single example for each functionality would go a long way.

Morwenn commented 2 years ago

For what it's worth, I have no idea what's wrong with what you tried, it seems fine at a glance and I would have expected it to work. Humongous template barfs are unfortunately a known issue (see #28) and despite having tried a bunch of things, I haven't found an easy way to make the situation better without writing much more code. More often than not, a simple const issue turns otherwise valid code into error mayhem.

The documentation does lack at least a quickstart with simple examples of sorters or sorter adapters. I don't plan to give examples of all sorters or adapters since they're all supposed to accept the same parameters, and since they are wrapped into sorter_facade, listing the available operator() overloads would be redundant and not really helpful. That's why I tried to split the documentation into "nomenclature" and different patterns, with section introductions explaining the common parts of the components to follow: it does solve the redundancy problem, but it's true that it doesn't help beginners to just start using the library without reading lots of information.

I guess that I will write a quickstart that shows examples of cool things, trying to showcase the main features, but I won't write an example of each sorter or adapter since it all boils to almost the same thing with different names (for the useful ones at least) :x

Maybe I can help you find your issue if you don't mind sharing your test class and/or error messages. Improving the error messages is difficult, but another common helpful thing I could do would be to add a page to the documentation describing common errors and how to use them.

Thanks for your feedback anyway!

marc1uk commented 2 years ago

Thanks for writing the project! I agree that not much documentation would be required to be sufficient; a few examples like:

together with any notes on how those classes interact with other - for example, can one pass a facade to an adapter? Or an (indirect) adapter to another (verge) adapter?

Any specific help would be great if you don't mind. The class i'm trying to sort is actually a very simple object, but it packs in a few variables into a single byte array, so the key to sort on requires a cast and bitmask. The simplest version would be:

struct ID_OD_hit{
  unsigned char bytes[10];
  unsigned long Get_Time(){return (*((unsigned long*) &bytes[0]) ) & 281474976710655UL; }
  void Set_Time(unsigned long time){memcpy(&bytes[0], &time, 6);}
};

Scouring the source code I've seen examples using member variables (e.g. &ID_OD_Hit::bytes) as a projector, but that wouldn't work in this case due to the masking required (it also fails to compile). It's unclear to me whether the projection can be a member function though - I think so...but i can't get one to compile. It's also not clear on exactly how to pass comparators or projectors, or both, or what signatures they should have (should they accept pointers, or iterators, or objects via reference...?). Removing the const qualifier from the comparator version didn't help, unfortunately.

The only simplification the above makes is that ideally I would subtract off a baseline time before sorting; which would require either a lambda:

unsigned long basetime;
auto projector = [basetime](const ID_OD_Hit& myhit){ return (myhit.Get_Time() - basetime); }
sorter(hitarray, hitarray+arraysize, projector);

or a functor

struct projector {
    projector(unsigned long btime) : basetime(btime){};
    unsigned long operator()(const ID_OD_Hit& myhit) const { return myhit.Get_Time() - basetime; }
    unsigned long basetime;
};
projector myprojector(42);
sorter(hitarray, hitarray+arraysize, myprojector);

but i can't get any of these to compile.

Unforunately as you say, the excessive templating makes it essentially impossible to debug - some versions I tried produced over 8,500 lines of errors for one bad call! :exploding_head:

Morwenn commented 2 years ago

Thanks for writing the project! I agree that not much documentation would be required to be sufficient; a few examples like:

  • here's an example of sorting an array, a container, some iterators, for basic types
  • here's an example of using a facade, and some cases where it would be necessary
  • here's an example of using an adapter (e.g. along with any specific differences in usage of the different adapters, or their use with containers/arrays/iterators)
  • here's an example of sorting objects, with a projector member, or method, free function, functor...

together with any notes on how those classes interact with other - for example, can one pass a facade to an adapter? Or an (indirect) adapter to another (verge) adapter?

Those things should be covered by a quickstart yeah, especially since they are theoretically rather simple to use. For example adapting a sorter generally yields another sorter, so you can just adapt it again without more ceremony. sorter_facade is only a tool for sorters implementers, and simple users shouldn't have to care, but it would be worth mentioning.


Concerning your issue I'm honestly surprised, I compiled and ran the following code without a problem:

struct ID_OD_hit{
    unsigned char bytes[10];
    unsigned long Get_Time(){return (*((unsigned long*) &bytes[0]) ) & 281474976710655UL; }
    void Set_Time(unsigned long time){memcpy(&bytes[0], &time, 6);}
};

int main()
{
    cppsort::indirect_adapter<cppsort::pdq_sorter> sorter;
    ID_OD_hit* objects = new ID_OD_hit[20000];
    //... fill array ...
    sorter(objects, objects+20000, std::greater<>{}, &ID_OD_hit::Get_Time);
}

What version of the library and what compiler are you using?

marc1uk commented 2 years ago

hmm, i'm certain i tried the pdq_sorter just to see if the sorter was the issue, but something else must have been wrong at that time, as that does seem to work okay. In fact, it seems like i've been trying with the single combination that doesn't work - using the spread_sorter with an indirect_adapter throws all sorts of errors. The pdq_sorter works with the indirect, verge and schwartz adapters, and all my attempts at projectors and comparators. The spread_sorter is fine with a schwartz or verge adapter, but fails with an indirect adapter.

Morwenn commented 2 years ago

Interesting, I believe indirect_adapter<spread_sorter> should work when only given a projection because it doesn't strictly require the comparison to do its job, but I've got to admit that it doesn't compile. I'll move that part to another issue tomorrow and see whether I can get it fixed for the next version.

Morwenn commented 1 year ago

Hi, it took longer than expected, but I finally added a quickstart page to the library: https://github.com/Morwenn/cpp-sort/blob/develop/docs/Quickstart.md

If you don't mind, could you review it briefly and tell me whether it answers the questions you had when you tried using the library? Thanks again for taking the time to report the issue 🙂