Rework lambda mechanism to improve speed and usability

odinthenerd commented 8 years ago

moved here to stop spamming twitter

Edit:

Relevant tweet: https://twitter.com/odinthenerd/status/740900401340354560

odinthenerd commented 8 years ago

definition of terms:

selector = what args is now, the _1, _2 etc. tags
lambda = a template with placeholder selectors in it or any of its children
aggregate = a template with no placeholders in its parameters or their parameters recursively

a lambda is basically a tree of types with either selector types or non template types at the leaves. we really don't have any way to tell if a template type we encounter is a lambda or just an aggregate so we test now. If we were to write a recursive function which walks the tree unpacking types and swapping out the selector types in such a way that it does not matter if it is walking a lambda or an aggregate we would not have to test and this could (my gut tells me) run faster.

we need a way of treating lambdas as types 'in current context' which we call defer or protect or something else depending on the MPL now. If we were to specialize our walker to unwrap but terminate drill down on such types we could use that tag to minimize walking into branches which have no selectors (say my lambda is looking for a match to an old school type list, with no tag we have no way of knowing whether to drill down to the end of that list looking for a placeholder).

For that matter we could force the user to tag any non lambda argument, then drill down would be lightning fast and we wouldn't need protect which can be confusing to stupid users like me at first. maybe we could call it _p as in:

using l = list<list<int>, int, list<list<int>, int>>;  //don't want to walk this
using res = find < haystack, or_<
    std::is_same<_p<l>, _1>, 
    std::is_same<_p<std::is_pod<_1>>, _1> //we are looking for a match for is_pod, not executing it
>>;

odinthenerd commented 8 years ago

note: updated code

namespace brigand{
    template<typename T>
    struct _p {};
    template<std::size_t N>
    struct selector{};

    namespace detail {
        template<typename T, typename...Ts>                                         //user error
        struct call_impl {
            using type = T;
        };

        template<template<typename...> class F, typename... Fs, typename...Ts>      //lambda
        struct call_impl<F<Fs...>, Ts...> : F<typename call_impl<Fs, Ts...>::type...> {};

        template<template<typename...> class F, typename T>                         //lambda single argument fast track
        struct call_impl<F<selector<0>>, T> : F<T> {};

        template<template<typename...> class F, typename T, typename U>             //lambda bind first fast track
        struct call_impl<F<selector<0>, _p<U>>, T> : F<T,U> {};

        template<template<typename...> class F, typename T, typename U>             //lambda bind second fast track
        struct call_impl<F<_p<U>, selector<0>>, T> : F<U,T> {};

        template<std::size_t N, typename...Ts>                                      //placeholder (should be fast tracked)
        struct call_impl<selector<N>, Ts...> {
            using type = at_c<list<Ts...>, N>;
        };

        template<typename T0, typename...Ts>                                        //placeholder (should be fast tracked)
        struct call_impl<selector<0>, T0, Ts...> {
            using type = T0;
        };

        template<typename T0, typename T1, typename...Ts>                           //placeholder (should be fast tracked)
        struct call_impl<selector<1>, T0, T1, Ts...> {
            using type = T1;
        };

        template<typename T, typename...Ts>                                         //non lambda parameter
        struct call_impl<_p<T>, Ts...> {
            using type = T;
        };
    }

    template<typename T, typename...Ts>
    using call = typename detail::call_impl<T, Ts...>::type;                        //calls a lambda 
}
namespace lazy {
    template<typename T, typename U, typename V>
    struct transform;
    template<template<typename...> class L, typename...Ls, typename U>
    struct transform<L<Ls...>, U, void> {
        using type = L<brigand::call<U, Ls>...>;
    };
    template<template<typename...> class L, typename...Ls, typename U, typename V>
    struct transform<L<Ls...>, U, V> {
        using type = L<brigand::call<U, Ls>...>;
    };
}

template<typename T, typename U, typename V=void>
using transform = typename lazy::transform<T, U, V>::type;

Any template with non type parameters is magically interpreted as a non-lambda, only problem is that all types which are templates taking only type parameters must be wrapped with a _p<>, so breaking change, but should be blazing fast.
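To make the effect of _p concrete, here is a trimmed-down, self-contained sketch of the walker above. It is only an illustration: the namespace sketch is hypothetical, only _1 is supported, and the fast tracks are left out.

```cpp
#include <cstddef>
#include <type_traits>

namespace sketch {
    template<typename T> struct _p {};            // tags an argument as "not a lambda"
    template<std::size_t N> struct selector {};   // positional placeholder
    using _1 = selector<0>;

    namespace detail {
        // Fallback: a non-template type is passed through unchanged.
        template<typename T, typename... Ts>
        struct call_impl { using type = T; };

        // Lambda: evaluate every parameter, then inherit F's nested ::type.
        template<template<typename...> class F, typename... Fs, typename... Ts>
        struct call_impl<F<Fs...>, Ts...> : F<typename call_impl<Fs, Ts...>::type...> {};

        // Placeholder _1: select the first call argument.
        template<typename T0, typename... Ts>
        struct call_impl<selector<0>, T0, Ts...> { using type = T0; };

        // _p<T>: stop walking and yield T as-is.
        template<typename T, typename... Ts>
        struct call_impl<_p<T>, Ts...> { using type = T; };
    }

    template<typename Lambda, typename... Ts>
    using call = typename detail::call_impl<Lambda, Ts...>::type;
}

// std::add_pointer<_1> is walked and evaluated with int.
using ptr = sketch::call<std::add_pointer<sketch::_1>, int>;
// The wrapped lambda is handed back untouched instead of being evaluated.
using kept = sketch::call<sketch::_p<std::add_pointer<sketch::_1>>, int>;
```

With this sketch, ptr is int* while kept is the type std::add_pointer<_1> itself, which is the "lambda as a type" case.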

Actually I kind of like wrapping non lambda parameters because it makes the situation where you want to pass a lambda as a type more logical. On the other hand we could wrap all lambdas but then how would you express that a lambda should be handled as a type in this context but a lambda in the next? In my own experience lambdas are more common than non lambdas in this context so it would make sense to wrap the less common case.

Anything that gets us away from the nested apply template should be a step in the right direction, template members of template classes seem to be a dusty corner.

brunocodutra commented 8 years ago

If I may weigh in, I totally agree with @porkybrain's view. The distinction between a lambda expression and a list based on the presence of a placeholder is very artificial. What if one needs a constant lambda expression, i.e., one which is bound to fixed parameters?

That said, I feel that providing support for lambda expressions, which are lazy by definition, doesn't make much sense given that the rest of Brigand is eager, but that's just my opinion of course.

brunocodutra commented 8 years ago

Anything that gets us away from the nested apply template should be a step in the right direction, template members of template classes seem to be a dusty corner.

Perhaps you should consider something along the same lines of Metal. It never relies on nested types/templates, rather an invocable is any type that matches the following signature (or will be after I merge brunocodutra/metal#32 which moves entirely to an eager design):

template<template<typename...> class> struct invocable{};
template<typename...> using f = /* ... */;

using result = metal::invoke<invocable<f>, args...>; // -> f<args...>

Internally metal::invoke extracts f out of invocable<f> by matching its signature.
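For readers who have not seen the technique, signature matching of this kind could look roughly like the following. This is a sketch under assumptions, not Metal's actual source: the namespace sketch, invoke_impl and the helper alias add_pointer_t are made up for the example.

```cpp
#include <type_traits>

namespace sketch {
    // Any type of this shape is invocable; it has no nested members at all.
    template<template<typename...> class F>
    struct invocable {};

    namespace detail {
        template<typename Callable, typename... Args>
        struct invoke_impl;  // intentionally undefined for non-invocables

        // Recover F by matching the signature invocable<F>.
        template<template<typename...> class F, typename... Args>
        struct invoke_impl<invocable<F>, Args...> {
            using type = F<Args...>;
        };
    }

    template<typename Callable, typename... Args>
    using invoke = typename detail::invoke_impl<Callable, Args...>::type;

    // Example eager metafunction (hypothetical, for illustration only).
    template<typename T>
    using add_pointer_t = typename std::add_pointer<T>::type;
}

using result = sketch::invoke<sketch::invocable<sketch::add_pointer_t>, int>;  // int*
```

The pack is expanded into F inside a class template rather than directly in an alias, which sidesteps the usual trouble with expanding a pack into a fixed-arity alias template.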

odinthenerd commented 8 years ago

So you are on the other side of the wrap lambdas vs. wrap non lambdas debate ;) somehow I just like the syntax transform<l,std::add_pointer<_1>> too much.

brunocodutra commented 8 years ago

So you are on the other side of the wrap lambdas vs. wrap non lambdas debate

I'm not sure which side is which, but judging from your implementation above, I’d say I’m on your side. Basically I favour the Lisp notation, i.e., a list is an expression unless quoted (protect is called quote in Metal BTW).

somehow I just like the syntax transform<l,std::add_pointer<_1>> too much

It is neat indeed and the sole reason why Metal hasn't gone eager long ago, but I finally gave up on lambda expressions in favour of the slightly more verbose, but strictly equivalent transform<l, bind<lambda<std::add_pointer_t>, _1>>
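To show what such a bind expression evaluates to, here is a minimal eager sketch. It is an assumption-laden illustration rather than Metal's implementation: the names lambda, bind and _1 mirror the ones used above, everything else (namespace sketch, invoke_impl, arg, add_pointer_t) is hypothetical.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

namespace sketch {
    template<template<typename...> class F> struct lambda {};   // wraps an eager alias
    template<std::size_t N> struct arg {};                       // positional placeholder
    template<typename F, typename... Xs> struct bind {};         // composition node
    using _1 = arg<0>;

    namespace detail {
        template<typename Callable, typename... Args>
        struct invoke_impl;

        // lambda<F>: apply F to the arguments directly.
        template<template<typename...> class F, typename... Args>
        struct invoke_impl<lambda<F>, Args...> { using type = F<Args...>; };

        // arg<N>: pick the N-th argument.
        template<std::size_t N, typename... Args>
        struct invoke_impl<arg<N>, Args...> : std::tuple_element<N, std::tuple<Args...>> {};

        // bind<F, Xs...>: invoke each Xs with the arguments, then feed the results to F.
        template<typename F, typename... Xs, typename... Args>
        struct invoke_impl<bind<F, Xs...>, Args...>
            : invoke_impl<F, typename invoke_impl<Xs, Args...>::type...> {};
    }

    template<typename Callable, typename... Args>
    using invoke = typename detail::invoke_impl<Callable, Args...>::type;

    template<typename T>
    using add_pointer_t = typename std::add_pointer<T>::type;
}

using one = sketch::invoke<sketch::bind<sketch::lambda<sketch::add_pointer_t>, sketch::_1>, int>;
```

Here one is int*, and bind expressions nest, so binding the same lambda twice yields int**.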

jfalcou commented 8 years ago

Multiple points:

Brigand is either lazy or eager. You have the choice: brigand::foo is eager, brigand::lazy::foo is lazy. You need laziness for higher-order composability
I'm open to w/e break we need as long as the syntax is not horrible

brunocodutra commented 8 years ago

You need laziness for higher-order composability

I wouldn't go as far as to say laziness is required for composability, but I agree it makes it terser.

ericniebler commented 8 years ago

Meta has lambdas but avoids them internally and discourages their use because they are too heavyweight. Meta prefers transform<l, quote<std::add_pointer_t>> to the lambda equivalent of transform<l, lambda<_a, defer<std::add_pointer_t, _a>>>

EDIT: And to stress-test my lambda implementation, I use this:

namespace l = meta::lazy;
template <class L>
using cart_prod = reverse_fold<
    L, list<list<>>,
    lambda<
        _a, _b,
        l::join<l::transform<
            _b, lambda<_c, l::join<l::transform<_a, lambda<_d, list<l::push_front<_d, _c>>>>>>>>>>;

What does the cartesian product look like in Brigand if you write it with lambdas?

odinthenerd commented 8 years ago

Totally agree with Eric, plus I like the bind stuff in meta. Actually though when adding pointers to 1000 unique integral_constants my call function is only 5% slower than meta's transform<l, quote<std::add_pointer_t>> so it is much faster than current brigand.

The more I think about it though the more I think assuming eager will end up faster, albeit uglier, than assuming lazy. We may have to make a fundamental decision at some point how much noise we are willing to put up with in the name of speed.

I will try to get some more benchmarks done, hard to concentrate at home though, my son is teething.

ericniebler commented 8 years ago

In meta, lambdas are really never needed with bind_front and bind_back and the occasional custom template alias. And without lambdas, there's rarely the need for lazy evaluation. I'm convinced eager is the right default.
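As a rough illustration of why bind_front covers the common cases, the following sketch rebuilds the pieces in a few lines. The names quote, bind_front and transform are borrowed from Meta, but the implementation is an assumption made for this example and is not Meta's source; is_same_t is a hypothetical helper alias.

```cpp
#include <type_traits>

namespace sketch {
    template<typename... Ts> struct list {};

    namespace detail {
        // Indirection so that the pack is expanded into C inside a class template.
        template<template<typename...> class C, typename... Ts>
        struct apply { using type = C<Ts...>; };

        template<typename L, typename F> struct transform_impl;
        template<typename... Ts, typename F>
        struct transform_impl<list<Ts...>, F> {
            using type = list<typename F::template invoke<Ts>...>;
        };
    }

    // quote: turns an eager alias into a callable with a nested invoke.
    template<template<typename...> class C>
    struct quote {
        template<typename... Ts>
        using invoke = typename detail::apply<C, Ts...>::type;
    };

    // bind_front: fixes the leading arguments of a callable.
    template<typename F, typename... Front>
    struct bind_front {
        template<typename... Ts>
        using invoke = typename F::template invoke<Front..., Ts...>;
    };

    template<typename L, typename F>
    using transform = typename detail::transform_impl<L, F>::type;

    template<typename A, typename B>
    using is_same_t = typename std::is_same<A, B>::type;
}

// Compare every element against float without writing a lambda.
using flags = sketch::transform<sketch::list<int, float>,
                                sketch::bind_front<sketch::quote<sketch::is_same_t>, float>>;
```

In this sketch flags is list<std::false_type, std::true_type>; no placeholder substitution happens anywhere, only alias expansion.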

edouarda commented 8 years ago

My work here is done, I can now resume hitting F5 on the stars counter.

odinthenerd commented 8 years ago

I updated my algorithm (see second post in this thread) adding a few fast tracks and now it's beating meta (by a hair) on clang 3.7 using real lambda syntax (at least in my test). For my (very unscientific) test I have a 1000 element list of unique integral_constants which gets transformed in different ways. I comment in only one of the tests below and use the compiler diagnostic to tell the compilation time (I cranked up verbosity).

using a = meta::transform <l, meta::bind_front<meta::quote<is_same_t>, float>>;
using b = transform<l, std::is_same<_1,_p<float>>>;
using c = meta::transform < l, meta::quote<std::add_pointer_t> >; 
using d = transform<l, std::add_pointer<_1>>;

Let me know if there is a faster way to do a, I'm not good at meta yet. In both cases meta was a few percent slower. Maybe we don't need binding after all? (or maybe I overlooked something, happens often enough ;))

edouarda commented 8 years ago

Maybe we can improve the lambda plumbing to make it faster?

odinthenerd commented 8 years ago

I think the _p<> is key, it keeps us from having to walk through types which are not lambdas just to make sure they aren't and more importantly allows us to fast track in the common trivial case of one parameter. That and getting rid of the nested apply template. For example b is like 10x faster with my new algorithm than with current brigand. I would like to benchmark against meta in the more complicated Cartesian product case but first I have to go read what the hell a Cartesian product is in the first place ;) (damn educated elites and their fancy named algorithms)

edouarda commented 8 years ago

Why "_p" ? What does it mean?

Edit: sorry, saw your earlier post

edouarda commented 8 years ago

You are not protecting the parameter, you are hinting to brigand that it's not a lambda. In other words, you are pinning the parameter, what about brigand::pin<> ?

odinthenerd commented 8 years ago

the idea would be if you use _p on a lambda it would have the same effect as brigand::protect has now, so there is no need for many different flavors of protection. I don't really care what it's named, _p is a working title.

I just realized why Eric names the parameters of his lambdas rather than just taking positional args. Positional can be ambiguous when cascading lambdas... hmm now my brain hurts, I think we need to do that too (or do we already and I'm just stupid again, what's the brigand equivalent of Eric's cart_prod above??).

odinthenerd commented 8 years ago

It's probably not the right place to argue on behalf of sexiness but I would like to have the power of meta lambdas but the terseness of transform<l, std::add_pointer<_1>>, have it all be the same thing, at least conceptually and have my cake and eat it too, and a puppy for that matter.

Jokes aside maybe we could make an argument alias for the case where you need to reference a lambda's arg from a nested lambda, when walking we could just tack on the extra args to the args we already have and have the alias be a special selector. Problem with that is we would have to start wrapping things again which kills the sexiness. This would also screw up short circuiting but only in the case that it is used.

We could make a 'super<_1>' tag which would select the arg from its "superlambda" as in one level up, and pass all the args upstream while walking, this would make the Cartesian product look like this

template <class L>
using cart_prod = reverse_fold<
    L, list<list<>>,
    l::join<l::transform<
            _2, _p<l::join<l::transform<super<_1>, _p<list<l::push_front<_1, super<_1>>>>>>>>>>>;

brunocodutra commented 8 years ago

Regarding the implementation of the cart_prod, wouldn't it be better to nest calls to transform?

This is how it would look using lambda expressions in current [lazy] Metal

template<typename x>
using cart_prod_t = apply_t<
  lambda<join>,
  transform_t<
    transform<list<quote<_1>, quote_t<_1>>, quote_t<x>>, 
    x
  >
>;

Which I admit is much more readable than the equivalent implementation in [eager] Metal as brunocodutra/metal#32 currently stands

template<typename x>
using cart_prod = apply<
  lambda<join>,
  transform<
    bind<lambda<transform>, bind<lambda<bind>, quote<lambda<list>>, bind<lambda<quote>, _1>, quote<_1>>, quote<x>>, 
    x
  >
>;

Well, that's the price I'll have to pay for refusing to support an eager/lazy hybrid.

odinthenerd commented 8 years ago

yes there do seem to be trade-offs between different implementations of MPLs which is why I called attention to the fact that some hybrid of current MPLs will no doubt be standardized and we really should get that right. Lambdas do seem to be the crux too, as far as I can see standardizing a public interface of most algorithms does not dictate their implementation and is therefore not so important, with lambdas it is... I do think speed is pretty important because everyone who does not need speed will be using hana style. Library implementers and infrastructure builders need speed at all costs. I would like to establish a baseline of how fast we can make sexy run before we start making it ugly in the name of speed. By the way although I seldom agree with you I am very glad there are others with different views out there. What is the advantage to being all eager?

edouarda commented 8 years ago

Speed is paramount and one of the main goals of brigand. Because metaprogramming libraries are a fundamental building block, there is an amplification effect of their compilation speed. If we deliver a lightweight, instant compile time metaprogramming library, library authors will be more open-minded about including metaprogramming in their arsenal.

edouarda commented 8 years ago

I agree, however we should make it easy to use first, then make it fast, up to the limit where it still remains comfortable to use.

ericniebler commented 8 years ago

I suspect you'll find that if you try to standardize meta-lambdas at all, you'll make your job 10x harder.

edouarda commented 8 years ago

I agree with @ericniebler I think we should bring a solution that's convenient for most use cases. If I grep quasardb source code, we have little to no complex meta-lambda.

odinthenerd commented 8 years ago

basically every algorithm takes a lambda/callable, if you don't standardize the 'callable' interface then what's left to standardize in an MPL?

ericniebler commented 8 years ago

In meta, Callable is just a class type with a nested invoke template alias. It doesn't need to be more complicated. EDIT: in fact, in Peter Dimov's world, it's even simpler. Just pass template template parameters.
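The two styles can be put side by side in a short sketch. The names transform_callable, transform_tt and add_pointer_t are hypothetical and chosen for this example only; neither is taken from Meta or from Peter Dimov's code.

```cpp
#include <type_traits>

namespace sketch {
    template<typename... Ts> struct list {};

    // Style 1: a callable is a class with a nested invoke alias template.
    struct add_pointer_f {
        template<typename T>
        using invoke = T*;
    };

    template<typename L, typename Callable> struct transform_callable;
    template<typename... Ts, typename Callable>
    struct transform_callable<list<Ts...>, Callable> {
        using type = list<typename Callable::template invoke<Ts>...>;
    };

    // Style 2: pass the metafunction itself as a template template parameter.
    template<typename L, template<typename...> class F> struct transform_tt;
    template<typename... Ts, template<typename...> class F>
    struct transform_tt<list<Ts...>, F> {
        using type = list<F<Ts>...>;
    };

    template<typename T>
    using add_pointer_t = typename std::add_pointer<T>::type;
}

using via_callable = sketch::transform_callable<sketch::list<int, char>, sketch::add_pointer_f>::type;
using via_template = sketch::transform_tt<sketch::list<int, char>, sketch::add_pointer_t>::type;
```

Both produce list<int*, char*>. The difference is that a callable is an ordinary type and can be stored in a list or returned from another metafunction, whereas a template template argument can only be passed along as such.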

edouarda commented 8 years ago

@porkybrain I think we shouldn't obsess with standardization but more about making the library useful. Standardization will appear as the libraries mature.

odinthenerd commented 8 years ago

It makes it so that you have to wrap everything that does not have an invoke template which pollutes the API and I'm still not convinced it's the fastest solution out there.

ericniebler commented 8 years ago

Template template parameters are the fastest solution out there. Aside from that, sprinkling meta::quote here and there seems no worse to me than sprinkling brigand::_p here and there.

brunocodutra commented 8 years ago

By the way although I seldom agree with you I am very glad there are others with different views out there. What is the advantage to being all eager?

@porkybrain Being either all eager or all lazy, doesn't matter which, has the great advantage of keeping the API concise and symmetrical. See, users have a very limited attention span (speaking from personal experience) and it is very important that a library be easy to grasp (especially if you think about standardization). Metal for instance is entirely built upon 6 very intuitive and straightforward concepts, which can be defined by no more than a couple of lines of text each: Value, Number, Lambda, List, Pair and Map.

Now regarding the verbosity of eager composition that I mentioned before, I found a very simple solution that does not require lazy metafunctions and I can't believe it didn't occur to me before. One only needs to provide aliases such as the following (notice the capital letter)

template<typename lbd, typename... seqs>
using Transform = bind<lambda<transform>, lbd, seqs...>;

template<typename lbd, typename... lbds>
using Bind = bind<lambda<bind>, lbd, lbds...>;

/* ... */

... and voilà! We have the terseness of lambda expressions without a single lazy metafunction (compare with the previous examples)

template<typename x>
using cart_prod = apply<
  lambda<join>,
  transform<
    Transform<Bind<quote<lambda<list>>, Quote<_1>, quote<_1>>, quote<x>>, 
    x
  >
>;

EDIT: in fact, in Peter Dimov's world, it's even simpler. Just pass template template parameters.

But then you lose the ability to compose and along with it the joy of TMP.

brunocodutra commented 8 years ago

Template template parameters are the fastest solution out there.

@ericniebler That's the approach of [eager] Metal, but in order to retain the ability to compose, they are simply wrapped within a type of appropriate signature:

template<template<typename...> class> struct wrapper {};

using callable = wrapper<std::add_pointer_t>; // anything that matches this signature is callable

ericniebler commented 8 years ago

I found a very simple solution that does not require lazy metafunctions and I can't believe it didn't occur to me before. One only needs to provide aliases such as the following ...

You've discovered meta::defer. :-) Everything in Meta's lazy namespace is a type alias defined in this way with defer.

in order to retain the ability to compose, they are simply wrapped within a type of appropriate signature

In Meta, wrapper is called quote. We keep rediscovering the same things over and over.

brunocodutra commented 8 years ago

You've discovered meta::defer. :-) Everything in Meta's lazy namespace is a type alias defined in this way with defer.

Hmm, please correct me if I'm wrong, but when I last took a look at Meta, it seemed to me that meta::defer was rather a lazy adaptor for eager metafunctions, i.e., conditionally defining ::type to be the result of the invocation so as to make sure it is SFINAE friendly. What I propose is somewhat different, as there won't be any nested ::type anywhere. In my example above Transform<F, L>::type doesn't mean anything as it is not a lazy metafunction, but only an adaptor from lambda expressions to strictly equivalent, but much more verbose, bind expressions.

In Meta, wrapper is called quote. We keep rediscovering the same things over and over.

Indeed, I'd noticed it before, the difference is that Meta relies on meta::quote to define a nested invoke<> much like MPL, whereas Metal allows any type that matches that very signature to be invoked by simply extracting the metafunction by template matching at the point of instantiation.

BTW, here goes a heads up for the performance aficionados, I compared extracting the metafunction from the type signature against relying on a nested template and I seem to have observed some performance penalty for the latter both on GCC and on Clang. In fact I observe the same thing when accessing any nested element of a type, especially std::integral_constant's ::value, when compared to matching its template signature to extract the value directly.

pfultz2 commented 8 years ago

Hmm, please correct me if I'm wrong, but when I last took a look at Meta, it seemed to me that meta::defer was rather a lazy adaptor for eager metafunctions, i.e., conditionally defining ::type to be the result of the invocation so as to make sure it is SFINAE friendly.

You could create a defer alias that would do this, then:

template<template<class...> class Template, class... Xs>
using defer = bind<lambda<Template>, Xs...>;

It is terser than having to write bind<lambda<transform>, lbd, seqs...>.

brunocodutra commented 8 years ago

You could create a defer alias that would do this, then:

Indeed, I thought about that as well, but I found it didn't help very much on the readability side.

template<typename x>
using cart_prod = apply<
  lambda<join>,
  transform<
    defer<transform, defer<bind, quote<lambda<list>>, defer<quote, _1>, quote<_1>>, quote<x>>, 
    x
  >
>;

Perhaps I should provide it for easy integration of user defined metafunctions, but still provide lambda expression friendly aliases on top of defer for every construct in Metal. It's literally just a couple of lines per metafunction anyways.

template<typename lbd, typename... seqs>
using Transform = defer<transform, lbd, seqs...>;

Back to the topic, call me obsessive compulsive about symmetry, but I'm now totally convinced a hybrid implementation is both wonky and entirely unnecessary. Just my 2 cents.

ericniebler commented 8 years ago

Perhaps I should provide it for easy integration of user defined metafunctions, but still provide lambda expression friendly aliases on top of defer for every construct in Metal.

This is what meta does.

I compared extracting the metafunction from the type signature against relying on a nested template and I seem to have observed some performance penalty for the latter both on GCC and on Clang.

Mere mentions of an instantiation (via an alias) are less expensive than mentions of a nested identifier, since that forces the instantiation of the enclosing type. That is, X is cheaper than X::Y.
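The distinction between naming a specialization and instantiating it can be demonstrated with a template that cannot be instantiated for int. This is only a sketch with made-up names (explosive, identity); it shows that the instantiation is skipped, not how much time that saves.

```cpp
#include <type_traits>

namespace sketch {
    // Instantiating explosive<int> is an error, because int has no nested member.
    template<typename T>
    struct explosive {
        using type = typename T::no_such_member;
    };

    template<typename T>
    struct identity { using type = T; };

    // explosive<int> is only named here, never instantiated, so this compiles.
    using named_only = identity<explosive<int>>::type;

    // By contrast, the following would force the instantiation and fail to compile:
    // using forced = explosive<int>::type;
}
```

The same reasoning applies to X versus X::Y: reaching into X for a nested name requires X to be a complete, instantiated type.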

brunocodutra commented 8 years ago

Perhaps I should provide it for easy integration of user defined metafunctions, but still provide lambda expression friendly aliases on top of defer for every construct in Metal.

This is what meta does.

Cool! I guess we agree on the API in general, albeit not so much on the implementation.

It seems to me that if we were to regard Meta's use of nested invoke<> and type strictly as hidden implementation details, then Meta and [eager] Metal should look very similar from the user's perspective. On the other hand I posit that if you try to document these as part of Meta's customization points, you'll find it rather challenging to summarize the API in a concise and symmetrical manner.

@porkybrain Instead of implementing lambda expressions directly, why don't you simply provide lambda expression friendly aliases just like Metal?

namespace lazy {
    template<typename... Ls>
    using append = brigand::bind<brigand::quote<brigand::append>, Ls...>;

    template<typename L, typename F>
    using transform = brigand::bind<brigand::quote<brigand::transform>, L, F>;

    /*...*/
}

By doing this you can write lambda expressions out of the box using the existing mechanism for invoking callables. Notice you have lazy semantics without a single lazy metafunction.

brunocodutra commented 8 years ago

Mere mentions of an instantiation (via an alias) are less expensive than mentions of a nested identifier, since that forces the instantiation of the enclosing type. That is, X is cheaper than X::Y.

What I meant to say is that

template<typename N>
using not_ = bool_<!N::value>;

seems to be much slower than

template<typename N>
struct not_impl {};

template<typename T, T v>
struct not_impl<std::integral_constant<T, v>> : bool_<!v> {};

template<typename N>
using not_ = typename not_impl<N>::type;

despite the fact both internally mention a nested entity.

ericniebler commented 8 years ago

That's surprising if it's true. I wonder what's going on there.

I'll note that the latter implementation is broken by code like the following:

struct MyFalse {
  using type = MyFalse;
  static constexpr bool value = false;
};

using MyTrue = not_<MyFalse>; // ERROR

That is too fragile for my taste.

brunocodutra commented 8 years ago

That's surprising if it's true. I wonder what's going on there.

I think the difference is that by accessing the nested ::value of an std::integral_constant you force the instantiation of a type which defines lots of non-trivial nested entities, ~~whereas by matching the type signature you only force the instantiation of not_impl which defines a single nested ::type and nothing else~~. That's wrong actually, since not_impl inherits from std::integral_constant. I'm clueless then.

I'll note that the latter implementation is broken by code like the following:

I should have mentioned that a Number in Metal must be an instance of std::integral_constant, i.e. MyFalse in your example above is not a Number in Metal parlance. As a matter of fact, every concept in Metal requires a very specific type signature, which allows concept checking by partial template specialization and makes it rather trivial to provide traits such as is_number, is_map, etc.
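A trait of that kind is a two-line partial specialization. The sketch below is an illustration of the idea, not Metal's actual is_number; MyFalse is the type from the example above.

```cpp
#include <type_traits>

namespace sketch {
    // Nothing is a Number by default...
    template<typename T>
    struct is_number : std::false_type {};

    // ...except types whose signature is exactly std::integral_constant<T, v>.
    template<typename T, T v>
    struct is_number<std::integral_constant<T, v>> : std::true_type {};

    // Looks like a boolean constant, but does not match the signature.
    struct MyFalse {
        using type = MyFalse;
        static constexpr bool value = false;
    };
}
```

Under this definition std::false_type is a Number while MyFalse is not, which is exactly the trade-off being discussed: strict checking in exchange for rejecting look-alike types.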

ericniebler commented 8 years ago

As a matter of fact, every concept in Metal requires a very specific type signature, which allows concept checking by partial template specialization and makes it rather trivial to provide traits such as is_number, is_map, etc.

Forgive the rather impolite question, but ... if you make these design compromises to make Metal fast, then why is it so slow? http://ldionne.com/metabench/

brunocodutra commented 8 years ago

As a matter of fact, every concept in Metal requires a very specific type signature, which allows concept checking by partial template specialization and makes it rather trivial to provide traits such as is_number, is_map, etc.

Forgive the rather impolite question, but ... if you make these design compromises to make Metal fast, then why is it so slow? http://ldionne.com/metabench/

Don't worry, that's not an impolite question, since I don't strive to beat the competition in terms of performance. I don't make those design choices to make Metal fast, it's the other way around, I make them in order to allow strict concept checking. It just so happens that in this particular case this seems to be the fastest alternative, at least in my experience.

Now to answer your question I'll have to dive a bit into the internals of Metal. Basically, every algorithm in Metal that can’t be expressed in terms of transform is expressed in terms of fold. Now the usual implementation of fold implies a linear growth of the template instantiation depth, which is limited to 256 on Clang and to 900 on GCC by default and thus requires the user to provide special flags in order to go beyond that limit. I didn't want that. Another issue with that approach is that memory consumption also grows very rapidly. I know that could be worked around by using fast-tracking, but that’s something I refuse to do in the name of code clarity and maintainability.

Instead, I chose to implement fold in a way that implies logarithmic growth of the template instantiation depth. Basically it keeps track of two indices beg and end and recursively folds each half of the list. Another advantage of this implementation is that you also have reverse_fold by choosing beg to be greater than end and partial_fold by choosing the indices to be a sub-range of the list. Now the problem with this approach is that it necessarily relies on at to actually access each and every element, since it only keeps track of indices. That is what hurts the performance so badly, especially because core algorithms such as join must rely on fold.
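A divide-and-conquer fold along those lines might be sketched as follows. This is a reconstruction from the description, not Metal's code: all names are hypothetical, only the forward direction is shown, and at is implemented with std::tuple_element purely for brevity (its own instantiation depth depends on the standard library).

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

namespace sketch {
    template<typename... Ts> struct list {};

    // at<L, I>: random access by index (std::tuple_element used for brevity).
    template<typename L, std::size_t I> struct at;
    template<typename... Ts, std::size_t I>
    struct at<list<Ts...>, I> : std::tuple_element<I, std::tuple<Ts...>> {};

    // Folds elements [Beg, End) of L into State with the binary alias F.
    // Kind is 0 for an empty range, 1 for a single element, 2 otherwise.
    template<typename L, typename State, template<typename...> class F,
             std::size_t Beg, std::size_t End,
             std::size_t Kind = (End - Beg < 2 ? End - Beg : 2)>
    struct fold_impl;

    template<typename L, typename State, template<typename...> class F,
             std::size_t Beg, std::size_t End>
    struct fold_impl<L, State, F, Beg, End, 0> { using type = State; };

    template<typename L, typename State, template<typename...> class F,
             std::size_t Beg, std::size_t End>
    struct fold_impl<L, State, F, Beg, End, 1> {
        using type = F<State, typename at<L, Beg>::type>;
    };

    // Split the range: fold the left half, then the right half starting from
    // the left result. Each recursive step roughly halves the range.
    template<typename L, typename State, template<typename...> class F,
             std::size_t Beg, std::size_t End>
    struct fold_impl<L, State, F, Beg, End, 2> {
        static constexpr std::size_t mid = Beg + (End - Beg) / 2;
        using left = typename fold_impl<L, State, F, Beg, mid>::type;
        using type = typename fold_impl<L, left, F, mid, End>::type;
    };

    template<typename L, typename State, template<typename...> class F>
    struct fold;
    template<typename... Ts, typename State, template<typename...> class F>
    struct fold<list<Ts...>, State, F>
        : fold_impl<list<Ts...>, State, F, 0, sizeof...(Ts)> {};

    // Example step: append an element to a list-typed state.
    template<typename S, typename T> struct push_back_impl;
    template<typename... Ts, typename T>
    struct push_back_impl<list<Ts...>, T> { using type = list<Ts..., T>; };
    template<typename S, typename T>
    using push_back = typename push_back_impl<S, T>::type;
}
```

Folding list<int, char, float, double, long> with push_back into an empty list reproduces the input in order, which shows that the state is still threaded left to right even though the recursion is tree shaped. Every element is fetched through at, which is the cost mentioned above.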

At least it turns out this approach is much less memory hungry. Perhaps Metal will look better on Metabench once we manage to provide memory benchmarks.

EDIT: BTW, [eager] Metal is slightly faster than it looks like on Metabench right now, since I haven't merged the relevant PR yet, however it will never be able to beat Meta and Brigand unless this proposal by @ldionne makes it into Clang eventually. Perhaps I should look into that as well.

odinthenerd commented 8 years ago

@brunocodutra if everything relies on fold I don't see the harm in fast tracking it, surely fast tracking one algorithm will not hurt maintainability so much that it outweighs the speed advantage.

edouarda commented 8 years ago

Our users want speed and fast tracking delivers that.

brunocodutra commented 8 years ago

There's no arguing fast-tracking delivers speed and of course it seems reasonable to back core algorithms by it, however so far I've been reluctant to do so for many reasons:

I really like the logarithmic recursion. It is elegant, generic and unique among present day TMP libraries.
It eats less memory.
So far I've been focusing on designing the perfect API and I don't shy away from a complete rewrite if I think I can make things the slightest bit more elegant, so I don’t want implementation details to get in my way.
I don’t want to pay for what I don’t need and so far Metal is fast enough for its use cases.
Brigand and Meta already do a good job on delivering speed, if Metal goes fast-tracking it will lose its identity and become just another face in the crowd.

My design choice regarding fold is the most emblematic regarding the elegance x speed dilemma, but of course there are others. Notably, I always prefer to reuse algorithms to implement new ones so that out of the dozens Metal provides the vast majority are two-liners that build on the few (< 10) core algorithms. If fold goes fast-tracking, the next logical question will be why implement push_back and push_front in terms of join if they can be implemented directly just as easily (albeit more verbosely) and overnight Metal triples in size. It’s a Pandora’s box and that’s what I mean when I refer to maintainability.

Please don't get me wrong, I understand Brigand has its use cases and sheer speed might be required, I just don't agree this is the only logical approach to the TMP problem.

odinthenerd commented 8 years ago

I have been testing my lambda idea against meta with clang and 100 unique instantiations of this tree: using t0 = meta::list<meta::list<i<10>, i<20>>, meta::list<i<30>, i<40>>>;

first test just adding a pointer to every sub list, does not use meta::lambda, lightning fast:

meta 50ms

template<typename T>
using tr = transform < T, quote<std::add_pointer_t> >;

new brigand 50ms

template<typename T>
using tr = transform<T, std::add_pointer<_1>>;

I actually can't claim I understand the reason why these are equivalent, seeing as new brigand has more instantiations, on the other hand it does not use any nested template anything (meta uses a nested invoke alias). My gut would tell me meta should be faster here but even if I use other functions besides add_pointer and increase the size or amount of trees by 10x the two seem identical or at least within the noise of each other.

now how about adding pointers to the i's nested in the sub lists (cascading):

meta using lambda 590ms (can I write this in a faster way?)

namespace l = meta::lazy;
template<typename T>
using tr = transform<T, lambda<_a, l::transform< _a, quote<std::add_pointer_t>>>>;

my new brigand lambda equivalent 150ms (nested lambdas that can use parameters of their parents are wrapped with _l, in this case we are not actually using the parent args)

template<typename T>
using tr = transform<T, lazy::transform<_1, _l<std::add_pointer<_1>>>>;

to test cascading lambdas I wrote my own do nothing metafunction:

template<typename T, typename U>
struct myLazyF {
    using type = i<(meta::size<T>::value + U::value)>;
};
template<typename T, typename U>
using myF = typename myLazyF<T, U>::type;

meta lambda actually using parents args 1710ms

template<typename T>
using tr = transform<T, lambda<_a, l::transform< _a, lambda<_b, defer<myF, _a, _b>>>>>;

new brigand lambda equivalent 300ms (parent args must be wrapped with parent<>, parent<parent<parent<...>>> is possible for the freaks among us)

template<typename T>
using tr = transform<T, lazy::transform<_1, brigand::_l<myLazyF<parent<_1>, _1>>>>;

In conclusion I think a unified syntax might be possible where brigand is the same speed in cases where meta uses "invokables" for speed but also has the flexibility of meta lambdas (with a considerable performance edge).

edouarda commented 8 years ago

That sounds pretty awesome!

jfalcou commented 8 years ago

Do we have a migration plan then ?

odinthenerd commented 8 years ago

I found a case for meta, if I write my own struct which contains an invoke and that invoke does not need to call a metafunction as in:

struct add_pointer_f {
    template<typename T>
    using invoke = T*;
};

meta is faster. However I would argue that that is probably a corner case as it is only possible when wrapping a type or adding type qualifiers to it. In all other cases you need specialization of some kind or another which means an alias alone will not work.

pattern matching F<_1> can actually be done at algorithm level, matching for something that has an "invoke" is not possible. Therefore we could optimize for the "common case" of F<_1> to try and make the gap smaller.
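Such an algorithm-level fast track could be sketched as below. The names are hypothetical and the general path is a deliberately small stand-in for the real lambda walker (it substitutes _1 one level deep only); the fast_tracked member exists just to make the chosen path observable.

```cpp
#include <cstddef>
#include <type_traits>

namespace sketch {
    template<typename... Ts> struct list {};
    template<std::size_t N> struct selector {};
    using _1 = selector<0>;

    // Stand-in for the general lambda walker: replaces _1, keeps anything else.
    template<typename A, typename T> struct subst { using type = A; };
    template<typename T> struct subst<_1, T> { using type = T; };

    template<typename L, typename Lambda>
    struct transform_impl;

    // General path: substitute into every parameter of F for each element.
    template<typename... Ts, template<typename...> class F, typename... As>
    struct transform_impl<list<Ts...>, F<As...>> {
        static constexpr bool fast_tracked = false;
        template<typename T>
        struct apply_one : F<typename subst<As, T>::type...> {};
        using type = list<typename apply_one<Ts>::type...>;
    };

    // Fast track: the lambda is exactly F<_1>, so no walking is needed.
    template<typename... Ts, template<typename...> class F>
    struct transform_impl<list<Ts...>, F<_1>> {
        static constexpr bool fast_tracked = true;
        using type = list<typename F<Ts>::type...>;
    };

    template<typename L, typename Lambda>
    using transform = typename transform_impl<L, Lambda>::type;
}

using d = sketch::transform<sketch::list<int, char>, std::add_pointer<sketch::_1>>;
```

Partial ordering picks the F<_1> specialization whenever it applies, so std::add_pointer<_1> takes the short path while something like std::is_same<_1, float> falls back to the general one. Nothing comparable can be written for "any type with a nested invoke", because a nested member cannot be matched by a partial specialization pattern.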

I still want to get this right before we do any migration, I think there may be a way to wrap meta invokables in a way that we can still profit from their speed. Plus my implementation probably has some corner cases where it has bugs right now.

I would also like to get variadic meta lambdas working (as in _args rather than _1), however I'm not sure if I really know how to get the public interface to not suck.

edouarda / brigand

Rework lambda mechanism to improve speed and usability #177