TESSEorg / ttg

TTG: Template Task Graph C++ API
18 stars 12 forks source link

RFC: more verbose, more readable TT construction #264

Open devreal opened 1 year ago

devreal commented 1 year ago

Based on a discussion between @bosilca, @therault and myself, I started to extend the API to improve readability. This gets us a bit closer to the style of TaskFlow. Example:

The old way:

    ttg::Edge<int, float> a2b, b2a;
    /* construct without edges */
    auto tta = ttg::make_tt([](int key, float value){}, ttg::edges(a2b), ttg::edges(b2a), 
                            "A", {"b2a"}, {"a2b"});

    auto ttb = ttg::make_tt([](int key, float value){}, ttg::edges(b2a), ttg::edges(a2b), 
                            "B", {"a2b"}, {"b2a"});

The new way:

    ttg::Edge<int, float> a2b, b2a;
    /* construct without edges */
    auto tta = ttg::make_tt([](int key, float value){ ... });
    tta->set_input<0>(b2a, "b2a")
       .set_output<0>(a2b, "a2b")
       .set_name("A");

    auto ttb = ttg::make_tt([](int key, float value){ ... });
    ttb->set_input<0>(a2b, "a2b")
       .set_output<0>(b2a, "b2a")
       .set_name("B");

This patch preserves the old way and adds the partial construction. The named member functions provide context for the arguments and help structure the code, at the cost of being more verbose.

Notes:

1) We cannot support generic functors (i.e., all arguments have to be explicitly typed) 2) We select the first functor argument as the key. If there are no functor arguments or key is void it has to be specified explicitly as first template argument to make_tt. 3) Input edges are still type checked based on the argument types. Output edges can be type-checked if the functor takes the output terminal tuple (not implemented yet). Otherwise type checking is done in debug builds via dynamic casts. 3) Not yet implemented for MADNESS. Just wanted to put out a PoC for discussion.

bosilca commented 1 year ago

Aren't you missing the ttb in the old way ?

devreal commented 1 year ago

Ahh right, added.

devreal commented 1 year ago

Alternative: make the edge the center piece of the connection instead of the TT:

    ttg::Edge<int, float> a2b, b2a;
    /* construct without edges */
    auto tta = ttg::make_tt([](int key, float value){ ... });
    auto ttb = ttg::make_tt([](int key, float value){ ... });

    a2b.connect(tta.out<0>(“a2b"), ttb.in<0>(“a2b"));
    b2a.connect(ttb.out<0>(“b2a"), tta.in<0>(“b2a"));

Since edges can have names too we would default to the edge name for the terminals if no explicit name is given.

This really is syntactic sugar based on the initial proposal. It makes it clearer which inputs and outputs are connected.

devreal commented 1 year ago

Inspired by https://github.com/mangpo/floem: do we really need edges? We use them as vehicle to glue to terminals together. A possible alternative could use the stream operators on the terminals directly:

    auto tta = ttg::make_tt([](int key, float value){ ... });
    auto ttb = ttg::make_tt([](int key, float value){ ... });
    tta->in<0>() << ttb->out<0>();
    ttb->in<0>() << tta->out<0>();

When connecting only two terminals the order can be reversed: ttb->in<0>() << tta->out<0>() is semantically equivalent to tta->out<0>() >> ttb->in<0>().

Connecting multiple TT inputs to the first output of tta could be expressed as:

    tta->out<0>() >> ttb->in<0>() >> ttc->in<0>() >> ttd->in<0>();

The stream operator does suggest some form of ordering between the TT inputs, which is not intended here. That may be somewhat misleading?

devreal commented 1 year ago

Notes from the August 31, 2023 meeting

The current state with strong typing of inputs and outputs for improved readability:

    ttg::Edge<int, float> a2b, b2a;
    /* construct without edges */
    auto tta = ttg::make_tt([](int key, float value){}, ttg::input(a2b), ttg::output(b2a),
                            "A", ttg::input_names{"b2a"}, ttg::output_names{"a2b"});

    auto ttb = ttg::make_tt([](int key, float value){}, ttg::input(b2a), ttg::output(a2b),
                            "B", ttg::input_names{"a2b"}, ttg::output_names{"b2a"});

Still somewhat confusing and even more cluttered (IMO).

As @therault suggested, having a way to specify the signature of the task upon creation and providing implementations later would be useful for multi-implementation tasks:

    auto tta = ttg::make_multi_tt(ttg::key<int>{}, ttg::args<float>{}); /* names are placeholders */
    tta->add_impl<ttg::ExecutionSpace::CPU>([](int, float){ /* host implementation here */ });
    tta->add_impl<ttg::ExecutionSpace::CUDA>([](int, float){ /* CUDA implementation here */ });
    tta->add_impl<ttg::ExecutionSpace::HIP>([](int, float){ /* HIP implementation here */ });

This would allow us to use generic arguments again. The question on how to express the connection between TTs is orthogonal.

@evaleev expressed concern that this adds yet another way of doing the same thing (on top of the original make_tt and the TT inheritance interface), plus the existing interfaces on terminals and edges. Maybe some of these can be deprecated or at least discouraged.

therault commented 1 year ago

Alternative to chaining the >> operator:

    tta->out<0>() >> std::make_tuple( ttb->in<0>(), ttc->in<0>(), ttd->in<0>() );
therault commented 1 year ago

Another proposal at the end of the day:

auto tta = ttg::make_tt([=](const key &k, float &&value) {}, "A");
tta.in<0>("value");
tta.out<1>("toB");
tta.out<2>("result");

auto result_tt = make_tt([](const key &k, const float &in) {}, "result_tt");
result_tt.in<0>("in");

/* missing: TT for ttb */

ttg::connect( tta.out("result"), result_tt.in("in") );
devreal commented 1 year ago
tta->out<0>() >> std::make_tuple( ttb->in<0>(), ttc->in<0>(), ttd->in<0>() );

could be shortened to:

tta->out<0>() >> ttb->in<0>(), ttc->in<0>(), ttd->in<0>();

by overloading operator,().

devreal commented 11 months ago

At the last meeting we discussed using ttg::connect as a freestanding function to connect an output terminal to an input terminal. I had started coding something like this:

    auto tta = ttg::make_tt([](int key, float value){});
    auto ttb = ttg::make_tt([](int key, float value){});
    ttg::connect(ttb->out<0>(), tta->in<0>()); // B -> A
    ttg::connect(tta->out<0>(), ttb->in<0>()); // A -> B

The problem is that we know the type of input terminals (from the argument list) but not the type of output terminals (because we have no edges to infer from). I found that there is already a version of ttg::connect<outindex, inindex>(producer, successor). We that we can i) cut out the ugly member calls, and ii) make sure terminals are properly typed:

    auto tta = ttg::make_tt([](int key, float value){});
    auto ttb = ttg::make_tt([](int key, float value){});
    ttg::connect<0, 0>(ttb, tta); // B -> A
    ttg::connect<0, 0>(tta, ttb); // A -> B

We could default the indices to 0 if we wanted. Probably should add the ability to pass names...

Also missing: output terminal fusion. Maybe we can pass a tuple of TTs and array of indices from which to fuse the outputs?

    auto tta = ttg::make_tt([](int key, float value){});
    auto ttb = ttg::make_tt([](int key, float value){});
    auto ttc = ttg::make_tt([](int key, float value){});
    ttg::connect<{0, 0} 0>({ttb, ttc}, tta); // {B|C} -> A
    ttg::connect<0, 0>(tta, ttb); // A -> B

(C++-20 allows arrays as template arguments without specifying their size but that only seems to work with GCC so far)