ebetica / autogradpp

Direct C++ Interface to PyTorch
MIT License
80 stars 12 forks source link

Forward takes Variant class #70

Open ebetica opened 6 years ago

ebetica commented 6 years ago

One of the bigger refactors, we introduce a variant class to wrap inputs and outputs in, so you at least get run time type checking of your arguments, much like in Python. One of the things we do here is that the Variant will automatically specialize into a Variable is the function calls for it. I took the stance that if it appears in Tensor/variable.h but not Functions.h, then we should allow the function to be called. I haven't added all the function in yet.

Additionally, it introduces method changing via template magic.

Things that will work:

variant
  .getList()[0]
  .m(at::relu)
  .m(at::mse_loss, other);

Please comment on whether this is better, worse, easier, or harder to use. In general, I noticed that we had to write a little more boilerplate but it makes more intuitive sense to me.

ebetica commented 6 years ago

Here's the main API for the lazy: https://github.com/ebetica/autogradpp/pull/70/files#diff-ffe21efa98aef140628a9045de88f753R47

ebetica commented 6 years ago

Sounds like you guys are happy with this? I should go an implement all the other missing Tensor functions that are not also in Functions.h, and write some tests on what the usage is.

goldsborough commented 6 years ago

Cool stuff! I have some concerns though. It seems that with this change you are moving all your type safety that C++ provides at compile time, to runtime. This makes for simpler code, like Python, but it also makes the code much less safe, no? Compile time type safety being one of the main advantages provided by C++. It also seems to make the code less expressive. For example this signature: variable_list foo(Tensor x, float f) now becomes Variant foo(Variant, Variant). Don't you feel we lost something here? Apart from this, if you mix up the two arguments, you wouldn't know until runtime. What if it only fails after 3 days of training? Isn't that exactly what we're trying to avoid by using C++ instead of Python?

I'm concerned because I am doing my best to make Pytorch's C++ API match your amazing existing work in autogradpp (on the functional level), so that hopefully we can soon move you guys to the official API soon (and thus provide you with much more support, and reduce the amount of effort you have to put in working around current issues in the autograd engine!). At the same time, I'm trying to design our C++ API in a way that will appeal to the wider C++ audience. I'm a bit worried about this change because I don't think our C++ users will be comfortable with an API that removes compile time type safety.

I'm sorry I didn't comment on this earlier (I'm on PTO). Can you provide me with some more context on how this change makes your code easier (just to understand)?

jgehring commented 6 years ago

Thanks for weighing in here, @goldsborough. Here's my take on this, @ebetica can maybe elaborate a bit more.

Compile time type safety being one of the main advantages provided by C++

For us, the main argument for using C++ here is performance and the ability to efficiently use and manipulate low-level data in addition to doing tensor and NN logic.

For example this signature: variable_list foo(Tensor x, float f) now becomes Variant foo(Variant, Variant).

As far as I understand, the variant is mostly useful for input and output arguments to forward(). We want to have it virtual, and it should not impose too many constraints on what data we can and can't pass between containers.

I'm a bit worried about this change because I don't think our C++ users will be comfortable with an API that removes compile time type safety.

Losing compile-time type-checking arguably starts with at::Tensor, no? IMHO the limited amount of compile-time checking for this is a feature rather than a hindrance :)

Can you provide me with some more context on how this change makes your code easier (just to understand)?

For me, this will be useful because I can check the inputs and outputs of user-defined models in a more robust and more expressive way (at run-time, of course). I couldn't check it at compile-time anyway (I just have ag::variable_list right now...). I admit I don't have a concrete use-case for the string and unordered_map types.

ebetica commented 6 years ago

It's always a battle between how much compile time safety we have and how much flexibility we allow our users. I wish C++ could somehow statically check the types of your model outputs, but afaik, we pass around Containers, and you can't really do virtual dispatch on functions of different return types.

The other option I think you reminded me of is that we could force our users to subclass and define the types of outputs they expect. Then, we cannot have a generic Container class with a forward, but we would get compile time safety in return. It may not actually be that much more implementation now that I think about it. All that happens is that people will have to explicitly declare the types of the containers they wish to use, and there will be no virtual dispatch, you cannot specify a general forward operator, etc.

ebetica commented 6 years ago

The unordered_map is optional dictionary of returns, like

{"policy_output": Tensor.sizes() = [8], "value_output": Tensor.sizes() = [1], "policy_embeddings": Tensor.sizes() = [8, 64], ...}

ebetica commented 6 years ago

@goldsborough @jgehring I think the crux of the issue is that our Containers are

variable_list forward(variable_list) = 0;

So we already have no compile time guarantees. This PR makes it easier to do things at run time. But, what if we just removed virtual forwards, and force users to define their own input-output contracts. Then we get compile time safety. We could even do something where container is templated on the output type:

template <typename O, typname Args...>
Container {
  O forward(Args&& params...) = 0;
  ...
}
ebetica commented 6 years ago

@syhw notes that we can have both, variants for the daring and templates for the strong of will.

jgehring commented 6 years ago

So we already have no compile time guarantees. This PR makes it easier to do things at run time.

Yes, that was my point more or less. I don't see further options for compile-time type safety other then descending to template hell (there's a reason we love using ATen), or adding some mechanism for stitching models together in a type-safe way (back to graph-based networks?)...

ezyang commented 6 years ago

@ebetica and I had a discussion and we think we can have the best of both worlds. Essentially, we will take this PR, but add a little bit of extra convention to it:

// Counter-counter proposal
class Module {
  virtual Variant forward(Variant);
}
class MyContainer : Module {
  virtual Variable forward(int, Variable) {
    ...your actual, well typed code...
  }
  Variant forward(Variant v) override {
    return Variant(forward(v.getInt(0), v.get(1)));
  }
}

So if you know you have a MyContainer or subclass thereof, you can call the well-typed forward; otherwise, you can call the dynamically typed interface to implement a polymorphic trainer or something similar.

ebetica commented 6 years ago

@jgehring What are best practices regarding move and copy constructors? I defined default (good for maps and stuff), copy, move, and move assignment.

jgehring commented 6 years ago

Looks good; for the sake of completeness you might want to declare a default assignment operator although that's done by the compiler automatically anyway.

goldsborough commented 6 years ago

I don't really understand the argument that variable_list forward(variable_list) is not typesafe. It allows exclusively a list of Variables, which is 1 type. Here we're proposing a very slightly less bad thing than void*. Like, you can't pass every type in the world like with void*, but 9 different types and even a whole recursive hierarchy of these 9 different types.

@ezyang's idea is nice because it really allows you to define your forward with the concrete types (at least the forward in which you implement your code), but I much dislike putting more boilerplate on users (in having to destructure the variant in the function that calls the typed forward).

Thought: it seems to me the original goal you had was to have a variable number of arguments, each of which can either be a variable or a scalar, right? I totally understand that this is useful. So what if we limit the Variant to be variant<Variable, Scalar> and wrap it in a way that it has variant.scalar(<index>) and variant.tensor(<index>) accessors?

jgehring commented 6 years ago

but I much dislike putting more boilerplate on users (in having to destructure the variant in the function that calls the typed forward)

I'm pretty sure we could do some template-fu to provide a Variant -> Variant version automatically. Thanks to CRTP we know the exact types of a user-defined forward with explicit types at compile-time.

ebetica commented 6 years ago

Yeah this is totally possible I think.

template <typename T>
class Container_CRTP {
  Variant unchecked(Variant x) override {
    static_cast<T*>(this)->checked(magic_unpack(x));
  }
}

Unfortunately, the template-fu in my quick Google search seems very mysterious and perhaps difficult to write. Maybe @goldsborough wants to have a shot at it? :D I'm also not sure what the best naming scheme is. The "problem" is that the checked version cannot be defined in ContainerImpl. Only the unchecked version can be virtually defined. I think some variant of [checked, operator()] or [operator(), unchecked] seems reasonable.

goldsborough commented 6 years ago
template<typename T>
struct function_traits;

template<typename C, typename R, typename... Args>
struct function_traits<R(C::*)(Args...)> {
  static constexpr size_t number_of_arguments = sizeof...(Args);

  using return_type = R;
  using argument_tuple = std::tuple<Args...>;

  template<size_t n>
  using argument_type = typename std::tuple_element<n, argument_tuple>::type;
};

struct Variant {
  template<typename T>
  Variant(T) {}

  template<typename T>
  T get(size_t index) { return T(); }
};

#define TORCH_FORWARD(qualified_forward_method) \
  template<size_t... Is> \
  Variant variant_forward(Variant v, std::index_sequence<Is...>) { \
    return forward(v.get<function_traits<decltype(&qualified_forward_method)>::argument_type<Is>>(Is)...); \
  } \
  Variant variant_forward(Variant v) override { \
    return variant_forward(v, std::make_index_sequence<function_traits<decltype(&qualified_forward_method)>::number_of_arguments>{}); \
  }

struct Module {
  virtual ~Module() = default;
  virtual Variant variant_forward(Variant v) = 0;
};

struct SubModule : public Module {
  int forward(double x, std::string s) {
    return x;
  }

  TORCH_FORWARD(SubModule::forward)
};

blows revolver