C10 Tracking PR

This is a synthetic tracking PR: it lets you view the C10 diff as a GitHub PR and comment on it without committing anything to the repository.

IF YOU WANT TO EDIT, GO AHEAD AND EDIT DIRECTLY VIA THE GITHUB UI. BE BOLD.

General structure as of 4/12/18.

Public include header is #include <c10/c10.h>
AlignOf, SmallVector, ArrayRef, Optional are direct copies from their versions in ATen, you do not need to review them. Assert is a stub file and will be replaced with a more fully-featured version later.
Tensor is the public-facing Tensor API. The public API is intended to track ATen as closely as possible.
guts/ contains internal, non-backend-specific implementation details that are not part of the user-visible API (but might be part of the backend extension API). Retainable is adapted from ATen's copy, with some more substantive changes to make it more idiomatic C++. TensorImpl is the data layout for tensor metadata.
op/ contains generic op implementations, which are backend invariant.
cpu/ contains the CPU backend. CPUTensorImpl is the CPU implementation of TensorImpl. The helper classes are CPUAllocator (the globally swappable allocator), CPUStorage (the PRIVATE storage concept that allows multiple tensors to view the same storage), and CPUContext (the global context for the CPU backend). The op/ directory here contains CPU-specific op implementations.
The following files are heavily under construction: TypeId (wholly insufficient for real dynamic dispatch and will probably be rewritten).
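
As a rough illustration of how the cpu/ pieces described above might relate, here is a minimal sketch of a swappable allocator, a storage shared by several tensors, and a tensor impl that views it. Every name in it (SketchAllocator, SketchStorage, SketchCPUTensorImpl, narrow1d) is hypothetical, the element type is fixed to float for brevity, and std::shared_ptr stands in for Retainable-style reference counting. It is not the code in this PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>
#include <vector>

// Hypothetical names throughout; this is a sketch, not the C10 code.

// A swappable allocator modeled as a pair of plain function pointers.
struct SketchAllocator {
  void* (*allocate)(std::size_t nbytes);
  void (*deallocate)(void* ptr);
};

inline void* defaultAllocate(std::size_t nbytes) { return std::malloc(nbytes); }
inline void defaultDeallocate(void* ptr) { std::free(ptr); }

// The global allocator; assigning to the returned reference swaps it.
inline SketchAllocator& globalAllocator() {
  static SketchAllocator alloc{&defaultAllocate, &defaultDeallocate};
  return alloc;
}

// Owns one allocation. It copies the allocator it was created with, so
// swapping the global allocator later cannot mismatch allocate/deallocate.
class SketchStorage {
 public:
  explicit SketchStorage(std::size_t nbytes)
      : alloc_(globalAllocator()),
        data_(alloc_.allocate(nbytes)),
        nbytes_(nbytes) {}
  ~SketchStorage() { alloc_.deallocate(data_); }
  SketchStorage(const SketchStorage&) = delete;
  SketchStorage& operator=(const SketchStorage&) = delete;

  void* data() const { return data_; }
  std::size_t nbytes() const { return nbytes_; }

 private:
  SketchAllocator alloc_;
  void* data_;
  std::size_t nbytes_;
};

// Tensor metadata plus a shared handle to the storage it views.
struct SketchCPUTensorImpl {
  std::shared_ptr<SketchStorage> storage;
  std::vector<std::int64_t> sizes;
  std::int64_t storage_offset;  // measured in elements, not bytes

  float* data() const {
    return static_cast<float*>(storage->data()) + storage_offset;
  }
};

// Returns a 1-D view of `length` elements starting at `start`.
// The view shares storage with `t`; no data is copied.
inline SketchCPUTensorImpl narrow1d(const SketchCPUTensorImpl& t,
                                    std::int64_t start,
                                    std::int64_t length) {
  return SketchCPUTensorImpl{t.storage, {length}, t.storage_offset + start};
}
```

With this shape, a write through one tensor is visible through any other tensor that views the same storage, and the storage is freed when the last view goes away.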

Known issues:

[ ] There is no actual dynamic dispatch; everything is hardcoded to dispatch to CPU. All sites where this occurs are marked accordingly.
[ ] cpu is not separately compilable; it will be at some later point
[ ] Destructors use std::function, which is bad. Make them stop using it.
[ ] Assert.h needs to be fleshed out into a real thing, bringing in Caffe2 affordances
[ ] Abstraction barrier between "ops" and "TensorImpl methods" is not fully fleshed out
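
For the std::function item above, one possible direction is a plain function-pointer deleter, which avoids the type-erased callable that std::function carries. The sketch below uses std::unique_ptr with a function-pointer deleter. The names SketchDeleter, SketchDataPtr, countingFree and allocateBytes are hypothetical, and the call counter exists only to make the behavior observable. This is not necessarily the fix the PR will adopt.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical names; one possible alternative to std::function deleters.

// A deleter is just a function pointer: no captured state, no type erasure.
using SketchDeleter = void (*)(void*);

// Owning pointer to raw bytes that calls the deleter exactly once on destruction.
using SketchDataPtr = std::unique_ptr<void, SketchDeleter>;

// Counter used only so the example can show that the deleter ran.
inline int& freeCallCount() {
  static int count = 0;
  return count;
}

inline void countingFree(void* ptr) {
  ++freeCallCount();
  std::free(ptr);
}

// Allocates nbytes with malloc and pairs the result with its matching deleter.
inline SketchDataPtr allocateBytes(std::size_t nbytes) {
  return SketchDataPtr(std::malloc(nbytes), &countingFree);
}
```

A deleter that needs per-allocation state would have to carry an extra context pointer next to the data pointer; that trade-off is not shown here.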

Issues for discussion:

[ ] realloc from CPU allocator? (Torch has it, Caffe2 does not, currently we don't have it)
[ ] FLAGS_caffe2_keep_on_shrink and FLAGS_caffe2_max_keep_on_shrink_memory (Caffe2-only, we have implemented this)
[ ] Reopening the problem of keeping an appropriate Context in every Tensor (make it possible to have two copies of c10 in process)
[ ] Does resize keep or destroy old data? (Torch keeps, Caffe2 destroys; currently we have a boolean flag to toggle the behaviors)
[ ] Caffe2 lazily constructs Tensor data on first mutable_data call (maybe the caffe2::Tensor wrapper can simulate this behavior)
[ ] Should number of elements be cached? (Torch doesn't, Caffe2 does)
[ ] Should undefined tensor be a thing? (ATen has it, Caffe2 doesn't???)
[ ] Should we continue only allowing 'double' in arguments to dispatchable functions?
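
To make the resize question above concrete, here is a minimal sketch of a boolean flag that toggles between the two behaviors. SketchBuffer and its members are hypothetical names. "Keep" is interpreted here as preserving the overlapping prefix of the old contents, and "destroy" as starting from zero-initialized memory; both are assumptions, and this is not the implementation in this PR.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical sketch of a resize with a keep/destroy toggle.
class SketchBuffer {
 public:
  // keep_data == true : copy the overlapping prefix of the old contents
  //                     into the new allocation (the "keeps" behavior).
  // keep_data == false: discard the old contents; the new allocation is
  //                     zero-initialized (the "destroys" behavior).
  void resize(std::size_t new_size, bool keep_data) {
    if (new_size == size_) return;  // nothing to do
    std::unique_ptr<float[]> fresh;
    if (new_size > 0) fresh.reset(new float[new_size]());  // () zero-fills
    if (keep_data && data_ && fresh) {
      std::copy_n(data_.get(), std::min(size_, new_size), fresh.get());
    }
    data_ = std::move(fresh);
    size_ = new_size;
  }

  float* data() { return data_.get(); }
  std::size_t size() const { return size_; }

 private:
  std::unique_ptr<float[]> data_;
  std::size_t size_ = 0;
};
```

Note that size_ is stored rather than recomputed, which is the "cached number of elements" choice from the list above; a design that derives it from the sizes on each call would trade that storage for a small computation.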

CC @zdevito @colesbury @dzhulgakov @smessmer @ajtulloch

ezyang / pytorch-unattached