eaplatanios / tensorflow_scala

TensorFlow API for the Scala Programming Language
http://platanios.org/tensorflow_scala/
Apache License 2.0
937 stars 95 forks

Hardcode number and type of op outputs #4

Closed eaplatanios closed 7 years ago

eaplatanios commented 7 years ago

It would be nice for the op outputs to be strongly-typed (e.g., an op returns an INT32 output and a FLOAT32 output). Using an HList might be necessary for that. However, the problem lies with ops that have a variable number of inputs/outputs that are only known at runtime. This means it may be impossible to do this.

One example is an op that returns two lists of tensors. In the C API that would be represented as a single list of tensors that is the concatenation of the two lists. In this case, we would like to have an HList with two elements, each being a list of tensors of the appropriate data type. However, that would be impossible to resolve at compile-time since the number of tensors in each list may be unknown at that time. The Unpack op, for example, returns a list of tensors of unknown length.

sbrunk commented 7 years ago

For reference: Documentation of the C++ Op interface: https://www.tensorflow.org/extend/adding_an_op#building_advanced_features_into_your_op

sbrunk commented 7 years ago

I've done some thinking about typed Output as well as some tinkering. The result is a draft table of what the types could look like for different cases.

I'm assuming DataType as it's currently implemented, with ScalaType as a type member and subclassing objects fixing the type. I'm also assuming that each Output[T] will produce a single Tensor[T] when evaluated. Roughly, the types look like this:

sealed trait Output[T <: DataType] {
  def evaluate(...): Tensor[T]
  ...
}

class INT32Output extends Output[INT32] {
  override def evaluate(...): Tensor[INT32] = ...
  ...
}

sealed trait Tensor[T <: DataType] {
  def entriesIterator: Iterator[T#ScalaType]
  ...
}

class INT32Tensor extends Tensor[INT32] {
  def entriesIterator: Iterator[Int] = ...
  ...
}

Based on those types, here are the possible cases I could think of, with examples. Could you have a look to check whether they are reasonable or whether I missed something?

eaplatanios commented 7 years ago

The main issue is not with each output per se, but rather with the representation of the outputs field of the Op class. For example, in the case of (Seq[Output[INT32]], Seq[Output[FLOAT32]]) that you present, what would be the return type of the def output(index: Int) function?

eaplatanios commented 7 years ago

I guess in this case we can avoid having that function altogether and just have an outputs field that returns an HList? The type signature of that HList should parameterize the Op class in that case (e.g., Op[Seq[Output[INT32]] :: Seq[Output[FLOAT32]]]). Does that sound reasonable?

In this case, we should also probably do the same for the inputs of the op and have a second type parameter for them.

eaplatanios commented 7 years ago

On second thought though, a problem remains. The C API allows obtaining outputs through their index, and even if we want to have two dynamically sized sequences, we still have to find a way to map the index in each sequence to the underlying flattened sequence index. This requires knowing the sequence lengths, which are only available at runtime, and it potentially causes some more complications I'm not thinking of right now.
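The index mapping described above can be sketched as follows. This is only an illustration under stated assumptions: `FlatOutputs`, `flatIndex`, and `unflatten` are hypothetical names, not part of the library, and the per-list lengths are assumed to have been obtained at runtime by some other means.

```scala
object FlatOutputs {
  // Hypothetical helper: `lengths` holds the runtime length of each output list,
  // in the order in which the lists are concatenated in the flattened sequence.
  def flatIndex(lengths: Seq[Int], list: Int, index: Int): Int = {
    require(list >= 0 && list < lengths.length, s"no output list $list")
    require(index >= 0 && index < lengths(list), s"index $index out of range for list $list")
    // The offset of a list is the total length of all lists before it.
    lengths.take(list).sum + index
  }

  // Splits the flattened outputs back into one sequence per output list.
  def unflatten[A](flat: Seq[A], lengths: Seq[Int]): Seq[Seq[A]] = {
    require(flat.length == lengths.sum, "lengths must cover all flattened outputs")
    val offsets = lengths.scanLeft(0)(_ + _) // start offset of each list, plus the total
    lengths.indices.map(i => flat.slice(offsets(i), offsets(i + 1)))
  }
}
```

For example, with list lengths `Seq(2, 3)`, element 0 of the second list corresponds to flat index 2. Both helpers need the lengths as runtime values, which is exactly why this structure cannot be resolved at compile time.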

sbrunk commented 7 years ago

Do we even have to make Op parameterized? I thought it might be easier to keep it as is and only have a parameterized Output, i.e., TypedOutput[T <: DataType], in addition to the existing one. That way, we could avoid having to abstract over the combined types of an op's outputs.

Then in the op definition methods we cast/map the output(s) returned from the native call as needed, exposing only the typed version where possible. In those methods we should also have all the information to map the flattened sequence to more complex structures if needed (I haven't actually found any C++ Op implementation returning multiple lists yet).

For example, a simple polymorphic math operation would look like this:

def add[T <: DataType](x: TypedOutput[T], y: TypedOutput[T], name: String = "Add"): TypedOutput[T] =
  Op.Builder(opType = "Add", name = name)
      .addInput(x)
      .addInput(y)
      .build().outputs(0).toTypedOutput

Does that make sense?
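To make the idea above concrete, here is a minimal self-contained sketch of an untyped output wrapped by a typed one. Everything in it is an assumption for illustration: the `DataType` objects, the fields of `Output`, and the body of `add` are stand-ins, and the real `Op.Builder` call is replaced by directly constructing an output.

```scala
object TypedOutputSketch {
  sealed trait DataType
  case object INT32 extends DataType
  case object FLOAT32 extends DataType

  // Untyped output, as it might come back from the native call:
  // the data type is only available as a runtime value.
  final case class Output(opName: String, index: Int, dataType: DataType) {
    // Unchecked at compile time: the caller asserts the static type T.
    def toTypedOutput[T <: DataType]: TypedOutput[T] = new TypedOutput[T](this)
  }

  // Typed view over an untyped output; T is a phantom type parameter.
  final class TypedOutput[T <: DataType](val untyped: Output)

  // Stand-in for Op.Builder(...).build().outputs(0).toTypedOutput:
  // here the result simply takes its runtime data type from the first input.
  def add[T <: DataType](x: TypedOutput[T], y: TypedOutput[T], name: String = "Add"): TypedOutput[T] =
    Output(name, 0, x.untyped.dataType).toTypedOutput[T]
}
```

With `val a = Output("a", 0, INT32).toTypedOutput[INT32.type]`, the call `add(a, a)` has static type `TypedOutput[INT32.type]`. Because `TypedOutput` is invariant in this sketch, passing one `INT32` and one `FLOAT32` output to `add` is rejected by the compiler. The cast in `toTypedOutput` is not verified, so the op definition method remains responsible for choosing the right `T`.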

sbrunk commented 7 years ago

I went ahead and created a branch with a parameterized Output[+T <: DataType] to see if it could work. I still need to do some cleanup but I hope I'll be able to create a PR this week as a basis for further discussion.

eaplatanios commented 7 years ago

@sbrunk That sounds good, thanks! I have various issues in mind that came up when I looked into this, so if you have a draft implementation, pointing them out on top of it would make this conversation more concrete. By the way, I'm back from traveling, so I'm becoming more active again. :)

sbrunk commented 7 years ago

I've created a PR with my draft implementation in #14.

It makes OutputLike and its implementations polymorphic, parameterized with a subtype of DataType, i.e., OutputLike[+T <: DataType].

Based on that, I started to make the op implementations more strongly typed, e.g., accepting only Output[NumericDataType] for some ops and making sure at compile time that the types of multiple inputs, or of inputs and outputs, are the same.
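A rough sketch of what a covariant, parameterized output with a numeric restriction could look like is shown below. It does not reproduce the code in the PR: the `DataType` hierarchy, the `Output` case class, and the `abs` and `add` signatures are assumptions made for illustration.

```scala
object CovariantOutputSketch {
  sealed trait DataType
  sealed trait NumericDataType extends DataType
  sealed trait INT32 extends NumericDataType
  sealed trait FLOAT32 extends NumericDataType
  sealed trait STRING extends DataType

  // Covariant in T, so an Output[INT32] can be used wherever an
  // Output[NumericDataType] is expected. T is a phantom type here.
  final case class Output[+T <: DataType](name: String)

  // Accepts only numeric outputs; an Output[STRING] does not type-check.
  def abs[T <: NumericDataType](x: Output[T]): Output[T] =
    Output[T](s"Abs(${x.name})")

  // Both inputs and the result share one static type parameter T.
  def add[T <: NumericDataType](x: Output[T], y: Output[T]): Output[T] =
    Output[T](s"Add(${x.name}, ${y.name})")
}
```

One consequence of covariance worth keeping in mind: in this sketch, `add` applied to an `Output[INT32]` and an `Output[FLOAT32]` still compiles, because the compiler can infer `T = NumericDataType`. Enforcing identical input types would therefore need something beyond a shared type parameter, such as invariance or implicit evidence.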

I'll add a list of open issues and questions I ran into soon.

eaplatanios commented 7 years ago

@sbrunk Sorry, I missed this message and responded directly to the pull request.

eaplatanios commented 7 years ago

I'll close this for now since we decided to look into typed tensors first and we can keep the discussion open in the related project I created.

| Output Size / Shape vs. Datatype | Static | Parameterized | Parameterized (restricted) | Dynamic (runtime) |
|---|---|---|---|---|
| Single | Output[INT32] | [T <: DataType] Output[T] | [T: IsIntOrFloat] Output[T] | UntypedOutput |
| Multiple fixed size | (Output[INT32], Output[FLOAT32]) | [T <: DataType, U <: DataType](Output[T], Output[U]) | [T: IsIntOrFloat, U: IsFloatingPoint](Output[T], Output[U]) | (UntypedOutput, UntypedOutput) |
| Multiple dynamic size (homogeneous) | Seq[Output[INT32]] | [T <: DataType] Seq[Output[T]] | [T: IsIntOrFloat] Seq[Output[T]] | Seq[UntypedOutput] |
| Multiple dynamic size (heterogeneous) | - | - | - | Seq[UntypedOutput] |
| Combined fixed size and homogeneous dynamic | (Seq[Output[INT32]], Seq[Output[FLOAT32]]) | [T <: DataType, U <: DataType](Seq[Output[T]], Seq[Output[U]]) | [T: IsIntOrFloat, U: IsFloatingPoint](Seq[Output[T]], Seq[Output[U]]) | (Seq[UntypedOutput], Seq[UntypedOutput]) |
| Combined fixed size and heterogeneous dynamic | - | - | - | (Seq[UntypedOutput], Seq[UntypedOutput]) |
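The "Parameterized (restricted)" column of the draft table uses context bounds such as `[T: IsIntOrFloat]`. A minimal sketch of how such a bound could be backed by a type class is given below. The names `IsIntOrFloat`, `negate`, and `unpack`, as well as the simplified `Output`, are assumptions made for illustration and do not come from the library.

```scala
object RestrictedOutputSketch {
  sealed trait DataType
  sealed trait INT32 extends DataType
  sealed trait FLOAT32 extends DataType
  sealed trait STRING extends DataType

  // Simplified output with a phantom data type parameter.
  final case class Output[T <: DataType](name: String)

  // Hypothetical evidence type class: instances exist only for allowed data types.
  sealed trait IsIntOrFloat[T <: DataType]
  object IsIntOrFloat {
    implicit val int32Evidence: IsIntOrFloat[INT32] = new IsIntOrFloat[INT32] {}
    implicit val float32Evidence: IsIntOrFloat[FLOAT32] = new IsIntOrFloat[FLOAT32] {}
  }

  // "Single" row: one output whose type is restricted by the context bound.
  def negate[T <: DataType : IsIntOrFloat](x: Output[T]): Output[T] =
    Output[T](s"Neg(${x.name})")

  // "Multiple dynamic size (homogeneous)" row: the number of outputs is a
  // runtime value, while the element type stays static.
  def unpack[T <: DataType : IsIntOrFloat](x: Output[T], num: Int): Seq[Output[T]] =
    (0 until num).map(i => Output[T](s"${x.name}:$i"))
}
```

Calling `negate` with an `Output[STRING]` fails to compile in this sketch because no `IsIntOrFloat[STRING]` instance is in scope. The heterogeneous dynamic rows have no counterpart here, since a `Seq` cannot carry a different static element type per position.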