Things may get tricky if it's necessary to generate non-`__host__ __device__` lambdas. I wonder if it's somehow possible to annotate `Lambda` instances as being "GPU" lambdas.
One non-ideal solution is to create "GPU" versions of `Lambda` and `fun`. That would involve an unacceptable amount of code duplication, however.
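For concreteness, here's a minimal self-contained mock (not LMS; all names here are hypothetical) of what the annotation idea could look like: a flag on the lambda IR node selects the CUDA qualifier during code generation, so a single node type covers both host and GPU lambdas instead of duplicating it.

```scala
// Self-contained mock (not LMS) of the annotation idea: an `onGPU` flag
// on the lambda IR node selects the CUDA qualifier at codegen time.
case class Lam(name: String, param: String, body: String, onGPU: Boolean)

def emit(l: Lam): String = {
  val qual = if (l.onGPU) "__host__ __device__ " else ""
  s"${qual}float ${l.name}(float ${l.param}) { return ${l.body}; }"
}

println(emit(Lam("square", "x", "x * x", onGPU = true)))
// __host__ __device__ float square(float x) { return x * x; }
```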
I discussed with @feiwang3311 and we agreed that changing `Lambda` codegen is too dangerous. I'll reimplement things (in a more ad-hoc way) to avoid changing `Lambda` codegen.
Reimplemented ops without changing `Lambda` codegen in https://github.com/feiwang3311/Lantern/pull/30/commits/71876be7cb5206177b522dbb6e03a80d761878a7. There's hacky logic for propagating `Rep[Float]` arguments to an `unchecked` call: namely, the `op` argument to `launchUnaryKernel` has type `String => Seq[Any]`.
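The shape of that hack is roughly the following self-contained sketch (stubbed types; the real code uses LMS's `Rep[_]` and `unchecked`, and everything except `launchUnaryKernel` and the `String => Seq[Any]` type is an assumption for illustration):

```scala
// Stand-in for LMS's Rep[Float]: a staged value known only at codegen time.
case class RepFloat(name: String)

// Stand-in for LMS's `unchecked`: splices raw code strings and staged
// values into one line of generated code.
def unchecked(frags: Any*): String =
  frags.map {
    case r: RepFloat => r.name
    case other       => other.toString
  }.mkString

// `op` maps the name of the current input element to the fragments of the
// generated assignment, which is how staged Rep[Float] arguments get
// interleaved with raw code strings.
def launchUnaryKernel(in: String, out: String)(op: String => Seq[Any]): String =
  unchecked((s"$out[i] = " +: op(s"$in[i]")): _*)

// Example: tensor-scalar addition capturing a staged scalar `c`.
val c = RepFloat("c")
println(launchUnaryKernel("x", "res")(e => Seq(e, " + ", c)))
// res[i] = x[i] + c
```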
Otherwise, elementwise ops work as intended. Ready for review.
Elementwise tensor-tensor and tensor-scalar ops all work. TODO: implement broadcasting between tensors of different ranks.
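For reference, broadcasting between shapes of different ranks could follow a NumPy-style rule like this plain-Scala sketch (an assumption about the eventual design, not Lantern's implementation):

```scala
// Hypothetical NumPy-style broadcast rule: right-align the two shapes,
// pad the shorter one with leading 1s, and require each dimension pair
// to match or contain a 1.
def broadcastShape(a: Seq[Int], b: Seq[Int]): Option[Seq[Int]] = {
  val (lo, hi) = if (a.length < b.length) (a, b) else (b, a)
  val padded = Seq.fill(hi.length - lo.length)(1) ++ lo
  val pairs = padded.zip(hi)
  if (pairs.forall { case (x, y) => x == y || x == 1 || y == 1 })
    Some(pairs.map { case (x, y) => math.max(x, y) })
  else None
}

// broadcastShape(Seq(3), Seq(2, 3))  == Some(Seq(2, 3))
// broadcastShape(Seq(4, 1), Seq(3))  == Some(Seq(4, 3))
// broadcastShape(Seq(2), Seq(3))     == None  (incompatible)
```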