dmccloskey / EvoNet

MIT License

GPU compliant Tensor operation classes #79

Closed dmccloskey closed 5 years ago

dmccloskey commented 5 years ago

Description

A number of implementations that work fine on the CPU break when moved to GPU code as implemented in CUDA.

Objectives:

Validation

dmccloskey commented 5 years ago

See custom implementation of clip operator https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/eigen_activations.h

/**
  * \ingroup CXX11_NeuralNetworks_Module
  * \brief Template functor to clip the magnitude of the first scalar.
  *
  * \sa class CwiseBinaryOp, MatrixBase::Clip
  */
template <typename Scalar>
struct scalar_clip_op {
  EIGEN_EMPTY_STRUCT_CTOR(scalar_clip_op)
  EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE const Scalar
  operator()(const Scalar& a, const Scalar& b) const {
    return numext::mini(numext::maxi(a, -b), b);
  }
  template <typename Packet>
  EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE const Packet
  packetOp(const Packet& a, const Packet& b) const {
    return internal::pmin(internal::pmax(a, internal::pnegate(b)), b);
  }
};

namespace internal {
template <typename Scalar>
struct functor_traits<scalar_clip_op<Scalar> > {
  enum {
    Cost = NumTraits<Scalar>::AddCost * 3,
    PacketAccess = packet_traits<Scalar>::HasMax &&
                   packet_traits<Scalar>::HasMin &&
                   packet_traits<Scalar>::HasNegate
  };
};
}  // namespace internal

}  // end namespace Eigen

#endif  // EIGEN_CXX11_NEURAL_NETWORKS_ACTIVATIONS_H
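The functor above clamps each coefficient `a` to the range `[-b, b]`. A minimal CPU-side sketch of the same semantics in plain C++ (no Eigen dependency; the helper names here are illustrative, not part of any library):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Illustrative stand-in for scalar_clip_op: clamp a to [-b, b],
// matching numext::mini(numext::maxi(a, -b), b) in the functor above.
template <typename Scalar>
Scalar clip_scalar(const Scalar& a, const Scalar& b) {
  return std::min(std::max(a, -b), b);
}

// Element-wise application over a buffer, analogous to mapping the
// functor across a tensor expression.
template <typename Scalar>
std::vector<Scalar> clip_all(const std::vector<Scalar>& in, Scalar b) {
  std::vector<Scalar> out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) out[i] = clip_scalar(in[i], b);
  return out;
}
```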
dmccloskey commented 5 years ago

See spatial convolution example https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/eigen_spatial_convolutions.h and its implementation https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/conv_2d.h

pad, inflate, and shuffle:

template <typename Device, typename T, int Dims, typename IndexType>
struct InflatePadAndShuffle {
  void operator()(
      const Device& d, typename TTypes<T, Dims, IndexType>::ConstTensor input,
      const Eigen::DSizes<IndexType, Dims>& strides,
      const Eigen::array<Eigen::IndexPair<IndexType>, Dims>& pad_dims,
      const Eigen::DSizes<IndexType, Dims>& order,
      typename TTypes<T, Dims, IndexType>::Tensor output) {
    output.device(d) = input.inflate(strides).pad(pad_dims).shuffle(order);
  }
};
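Of the three operations chained above, `inflate` is probably the least familiar: it upsamples a tensor by inserting zeros between consecutive coefficients along each dimension. A 1-D sketch of that semantics in plain C++ (illustrative only, not the Eigen code):

```cpp
#include <cstddef>
#include <vector>

// 1-D "inflate": insert (stride - 1) zeros between consecutive
// elements, as Eigen's .inflate(strides) does per dimension.
// Output length for n inputs is (n - 1) * stride + 1.
std::vector<float> inflate1d(const std::vector<float>& in, std::size_t stride) {
  if (in.empty()) return {};
  std::vector<float> out((in.size() - 1) * stride + 1, 0.0f);
  for (std::size_t i = 0; i < in.size(); ++i) out[i * stride] = in[i];
  return out;
}
```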

Spatial convolution call

template <typename Device, typename Input, typename Filter, typename Output>
void SpatialConvolutionFunc(const Device& d, Output output, Input input,
                            Filter filter, int row_stride, int col_stride,
                            int row_dilation, int col_dilation,
                            const Eigen::PaddingType& padding) {
  // Need to swap row/col when calling Eigen.
  output.device(d) =
      Eigen::SpatialConvolution(input, filter, col_stride, row_stride, padding,
                                col_dilation, row_dilation);
}
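To make the delegated computation concrete, a naive single-channel VALID convolution with strides can be sketched in plain C++ (dilation fixed at 1; this is a reference sketch of the math, not Eigen's implementation):

```cpp
#include <vector>

// Naive single-channel VALID 2-D convolution with strides; a reference
// sketch of the computation SpatialConvolutionFunc hands off to Eigen
// (row/col dilation omitted, i.e. dilation == 1).
std::vector<std::vector<float>> conv2d_valid(
    const std::vector<std::vector<float>>& input,
    const std::vector<std::vector<float>>& kernel,
    int row_stride, int col_stride) {
  const int in_rows = static_cast<int>(input.size());
  const int in_cols = static_cast<int>(input[0].size());
  const int k_rows = static_cast<int>(kernel.size());
  const int k_cols = static_cast<int>(kernel[0].size());
  const int out_rows = (in_rows - k_rows) / row_stride + 1;
  const int out_cols = (in_cols - k_cols) / col_stride + 1;
  std::vector<std::vector<float>> out(out_rows,
                                      std::vector<float>(out_cols, 0.0f));
  for (int r = 0; r < out_rows; ++r)
    for (int c = 0; c < out_cols; ++c)
      for (int i = 0; i < k_rows; ++i)
        for (int j = 0; j < k_cols; ++j)
          out[r][c] +=
              input[r * row_stride + i][c * col_stride + j] * kernel[i][j];
  return out;
}
```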
dmccloskey commented 5 years ago

See softmax example https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/eigen_softmax.h
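The kernel there uses the standard numerically stable formulation: subtract the per-row maximum before exponentiating so large logits cannot overflow. A plain C++ sketch of that idea (illustrative, not the Eigen expression tree from the linked header):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable softmax over one row: shift by the maximum so
// std::exp never sees a large positive argument, then normalize.
std::vector<float> softmax(const std::vector<float>& x) {
  const float m = *std::max_element(x.begin(), x.end());
  std::vector<float> out(x.size());
  float sum = 0.0f;
  for (std::size_t i = 0; i < x.size(); ++i) {
    out[i] = std::exp(x[i] - m);
    sum += out[i];
  }
  for (float& v : out) v /= sum;
  return out;
}
```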

dmccloskey commented 5 years ago

See pooling example https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/eigen_pooling.h
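As a reference for the pooling semantics, naive non-overlapping max pooling over a single 2-D channel can be sketched as (illustrative only, not the Eigen kernel):

```cpp
#include <algorithm>
#include <vector>

// Naive max pooling with a square window and matching stride
// (non-overlapping windows), over a single 2-D channel.
std::vector<std::vector<float>> max_pool2d(
    const std::vector<std::vector<float>>& input, int window) {
  const int out_rows = static_cast<int>(input.size()) / window;
  const int out_cols = static_cast<int>(input[0].size()) / window;
  std::vector<std::vector<float>> out(out_rows, std::vector<float>(out_cols));
  for (int r = 0; r < out_rows; ++r)
    for (int c = 0; c < out_cols; ++c) {
      float m = input[r * window][c * window];
      for (int i = 0; i < window; ++i)
        for (int j = 0; j < window; ++j)
          m = std::max(m, input[r * window + i][c * window + j]);
      out[r][c] = m;
    }
  return out;
}
```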

dmccloskey commented 5 years ago

Alternative clipping using the built-in .max and .min calls: https://forum.kde.org/viewtopic.php?f=74&t=117349
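For reference, composing coefficient-wise max and min this way (e.g. something along the lines of `x.max(-b).min(b)`; the exact Eigen expression is per the thread) is equivalent to clamping each coefficient to `[-b, b]`. In scalar C++ terms:

```cpp
#include <algorithm>

// max-then-min composition: identical to clamping a to [-b, b],
// i.e. std::clamp(a, -b, b) in C++17.
float clip_via_max_min(float a, float b) {
  return std::min(std::max(a, -b), b);
}
```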