buriburisuri / sugartensor

A slim tensorflow wrapper that provides syntactic sugar for tensor variables. This library will be helpful for practical deep learning researchers, not beginners.
MIT License

inconsistency between sg_ce and sg_train #16

Closed AndreasMadsen closed 7 years ago

AndreasMadsen commented 7 years ago

sg_ce says it returns:

"A 1-D Tensor with the same shape as tensor".

(_Actually, it returns an n-D Tensor with shape tensor.get_shape()[:-1]_.)

sg_train says it takes:

A 0-D Tensor containing the value to minimize.

Yet somehow it also supports an n-D tensor. See, for example, your ByteNet implementation: https://github.com/buriburisuri/ByteNet/blob/master/train.py#L103L106. sg_train also calls np.mean internally: https://github.com/buriburisuri/sugartensor/blob/master/sugartensor/sg_train.py#L339
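To make the shape mismatch concrete, here is a minimal NumPy sketch. It is not sugartensor's code: `sparse_softmax_cross_entropy` is a hypothetical stand-in for what sg_ce appears to compute, and the shapes (batch, time, classes) are assumed for illustration. It shows that a per-element cross entropy has the shape of the logits minus the last axis, and that averaging with np.mean collapses both an n-D and a 0-D loss to a scalar, which would explain why both seem to be accepted.

```python
import numpy as np

def sparse_softmax_cross_entropy(logits, labels):
    """Hypothetical NumPy stand-in for a sparse softmax cross entropy."""
    # Subtract the max along the class axis for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Select the log-probability of the target class along the last axis.
    picked = np.take_along_axis(log_probs, labels[..., None], axis=-1)
    return -picked.squeeze(-1)

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 7, 10))       # assumed (batch, time, classes)
labels = rng.integers(0, 10, size=(4, 7))  # assumed (batch, time)

loss = sparse_softmax_cross_entropy(logits, labels)
print(loss.shape)                      # shape of logits without the last axis

scalar_loss = loss.mean()              # reduce to 0-D before training
print(np.ndim(np.mean(loss)))          # the mean of the n-D loss is 0-D
print(np.ndim(np.mean(scalar_loss)))   # the mean of a 0-D loss is still 0-D
```

If the reported value is averaged this way, passing the n-D loss or its mean would yield the same logged number, so the documentation could describe either form.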


I'm not really sure what the intended behaviour is, but I would definitely like sg_train to continue to be able to take a scalar value.

AndreasMadsen commented 7 years ago

This was solved once tower support was implemented.