xnuohz opened this issue 1 year ago

I met an error when testing `sgd`. Then I found that one line in the function `compute_gradient_of_variables` causes this error; after I changed it, things went right. The dtype I saw in pdb is weird, though. Maybe I was wrong.
I also came across this problem, and I may have a clue why it happens.

First, NumPy applies type promotion to decide the result type. The rules can be found here.

Second, the `sum` function in Python implicitly sets `start=0`, so we are actually trying to compute `node_grads[0] + 0` here. Using the `np.result_type` function, we can reveal the result type:

```
(Pdb) node_grads[0].dtype
dtype('float32')
(Pdb) np.result_type(node_grads[0] + 0)
dtype('float64')
```
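Given that explanation, one way to sidestep the implicit `+ 0` is to avoid Python's built-in `sum` when accumulating the partial adjoints. Here is a minimal sketch; `node_grads` is a stand-in for the list of gradient contributions to one node (in the real code these would be the framework's Tensors collected in `compute_gradient_of_variables`):

```python
from functools import reduce
from operator import add
import numpy as np

# Stand-in for the partial adjoints of one node.
node_grads = [np.ones(3, dtype=np.float32), np.ones(3, dtype=np.float32)]

# Pairwise accumulation avoids Python sum()'s implicit start=0, so no
# integer 0 ever enters the type promotion.
grad = reduce(add, node_grads)
print(grad.dtype)  # float32

# Equivalent alternative: keep sum() but seed it with the first term.
grad = sum(node_grads[1:], node_grads[0])
print(grad.dtype)  # float32
```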
I also found that NumPy's promotion rules sometimes make my scalar ops (e.g. `AddScalar`, `DivScalar`, etc.) produce `np.float64` types.
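One way to keep such scalar ops in `float32` (a sketch of a possible mitigation, not necessarily how the reference code does it) is to cast the scalar to the array's dtype before the operation:

```python
import numpy as np

# Sketch: wrap the scalar in a 0-d array of the input's dtype, so NumPy
# has no bare Python scalar to promote against.
def add_scalar_compute(a: np.ndarray, scalar) -> np.ndarray:
    return a + np.array(scalar, dtype=a.dtype)

x = np.ones(4, dtype=np.float32)
print(add_scalar_compute(x, 1).dtype)  # float32
```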
That's true. In the softmax loss computation, I use a code snippet like:

```python
# DivScalar
return batch_res.sum() / batch_num
```

which can produce the `np.float64` type. So how do I produce the `np.float32` type in the above cases? I cannot fully understand NumPy's promotion rules...
The `DivScalar` op can be implemented by explicitly calling `np.true_divide`, which supports the keyword `dtype` to specify the return type.
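For concreteness, here is a minimal sketch of that approach. It assumes the op's `compute` receives the underlying NumPy array `a` and stores the divisor as `self.scalar`; your class layout may differ:

```python
import numpy as np

class DivScalar:
    """Sketch of a dtype-preserving DivScalar."""

    def __init__(self, scalar):
        self.scalar = scalar

    def compute(self, a: np.ndarray) -> np.ndarray:
        # Pass dtype explicitly so NumPy keeps the input's precision
        # (e.g. float32 stays float32 instead of being promoted).
        return np.true_divide(a, self.scalar, dtype=a.dtype)
```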