dlsyscourse / hw2


Incorrect spec for nn.Linear #14

Closed lordidiot closed 9 months ago

lordidiot commented 9 months ago

The notebook describes the bias shape as (1, out_features):

  • bias - the learnable bias of shape (1, out_features).

However, the tests set the bias manually like so:

def linear_forward(lhs_shape, rhs_shape):
    ...
    f.bias.data = get_tensor(lhs_shape[-1])

def linear_backward(lhs_shape, rhs_shape):
    ...
    f.bias.data = get_tensor(lhs_shape[-1])

get_tensor(x) returns a tensor with shape (x,), not (1, x), so the shape assigned by the tests does not match the documented one.
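To make the mismatch concrete, here is a minimal sketch using plain NumPy arrays as stand-ins for the course's tensors (an assumption: the real tests use get_tensor and the homework's own tensor type, whose broadcasting behaviour may differ from NumPy's). The helper broadcast_bias is hypothetical and only illustrates one way a forward pass could accept either shape.

```python
import numpy as np

out_features = 4
batch = 3

# Stand-in for what the tests assign: a 1-D bias of shape (out_features,).
bias_1d = np.zeros(out_features, dtype=np.float32)
# Stand-in for what the notebook documents: a 2-D bias of shape (1, out_features).
bias_2d = np.zeros((1, out_features), dtype=np.float32)

# The two shapes are not equal, which is the mismatch described above.
assert bias_1d.shape == (out_features,)
assert bias_2d.shape == (1, out_features)
assert bias_1d.shape != bias_2d.shape


def broadcast_bias(bias, batch, out_features):
    """Hypothetical helper: reshape to (1, out_features) first, then
    broadcast to (batch, out_features), so either bias shape is accepted."""
    return np.broadcast_to(bias.reshape(1, out_features), (batch, out_features))


shapes = [broadcast_bias(b, batch, out_features).shape for b in (bias_1d, bias_2d)]
print(shapes)
```

Reshaping before broadcasting is just one possible workaround; whether the spec or the tests should change is the open question in this issue.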

I'm not sure which shape is intended, but I'd assume it should be (out_features,) to match the tests. That said, this line seems to have been edited recently, so I may have this wrong.

lordidiot commented 9 months ago

Duplicate of https://github.com/dlsyscourse/hw2/issues/8