MNIST normalization error

yt7589 commented 2 years ago

I failed to pass the last test case of parse_mnist. My environment: Windows 10, Anaconda3, Python 3.8. My code is below. Code snippet 1:

(removed)

Code snippet 2:

(removed)

Both versions reported the same error message:

    def test_parse_mnist():
        X,y = parse_mnist("data/train-images-idx3-ubyte.gz",
                          "data/train-labels-idx1-ubyte.gz")
        assert X.dtype == np.float32
        assert y.dtype == np.uint8
        assert X.shape == (60000,784)
        assert y.shape == (60000,)
        np.testing.assert_allclose(np.linalg.norm(X[:10]), 27.892084)
>       np.testing.assert_allclose(np.linalg.norm(X[:1000]), 293.0717,
        #np.testing.assert_allclose(np.linalg.norm(X[:1000]), 293.071838,
            err_msg="""If you failed this test but not the previous one,
            you are probably normalizing incorrectly. You should normalize
            w.r.t. the whole dataset, _not_ individual images.""")
E       AssertionError:
E       Not equal to tolerance rtol=1e-07, atol=0
E       If you failed this test but not the previous one,
E               you are probably normalizing incorrectly. You should normalize
E               w.r.t. the whole dataset, _not_ individual images.
E       Mismatched elements: 1 / 1 (100%)
E       Max absolute difference: 0.00013838
E       Max relative difference: 4.72167412e-07
E        x: array(293.07184, dtype=float32)
E        y: array(293.0717)

tests\test_simple_ml.py:40: AssertionError

My question is: how should I do the normalization?

By the way, I am in mainland China, and I was unable to register for this course. I filled out the enrollment form without much difficulty, but when I tried to log in, it said that my email does not exist. I haven't received any email at my address. How can I register for this class from mainland China?

Leonard-Zeng commented 2 years ago

I am having the same issue on the local machine, but my code passes the test after uploading to Google Colab.

yt7589 commented 2 years ago

I found that the test passes if I change the assertion to:

    np.testing.assert_allclose(np.linalg.norm(X[:1000]), 293.0717, rtol=1e-06, atol=0,
        err_msg="""If you failed this test but not the previous one,
        you are probably normalizing incorrectly. You should normalize
        w.r.t. the whole dataset, _not_ individual images.""")

By loosening the tolerance, I can pass this test case. Is the assertion too strict given the differences between hardware and software platforms? I found that other test cases, such as softmax_loss, show the same behavior.
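For anyone wondering why going from rtol=1e-07 to rtol=1e-06 is enough: as far as I understand, assert_allclose accepts a value when |actual - desired| <= atol + rtol * |desired|. The sketch below redoes that arithmetic in plain Python with the numbers from the error log above. The helper name within_tolerance is made up for illustration, and actual is rebuilt from the reported "Max absolute difference", so treat it as an approximation of the real value.

```python
def within_tolerance(actual, desired, rtol, atol=0.0):
    # Hypothetical helper: mirrors the documented assert_allclose rule
    # |actual - desired| <= atol + rtol * |desired|
    return abs(actual - desired) <= atol + rtol * abs(desired)

desired = 293.0717
# Reconstructed from "Max absolute difference: 0.00013838" in the log
actual = desired + 0.00013838

# Default tolerance: allowed error is about 2.93e-05, smaller than the gap
strict = within_tolerance(actual, desired, rtol=1e-07)
# Loosened tolerance: allowed error is about 2.93e-04, larger than the gap
loose = within_tolerance(actual, desired, rtol=1e-06)

print(strict, loose)  # False True
```

So the reported relative difference of about 4.7e-07 sits between the two tolerances, which is why only the loosened assertion passes.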

ashertrockman commented 2 years ago

@yt7589 I'm not entirely sure what the reason is for this discrepancy, but your code looks fine to me. It's probably okay to proceed with the rest of the assignment(s) despite this. I'll let you know if we see that these sorts of discrepancies are causing problems down the line, but I suspect it will be okay / you can simply ignore the local tests or change them as you see fit. The actual autograding system generally uses a higher tolerance anyways.

In the future, please don't post parts of your solutions publicly.

Re: email doesn't exist -- we haven't yet made mugrade accounts for students in the online course, but will do so in the near future. (So this problem is on our end, not yours.)

ashertrockman commented 2 years ago

@yt7589 @Leonard-Zeng I think I actually just made this test too strict, such that it fails on some platforms and not others. I've pushed a change increasing the tolerance (exactly what you did).
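A possible explanation for why the original tolerance was so fragile (this is an assumption, not something confirmed in this thread): X is float32, and the log shows the norm as a float32 value too. Near 293, neighbouring float32 values are about 3.05e-05 apart, while rtol=1e-07 only allows an error of about 2.93e-05. The sketch below checks that with the standard library only; float32_spacing is a made-up helper name.

```python
import struct

def float32_spacing(x):
    """Gap between float32(x) and the next larger float32 (for positive x)."""
    # Round x to float32 and read its raw 32-bit pattern
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    lo = struct.unpack("<f", struct.pack("<I", bits))[0]
    # Incrementing the bit pattern gives the next representable float32
    hi = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return hi - lo

desired = 293.0717
spacing = float32_spacing(desired)  # 2**-15 for values in [256, 512)
allowed = 1e-07 * desired           # error permitted by the original test

print(spacing, allowed, allowed < spacing)
```

If that reading is right, the original test allowed less than one float32 step of error, so any platform difference in how the sum of squares is accumulated could be enough to fail it.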

dlsyscourse / hw0

MNIST normalization error #4