rkneusel9 / PracticalDeepLearningPython

Source code for the book "Practical Deep Learning: A Python-Based Introduction" (No Starch Press)
MIT License

Code bug report. #4

Closed ReneElizondo closed 1 month ago

ReneElizondo commented 2 months ago

Listing 4-1 (page 71), the code that constructs the subsets using a 90/5/5 split of the original data, has a bug in this section:

❹ 
n0 = int(x0.shape[0]-ntrn0)
n1 = int(x1.shape[0]-ntrn1)
xval = np.zeros((int(n0/2+n1/2),20))
yval = np.zeros(int(n0/2+n1/2))
xval[:(n0//2)] = x0[ntrn0:(ntrn0+n0//2)]
xval[(n0//2):] = x1[ntrn1:(ntrn1+n1//2)]
yval[:(n0//2)] = y0[ntrn0:(ntrn0+n0//2)]
yval[(n0//2):] = y1[ntrn1:(ntrn1+n1//2)]

Let's assume for now that (for whatever reason) you end up with n0 = 5 and n1 = 5. In that case you will end up with something like this (replacing variables with actual values):

n0 = 5
n1 = 5
xval = np.zeros((5,20))
yval = np.zeros(5)
xval[:2] = x0[ntrn0:(ntrn0+2)]
xval[2:] = x1[ntrn1:(ntrn1+2)]  => Crash here.

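The code crashes because the left side of the last line, xval[2:], covers rows 2 through 4 (three rows, since xval has five rows), while the right side supplies only two rows from x1. For the shapes to match, (ntrn1+2) would have to be (ntrn1+3), or xval would need only four rows. The array is sized with int(n0/2+n1/2), which truncates once after adding, but the slices use n0//2 and n1//2, which truncate each half separately. When n0 and n1 are both odd, the two counts differ by one. Basically, the values are truncated inconsistently throughout the procedure.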
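The mismatch can be reproduced in isolation. The following is a minimal sketch, not the book's code: the arrays x0 and x1 are dummy stand-ins, the slices start at 0 instead of ntrn0 and ntrn1, and validation_rows is a hypothetical helper introduced only for illustration. It also shows one possible way to make the sizes consistent, by allocating exactly n0//2 + n1//2 rows.

```python
import numpy as np

def validation_rows(n0, n1):
    """Return (rows allocated, rows copied) for the listing's validation set.

    Hypothetical helper: the first value mirrors int(n0/2+n1/2) used to size
    xval, the second mirrors the n0//2 and n1//2 used in the slices.
    """
    allocated = int(n0 / 2 + n1 / 2)
    copied = n0 // 2 + n1 // 2
    return allocated, copied

# Dummy data standing in for the leftover class-0 and class-1 samples.
n0, n1 = 5, 5
x0 = np.zeros((n0, 20))
x1 = np.ones((n1, 20))

# Reproduce the failure: five rows allocated, only four rows supplied.
xval = np.zeros((int(n0 / 2 + n1 / 2), 20))
xval[:n0 // 2] = x0[:n0 // 2]
try:
    xval[n0 // 2:] = x1[:n1 // 2]   # three destination rows, two source rows
    error = None
except ValueError as exc:
    error = str(exc)

# One consistent alternative: size the array from the same floor divisions
# that the slices use.
h0, h1 = n0 // 2, n1 // 2
xval_fixed = np.zeros((h0 + h1, 20))
xval_fixed[:h0] = x0[:h0]
xval_fixed[h0:] = x1[:h1]

print(validation_rows(n0, n1), error, xval_fixed.shape)
```

The two counts only disagree when n0 and n1 are both odd, which is why the listing can run without error on many datasets.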

The end result is an error similar to the one shown below.

image

Thanks.

rkneusel9 commented 1 month ago

The code is pedagogical, not production level. If your dataset is that small, use k-fold cross validation.
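For readers unfamiliar with k-fold cross validation, the idea is to split the samples into k folds and let each fold serve once as the validation set while the rest are used for training. The following is a minimal sketch using only NumPy; kfold_indices is a hypothetical helper written for this illustration and is not part of the book's code.

```python
import numpy as np

def kfold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) index arrays for k-fold cross validation.

    Hypothetical helper: shuffles the sample indices once, splits them into
    k nearly equal folds, and uses each fold in turn for validation.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    folds = np.array_split(order, k)   # fold sizes differ by at most one
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

# Example: 10 samples, 5 folds, so each split trains on 8 and validates on 2.
splits = list(kfold_indices(10, 5))
for train_idx, val_idx in splits:
    print(len(train_idx), len(val_idx))
```

Because every sample is used for validation exactly once, no fixed slice of a very small dataset has to be set aside.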

ReneElizondo commented 1 month ago

Understood, thank you for taking the time to answer.