dvgodoy / PyTorchStepByStep

Official repository of my book: "Deep Learning with PyTorch Step-by-Step: A Beginner's Guide"
https://pytorchstepbystep.com
MIT License

Chapter 09, encoder-decoder Data Preparation test_points not used for test set #25

Closed gsgxnet closed 2 years ago

gsgxnet commented 2 years ago

In the chunk that generates the test set (Data Generation — Test), full_test is derived from the points data structure, which is used for training, not from test_points.

test_points, test_directions = generate_sequences(seed=19)
full_test = torch.as_tensor(points).float()
source_test = full_test[:, :2]
target_test = full_test[:, 2:]

I do not think that is intended, so there is a simple correction possible:

full_test = torch.as_tensor(test_points).float()
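A quick way to see the effect of the bug: with the original line, the "test" tensor is byte-for-byte the training data, so the test loss just mirrors the training loss. Below is a minimal sketch; note that generate_sequences here is a simplified stand-in I wrote for the book's helper (noisy corners of a square, visited in a random direction), not the actual implementation:

```python
import numpy as np
import torch

def generate_sequences(n=128, seed=13):
    # Simplified stand-in for the book's helper: each sequence is the four
    # corners of a square, visited clockwise or counterclockwise starting
    # from a random corner, with Gaussian noise added to each point.
    rng = np.random.default_rng(seed)
    basis = np.array([[-1, -1], [-1, 1], [1, 1], [1, -1]], dtype=float)
    directions = rng.integers(0, 2, size=n)  # 0 = counterclockwise, 1 = clockwise
    starts = rng.integers(0, 4, size=n)
    points = []
    for d, s in zip(directions, starts):
        order = np.arange(s, s + 4) % 4
        if d:
            order = order[::-1]
        points.append(basis[order] + rng.normal(0, 0.1, size=(4, 2)))
    return np.array(points), directions

points, directions = generate_sequences(seed=13)
test_points, test_directions = generate_sequences(seed=19)

buggy = torch.as_tensor(points).float()        # what the book had
fixed = torch.as_tensor(test_points).float()   # the correction

print(torch.equal(buggy, torch.as_tensor(points).float()))  # True: "test" == training data
print(torch.equal(fixed, buggy))                            # False: a genuinely held-out set
```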

Based on that change, we get different performance figures.
Loss: [plot_loss image]

and different prediction figures:

[seq_pred image]

with 8 of 10 sequences showing "clashing" points.
If my results are right, this text chunk needs some adaptation as well:

The results are, at the same time, very good and very bad. In half of the sequences, the predicted coordinates are quite close to the actual ones. But, in the other half, the predicted coordinates are overlapping with each other and close to the midpoint between the actual coordinates.

For whatever reason, the model learned to make good predictions whenever the first corner is on the right edge of the square, but really bad ones otherwise.

See the sequence pictures; these statements need to be adapted, especially the second one.

The same issue can be found in the final "Putting It All Together" section:

# Validation/Test Set
test_points, test_directions = generate_sequences(seed=19)
full_test = torch.as_tensor(points).float()
source_test = full_test[:, :2]
target_test = full_test[:, 2:]
test_data = TensorDataset(source_test, target_test)
test_loader = DataLoader(test_data, batch_size=16)
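With the same one-line fix applied there, the rest of that block is unchanged. As a sketch of the slicing and loading steps (using random stand-in data in place of the tensor built from test_points, so the shapes can be checked without the book's helper):

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Stand-in for torch.as_tensor(test_points).float(): 128 sequences,
# each with four 2D corner points.
full_test = torch.randn(128, 4, 2)

source_test = full_test[:, :2]   # first two corners -> model input
target_test = full_test[:, 2:]   # last two corners -> prediction target
test_data = TensorDataset(source_test, target_test)
test_loader = DataLoader(test_data, batch_size=16)

print(source_test.shape, target_test.shape)  # torch.Size([128, 2, 2]) twice
print(len(test_loader))                      # 128 / 16 = 8 batches
```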

All of this is based on your 1.1 revision, assuming I did not make any mistakes when updating via git pull.

gsgxnet commented 2 years ago

The change needed for the full_test data also causes significantly different results in the

Encoder + Decoder + PE

section.
I now see this loss graph (when running the code with the independent test_points data):

Encoder + Decoder + PE loss

quite different from the original (where the test set is filled with the same points as the training set):

[image: original loss plot]

and now, with the updated test setup, we get one sequence with clashing points:

[image: predicted sequences with the corrected test set]

Please compare to the sequences produced by the original code:

[image: predicted sequences from the original code]

Doubling the size of the training set from 128 to 256 sequences gives results closer to expectations:
points, directions = generate_sequences(n=256, seed=13) (this has to be changed in several places)

[image: loss plot with 256 training sequences]

and the 10 selected validation sequences look good (this depends on the seed; this run used 13):

[image: the 10 selected validation sequences]

dvgodoy commented 2 years ago

Thank you so much for pointing this out!

You're absolutely correct - it should be: full_test = torch.as_tensor(test_points).float()

I will fix this and update the text to reflect the changes.

Thanks for supporting my work and helping to improve it :-)

Best, Daniel

dvgodoy commented 2 years ago

Hi @gsgxnet,

I've updated code, figures, and text in the book, and published the revised edition (v1.1.1) today :-)

Once again, thanks for pointing this out.

Best, Daniel