cs231n / cs231n.github.io


2022 assignment3 RNN_Captioning.ipynb [Overfit RNN Captioning Model on Small Data]: partial mismatch between description and code #284

Open ghost opened 1 year ago

ghost commented 1 year ago

In Assignment 3, in the "Overfit RNN Captioning Model on Small Data" section, the description says: "Once you have familiarized yourself with the API, run the following to make sure your model overfits a small sample of 100 training examples. You should see a final loss of less than 0.1." However, the code right below it uses 50 examples for the overfitting run. Isn't this a contradiction (the description says 100, the code uses 50)?

I checked every previous version of assignment3 (2017-2022), and they all have the same mismatch. I didn't find any problems in other functions such as load_coco_data, so I think this is a small problem in assignment3 itself. By the way, I find that the model achieves better accuracy with 50 examples than with 100.

Code below:

small_data = load_coco_data(max_train=50) # <---Here! 50 rather than 100🙂

small_rnn_model = CaptioningRNN(
    cell_type='rnn',
    word_to_idx=data['word_to_idx'],
    input_dim=data['train_features'].shape[1],
    hidden_dim=512,
    wordvec_dim=256,
)

small_rnn_solver = CaptioningSolver(
    small_rnn_model, small_data,
    update_rule='adam',
    num_epochs=50,
    batch_size=25,
    optim_config={
        'learning_rate': 5e-3,
    },
    lr_decay=0.95,
    verbose=True, print_every=10,
)

small_rnn_solver.train()

# Plot the training losses.
plt.plot(small_rnn_solver.loss_history)
plt.xlabel('Iteration')
plt.ylabel('Loss')
plt.title('Training loss history')
plt.show()
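
To double-check how many examples the overfitting run actually sees, one option is to read the size off the data dictionary rather than trusting the description. The sketch below is only an illustration under stated assumptions: it assumes the dictionary returned by load_coco_data has a 'train_captions' array whose first dimension is the number of training examples, and that the solver runs max(num_train // batch_size, 1) iterations per epoch. The helper describe_training_run and the stand-in dictionary are hypothetical and not part of the assignment code.

```python
import numpy as np


def describe_training_run(data, batch_size, num_epochs):
    """Return (num_train, iterations_per_epoch, total_iterations).

    Assumes data['train_captions'] has one row per training example and
    that one epoch is max(num_train // batch_size, 1) iterations.
    """
    num_train = data["train_captions"].shape[0]
    iterations_per_epoch = max(num_train // batch_size, 1)
    total_iterations = num_epochs * iterations_per_epoch
    return num_train, iterations_per_epoch, total_iterations


# Stand-in for load_coco_data(max_train=50); the caption length of 17 is
# a placeholder so that the example runs without the COCO files.
fake_small_data = {"train_captions": np.zeros((50, 17), dtype=np.int64)}

print(describe_training_run(fake_small_data, batch_size=25, num_epochs=50))
# prints (50, 2, 100)
```

In the notebook itself, the same check would be applied to small_data instead of the stand-in dictionary. Under these assumptions, the settings quoted above give 50 examples, 2 iterations per epoch, and 100 iterations in total.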