Small vs. large dataset meaning in the context of s_decay

optimizedlearning / mechanic

MIT License

35 stars 3 forks source link

Small vs. large dataset meaning in the context of s_decay #5

Open ogencoglu opened 2 weeks ago

ogencoglu commented 2 weeks ago

The option s_decay is a bit like a weight-decay term that empirically is helpful for smaller datasets. We use a default of 0.01 in all our experiments. For larger datasets, smaller values (even 0.0) often worked as well.

What does small and large dataset mean in this context?