bclarkson-code / Tricycle

Autograd to GPT-2 completely from scratch

Code versions as a hyperparameter #38

Closed bclarkson-code closed 4 months ago

bclarkson-code commented 5 months ago

It is common practice to record model hyperparameters when experimenting, but I think this misses a lot of useful information. To experiment properly, I think that one should also record the specific implementation of each component. You could simply record the git commit that was used, but a commit identifies the whole repository at once, so I think it would be more helpful to record the version of each component individually.

I think that a good place to start would be recording a hash (or similar) for each layer in the model.
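As a rough illustration of the per-layer hash idea, here is a minimal sketch. It does not use Tricycle's actual API: `Dense`, `ReLU`, `component_version`, and `model_versions` are hypothetical names invented for this example. It hashes the bytecode of the methods defined on each layer class, which is only one possible choice. Hashing the source text instead would also pick up comment changes, and bytecode hashes are not comparable across Python versions.

```python
import hashlib
import types


def _update_from_code(digest, code):
    """Feed a code object's bytecode, names, and constants into the digest."""
    digest.update(code.co_code)
    digest.update(repr(code.co_names).encode())
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            # Nested functions and comprehensions are hashed recursively,
            # because their repr contains a memory address.
            _update_from_code(digest, const)
        else:
            digest.update(repr(const).encode())


def component_version(cls):
    """Return a short hash of the methods defined directly on cls."""
    digest = hashlib.sha256(cls.__qualname__.encode())
    for name, attr in sorted(vars(cls).items(), key=lambda item: item[0]):
        if isinstance(attr, types.FunctionType):
            digest.update(name.encode())
            _update_from_code(digest, attr.__code__)
    return digest.hexdigest()[:12]


def model_versions(layers):
    """Map each layer (by position and class name) to its implementation hash."""
    return {
        f"{index}_{type(layer).__name__}": component_version(type(layer))
        for index, layer in enumerate(layers)
    }


# Hypothetical layers, standing in for real model components.
class Dense:
    def __init__(self, weight):
        self.weight = weight

    def forward(self, x):
        return x * self.weight


class ReLU:
    def forward(self, x):
        return max(x, 0.0)


if __name__ == "__main__":
    # Store this dictionary next to the usual hyperparameters.
    print(model_versions([Dense(0.5), ReLU(), Dense(2.0)]))
```

With this approach, changing the body of `ReLU.forward` changes only the `ReLU` entry, whereas a git commit hash would change for any edit anywhere in the repository. Note that only methods defined directly on the class are hashed, so changes to a base class or to helper functions called by the layer would not be detected by this sketch.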

bclarkson-code commented 4 months ago

Closing for now as it is off topic