PhilipQuirke / quanta_maths

Tool used to verify accuracy of transformer model
Apache License 2.0

MATH: Is a 99.9999% accurate 1-layer addition model possible? #37

Closed PhilipQuirke closed 2 weeks ago

PhilipQuirke commented 6 months ago

Our 2-layer addition model achieved 99.9999% accuracy using the TriCase/TriAdd approach. Our 1-layer addition model achieved only 99% accuracy and never learnt the TriCase/TriAdd approach. However, the 2-layer model only uses the attention heads in one layer. Perhaps a 1-layer addition model can't learn TriCase/TriAdd, but perhaps the accurate 2-layer model can be pruned and retrained to give an accurate 1-layer addition model.
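To make the size of the gap between the two accuracy figures concrete, here is a minimal sketch of the arithmetic. It assumes the percentages are simple per-question accuracies measured over a test set of one million addition questions; the function name `expected_failures` and the test-set size are illustrative and not taken from the repository.

```python
from fractions import Fraction


def expected_failures(accuracy_percent: str, num_questions: int) -> Fraction:
    """Expected number of wrong answers at a given per-question accuracy.

    Uses exact rational arithmetic so that 99.9999% is not distorted by
    floating-point rounding.
    """
    error_rate = 1 - Fraction(accuracy_percent) / 100
    return error_rate * num_questions


# Hypothetical test-set size, chosen only for illustration.
NUM_QUESTIONS = 1_000_000

one_layer = expected_failures("99", NUM_QUESTIONS)
two_layer = expected_failures("99.9999", NUM_QUESTIONS)

print(f"1-layer (99%):      {one_layer} expected failures")
print(f"2-layer (99.9999%): {two_layer} expected failures")
```

Under these assumptions, the 1-layer model would be expected to get about 10,000 questions wrong per million, against about 1 for the 2-layer model.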

This ticket covers:

PhilipQuirke commented 2 weeks ago

Done