bclarkson-code / Tricycle

Autograd to GPT-2 completely from scratch
104 stars, 9 forks

Get CrossEntropy working #2

Closed by bclarkson-code 7 months ago

bclarkson-code commented 8 months ago

The cross-entropy loss function does not currently work as expected, and its test fails. The cause should be investigated and fixed.
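For context, a minimal reference for what a cross-entropy test could compare against is sketched below. It is not Tricycle's code: the function names `softmax`, `cross_entropy`, and `cross_entropy_grad` are hypothetical, and it assumes a single example given as a list of logits plus an integer class label. The loss uses the log-sum-exp form for numerical stability, and the gradient with respect to the logits is the familiar softmax minus one-hot.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a single list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy for one example: log-sum-exp(logits) - logits[label]."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]


def cross_entropy_grad(logits, label):
    """Gradient of the loss w.r.t. the logits: softmax(logits) - one_hot(label)."""
    probs = softmax(logits)
    return [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]
```

A test built on this sketch could check that uniform logits over `n` classes give a loss of `log(n)`, and that the analytic gradient agrees with a finite-difference estimate.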

bclarkson-code commented 7 months ago

After several rewrites, the cause turned out to be that vectorising was being done too early.
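The comment does not say exactly what "too soon" meant in Tricycle, so the following is only a hypothetical illustration of one way this class of bug can appear, not a description of the actual fix. It assumes a batch stored as a list of per-example logit lists; all function names are made up for the example. If the batch is merged before the per-example normalisation runs, the log-sum-exp is taken over every row at once and the loss comes out too large.

```python
import math


def cross_entropy(logits, label):
    """Cross-entropy for one example: log-sum-exp(logits) - logits[label]."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]


def batch_loss(batch_logits, labels):
    """Compute the loss per example first, then average over the batch."""
    losses = [cross_entropy(row, y) for row, y in zip(batch_logits, labels)]
    return sum(losses) / len(losses)


def batch_loss_too_soon(batch_logits, labels):
    """Deliberately wrong: the batch is flattened before normalisation,
    so each log-sum-exp runs over all rows instead of one row."""
    n_classes = len(batch_logits[0])
    flat = [x for row in batch_logits for x in row]
    losses = [
        cross_entropy(flat, i * n_classes + y) for i, y in enumerate(labels)
    ]
    return sum(losses) / len(losses)


batch = [[0.0, 0.0], [0.0, 0.0]]
labels = [0, 1]
correct = batch_loss(batch, labels)         # log(2): two classes per example
wrong = batch_loss_too_soon(batch, labels)  # log(4): normalised over 4 values
```

With two examples of two uniform logits each, the per-example version gives `log(2)` while the flattened version gives `log(4)`, which is the kind of mismatch a failing test would surface.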