dlsyscourse / hw0


Different implementation of softmax_loss #5

Closed (MARD1NO closed 1 year ago)

MARD1NO commented 1 year ago

(removed answer)

ashertrockman commented 1 year ago

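For both implementations, I recommend simplifying the log of the exponentials analytically wherever possible, rather than computing the exponentials and then taking their log numerically (otherwise you can run into numerical trouble, such as overflow in the exponentials).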
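To illustrate the general idea without touching the assignment itself, here is a minimal sketch on a plain list of numbers. The function names `log_softmax_naive` and `log_softmax_analytic` are made up for this example and are not part of the homework code. It relies on the identity log(exp(z_i) / sum_j exp(z_j)) = z_i - log(sum_j exp(z_j)), plus the common trick of subtracting the maximum before exponentiating.

```python
import math

def log_softmax_naive(z):
    # Exponentiate first, then take the log.
    # math.exp() overflows once the inputs get large.
    exps = [math.exp(v) for v in z]
    total = sum(exps)
    return [math.log(e / total) for e in exps]

def log_softmax_analytic(z):
    # Apply the log analytically:
    #   log(exp(z_i) / sum_j exp(z_j)) = z_i - log(sum_j exp(z_j))
    # Subtracting max(z) first keeps every exp() argument <= 0,
    # so the exponentials cannot overflow.
    m = max(z)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in z))
    return [v - log_sum_exp for v in z]
```

On small inputs such as `[1.0, 2.0, 3.0]` the two functions should agree up to floating-point rounding. On large inputs such as `[1000.0, 1000.0]` the naive version fails in pure Python because `math.exp(1000.0)` raises `OverflowError`, while the analytic version still returns a finite result.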

In the future, please don't post your solutions publicly when asking questions; try to describe the problem without including the solution. We also prefer that you post on the class forum if possible.