Computing the cross-entropy loss produces NaN, apparently because the gradients are lost during the update step.
So far, every component has been checked and appears to behave as expected, so it is unclear where the NaN loss originates.
The error shows up when running the forward pass, model(...), followed by loss_function(...).
The model itself initializes correctly, using Xavier initialization in the NN class.
Looking for a fix that keeps the code modular.
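Since the model and loss code are not shown, the actual cause cannot be confirmed from this description. One common source of NaN in a cross-entropy computation is overflow in a hand-written softmax, so a minimal sketch of a numerically stable version, together with a small check for localizing the first non-finite value, may help narrow it down. This is plain Python with no framework; `stable_cross_entropy` and `first_non_finite` are hypothetical helper names, not functions from the code in question.

```python
import math


def stable_cross_entropy(logits, target):
    """Cross-entropy for one sample, computed with the log-sum-exp trick."""
    m = max(logits)
    # Subtracting the max keeps every exponent <= 0, so exp() cannot overflow.
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # -log(softmax(logits)[target]) equals log_sum_exp - logits[target].
    return log_sum_exp - logits[target]


def first_non_finite(stage_name, values):
    """Return a message naming the first NaN/inf in values, or None if all are finite."""
    for i, v in enumerate(values):
        if not math.isfinite(v):
            return f"{stage_name}: non-finite value {v} at index {i}"
    return None


# Hypothetical logits large enough to overflow a naive exp(z) / sum(exp(z)).
logits = [1000.0, 0.0, -1000.0]
loss = stable_cross_entropy(logits, target=0)

# Check each stage in order to see whether the forward pass or the loss breaks first.
print(first_non_finite("logits", logits))
print(first_non_finite("loss", [loss]))
```

Running the same kind of check on the output of model(...) and then on the output of loss_function(...) would show which of the two stages first produces a non-finite value, without changing either function.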