Investigate Tree to Sequence ANC Model Convergence

The loss of the tree-to-sequence ANC model currently decreases only slightly, from around 8 to roughly 4-5. One thing to look at is the values and gradients of operations that can saturate, such as softmax and sigmoid. Hyperparameters and initializations are also worth examining, especially the initialization of the ANC itself. It may also help to inspect the neural assembly the model generates after training. The current loss is computed over 10 input/output pairs per program; that count is itself a hyperparameter to examine. This is currently the highest-priority task.
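As a starting point for the saturation check, here is a minimal sketch in plain Python with no framework dependency. All function names and the 0.01 threshold are illustrative assumptions and do not come from this repository; the idea is only to show what "saturated" could mean numerically for sigmoid and softmax.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sigmoid_grad(x):
    """Derivative of the sigmoid; it approaches 0 as |x| grows."""
    y = sigmoid(x)
    return y * (1.0 - y)


def softmax(logits):
    """Softmax with the max subtracted for numerical stability."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid_saturation_fraction(pre_activations, eps=0.01):
    """Fraction of sigmoid outputs within eps of 0 or 1 (eps is an assumed threshold)."""
    outputs = [sigmoid(x) for x in pre_activations]
    saturated = sum(1 for y in outputs if y < eps or y > 1.0 - eps)
    return saturated / len(outputs)


def softmax_entropy(logits):
    """Entropy in nats of the softmax distribution; near 0 means near one-hot."""
    probs = softmax(logits)
    return -sum(p * math.log(p) for p in probs if p > 0.0)


if __name__ == "__main__":
    # Hypothetical pre-activation values, not taken from the model.
    print(sigmoid_saturation_fraction([0.0, 10.0, -10.0, 1.0]))
    print(softmax_entropy([0.0, 0.0, 0.0, 0.0]))
    print(softmax_entropy([100.0, 0.0, 0.0, 0.0]))
```

In the actual model, the same statistics would be computed on the pre-activation values feeding the ANC's softmax and sigmoid operations. A saturation fraction close to 1, or a softmax entropy close to 0 early in training, would suggest that gradients through those operations are vanishing and that the initialization scale deserves a closer look.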

hmc-cs-mdrissi / neural_nets_research

Investigate Tree to Sequence ANC Model Convergence #5