nuclear-multimessenger-astronomy / nmma

A pythonic library for probing nuclear physics and cosmology with multimessenger analysis
https://nuclear-multimessenger-astronomy.github.io/nmma/
GNU General Public License v3.0

Remove additional NN layer between input and wide layer? #151

Closed bfhealy closed 1 year ago

bfhealy commented 1 year ago

Between our NN's input layer and the wide (2048-neuron) layer, there is currently another layer with the same shape as the input layer. I previously missed that this was an additional hidden layer, or I would have removed it:

https://github.com/nuclear-multimessenger-astronomy/nmma/blob/03d51ac48a80d69f54e72014a183e66e2c6a175b/nmma/em/training.py#L390

Training the same model with this updated architecture versus the current one seems to offer slight performance improvements, especially at early times; see the attached collapsar model runs comparing the current (top) and new (bottom) NN architectures. I can't make an argument for keeping this additional hidden layer (aside from the cost of re-training existing tf models and propagating the change through to existing results).

Current: injection_AnBa2022_lightcurves_oldNN

New: injection_AnBa2022_lightcurves_newNN
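For intuition, the change can be sketched by counting the parameters the extra layer adds. The layer sizes below are assumptions for illustration (only the 2048-neuron wide layer comes from the issue); `dense_params` is a hypothetical helper, not part of nmma:

```python
# Hypothetical sketch of the two architectures discussed above.
# All sizes except the 2048-neuron wide layer are illustrative assumptions.

def dense_params(n_in, n_out):
    # weights + biases for one fully connected (Dense) layer
    return n_in * n_out + n_out

input_dim = 10    # assumed number of input features
output_dim = 100  # assumed number of output coefficients

# Current: input -> Dense(input_dim) -> Dense(2048) -> Dense(output_dim)
old_params = (dense_params(input_dim, input_dim)   # the redundant hidden layer
              + dense_params(input_dim, 2048)
              + dense_params(2048, output_dim))

# Proposed: input -> Dense(2048) -> Dense(output_dim)
new_params = dense_params(input_dim, 2048) + dense_params(2048, output_dim)

print(old_params - new_params)  # parameters removed by dropping the extra layer
```

Since the removed layer maps `input_dim -> input_dim`, it adds relatively few parameters; the motivation for dropping it is architectural simplicity and the observed early-time performance gain, not model size.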

mcoughlin commented 1 year ago

@bfhealy Maybe we should also get in the habit of running the svdmodel_benchmark and comparing those? But yeah, since the new architecture performs better, it seems like we should upgrade.

bfhealy commented 1 year ago

@mcoughlin Agreed about the benchmarks. I'll go ahead and open a PR removing that layer from the NN.