Poor tensorflow training results for new model grid

bfhealy commented 10 months ago

Posting this issue to further document efforts by @shreyasahasram08, @tsunhopang, @ThibeauWouters, and myself to achieve better training results for a new Bu2023Ye grid (see #292). The main difference from Bu2022Ye is that the new grid allows Yewind to take values of 0.2, 0.3, and 0.4, while Bu2022Ye fixed the parameter at 0.3.

We are performing the following tests:

Ensuring parameter parsing from grid filenames works as intended
Running training only for grid files with Yewind = 0.3, which should replicate Bu2022Ye results
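
As a rough illustration of the two tests above, here is a minimal sketch of parsing parameters out of grid filenames and selecting only the Yewind = 0.3 files. The filename layout (underscore-separated name/value tokens with a .dat extension) and the mdyn and vdyn parameter names are assumptions made for this example, not the actual Bu2023Ye naming scheme, and parse_params and select_by_value are hypothetical helpers, not nmma functions.

```python
import math
import re

# Assumed token layout: letters followed by a number, e.g. "Yewind0.30".
# The real grid filenames may be structured differently.
TOKEN = re.compile(
    r"(?P<name>[A-Za-z]+)(?P<value>[-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)"
)


def parse_params(filename, ext=".dat"):
    """Return a {parameter name: float value} dict parsed from one filename."""
    stem = filename[: -len(ext)] if filename.endswith(ext) else filename
    params = {}
    for token in stem.split("_"):
        match = TOKEN.fullmatch(token)
        if match:
            params[match.group("name")] = float(match.group("value"))
    return params


def select_by_value(filenames, name, value):
    """Keep only the files whose parsed parameter `name` is close to `value`."""
    return [
        f for f in filenames
        if name in parse_params(f)
        and math.isclose(parse_params(f)[name], value)
    ]


# Hypothetical filenames, used only to exercise the helpers.
files = [
    "mdyn0.010_vdyn0.12_Yewind0.20.dat",
    "mdyn0.010_vdyn0.12_Yewind0.30.dat",
    "mdyn0.020_vdyn0.12_Yewind0.40.dat",
]
parsed = parse_params(files[0])
subset = select_by_value(files, "Yewind", 0.3)
```

Comparing the parsed values against the known grid points for every file is a quick way to catch a token that was silently skipped or misread before training.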

We are also exploring multiple areas of potential improvement, including:

Whether the values of the Yewind parameter are too widely spaced compared to the finer spacing of the other parameters
Whether there are enough SVD coefficients to accurately represent the lightcurves (especially since loss curves look good)
Whether the NN architecture can be changed to provide a better mapping between params and SVD coeffs
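
One way to probe the SVD question is to measure how well the lightcurves are reconstructed from a truncated set of coefficients, independently of the network: a good loss curve only shows that the network reproduces the retained coefficients, not that those coefficients capture the lightcurves. The sketch below uses plain NumPy on synthetic data. It is not the nmma training code, and the array shapes, the truncation_rms helper, and the toy lightcurves are all assumptions for illustration.

```python
import numpy as np


def truncation_rms(lightcurves, n_coeffs_list):
    """RMS reconstruction error when keeping the first n SVD coefficients.

    lightcurves: array of shape (n_models, n_times).
    """
    mean = lightcurves.mean(axis=0)
    centered = lightcurves - mean
    # Rows of vt are the orthonormal basis vectors in time.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    errors = {}
    for n in n_coeffs_list:
        coeffs = centered @ vt[:n].T       # project onto the first n basis vectors
        recon = coeffs @ vt[:n] + mean     # rebuild lightcurves from n coefficients
        errors[n] = float(np.sqrt(np.mean((recon - lightcurves) ** 2)))
    return errors


# Synthetic stand-in for a grid: 40 "lightcurves" built from 3 smooth components.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
weights = rng.uniform(size=(40, 3))
basis = np.stack([np.exp(-t), t, t ** 2])
toy_lightcurves = weights @ basis

errors = truncation_rms(toy_lightcurves, [1, 2, 3, 5])
```

On a real grid, the number of coefficients at which this error drops below the photometric uncertainty gives a floor on how many coefficients the network needs to predict.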

ThibeauWouters commented 7 months ago

I was able to retrain the model and got good inference results. I believe we can now close the issue, but I leave the final decision to the others in this thread (@bfhealy, @shreyasahasram08, @tsunhopang).

bfhealy commented 7 months ago

I agree with Thibeau's suggestion. I also retrained the model and made the benchmark plot below, which suggests generally good performance, with the high reduced chi2 values coming from a few outliers near the edge of the grid.

[Figure: benchmark_percentiles_Bu2023Ye]

nuclear-multimessenger-astronomy/nmma#301