mattjshannon / swsnet

Training neural networks with the Sloan SWS astronomical dataset.

Should the SWS group 7 (flawed spectra) data be included in the models? #6

Open mattjshannon opened 6 years ago

mattjshannon commented 6 years ago

This question is pertinent because it may be useful for the models to be able to recognize when data are 'bad'. Alternatively, excluding these spectra (177 of the ~1235 in total) amounts to an additional preprocessing step, and doing so has improved accuracy with logistic regression.

Will leave this as something to ponder, as it might be more philosophy than pragmatism.

mattjshannon commented 6 years ago

Any thoughts/suggestions, @PAHdb?

PAHdb commented 6 years ago

If there is no rhyme or reason to the 'flawed' spectra, I would expect the neural network to have difficulty grouping them. So the question is whether the network is able to group these 'flawed' spectra. If it can, we should probably keep them, as they make up about 15% of the data set and I'm sure there are 'flawed' spectra in the Spitzer set too. Maybe for now, train two networks and see how they perform ...
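
A minimal sketch of that two-model comparison, under loud assumptions: the data here are synthetic (not real SWS spectra), the helper names (`make_synthetic_spectra`, `fit_centroids`, `evaluate`, `compare`) are hypothetical, and a nearest-centroid classifier stands in for the neural network so the example runs with the standard library alone. Only the with/without group 7 split reflects the idea in this thread.

```python
import random
from statistics import mean

FLAWED_GROUP = 7  # SWS group 7 = flawed spectra, per the issue title


def make_synthetic_spectra(n_per_group=40, n_bins=8, seed=0):
    """Hypothetical stand-in data: (features, group) pairs, NOT real SWS spectra."""
    rng = random.Random(seed)
    data = []
    for group in range(1, 8):
        for _ in range(n_per_group):
            if group == FLAWED_GROUP:
                # Assumption: flawed spectra share no consistent shape, so use pure noise.
                features = [rng.uniform(0, 8) for _ in range(n_bins)]
            else:
                # Assumption: each clean group clusters around its own level.
                features = [group + rng.gauss(0, 0.5) for _ in range(n_bins)]
            data.append((features, group))
    return data


def fit_centroids(train):
    """Stand-in 'model': the mean feature vector of each group."""
    grouped = {}
    for features, group in train:
        grouped.setdefault(group, []).append(features)
    return {g: [mean(col) for col in zip(*rows)] for g, rows in grouped.items()}


def predict(centroids, features):
    """Assign the group whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda g: sum((a - b) ** 2 for a, b in zip(features, centroids[g])),
    )


def evaluate(data, test_fraction=0.25, seed=0):
    """Shuffle, hold out a test split, and return accuracy on the held-out part."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    test, train = shuffled[:n_test], shuffled[n_test:]
    centroids = fit_centroids(train)
    correct = sum(predict(centroids, x) == g for x, g in test)
    return correct / len(test)


def compare(data):
    """Train and score one model with group 7 included and one with it excluded."""
    without_flawed = [(x, g) for x, g in data if g != FLAWED_GROUP]
    return {
        "with_group_7": evaluate(data),
        "without_group_7": evaluate(without_flawed),
    }


print(compare(make_synthetic_spectra()))
```

With real data, the classifier and the loader would be swapped for the actual network and the SWS spectra. It would also be worth scoring the "with group 7" model on the clean groups alone, so the two accuracies are measured on comparable test sets.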