ZwickyTransientFacility / scope-ml

SCoPe: ZTF source classification project
https://zwickytransientfacility.github.io/scope-docs/
MIT License

Compare DNN and XGB results #151

Open · bfhealy opened 2 years ago

bfhealy commented 2 years ago

Once DNN training is complete, we should begin our inference on the same fields as analyzed by the XGB algorithm to compare the results from each. The comparison may identify areas of improvement for one or both algorithms. We can then expand inference to all fields.

bfhealy commented 2 years ago

@mkenne15 Once all 20 fields of DNN inference are available, could you please remake your multi-panel comparison plot from a few months ago?

mkenne15 commented 2 years ago

Will do. Do you want me to include the notebooks for producing them in scope? Maybe in the tools dir?

bfhealy commented 2 years ago

That sounds like a good idea, thanks!

mkenne15 commented 1 year ago

Are the DR5 predictions for the XGB model the ones we'd like to compare the current DNN results to? If so, I've recreated the plot I sent you a few months ago, but replaced the DNN results with those from the 20-field predictions (attached). I've had to do a bit of massaging, as the msms column in the XGB results is now called emsms in the DNN table.

[Figure: XGB_vs_DNN_Distributions]
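For anyone reproducing the comparison, the "massaging" amounts to a column rename so the per-class probability columns line up; here's a minimal sketch with placeholder tables (only the msms/emsms names come from this thread):

```python
import pandas as pd

# Placeholder tables standing in for the XGB (DR5) and DNN (20-field)
# prediction sets; only the msms -> emsms rename comes from this thread.
xgb = pd.DataFrame({"msms": [0.1, 0.8], "vnv": [0.9, 0.2]})
dnn = pd.DataFrame({"emsms": [0.2, 0.7], "vnv": [0.1, 0.3]})

# Align the DNN column name with the XGB one so the per-class
# probability columns match up panel by panel.
dnn = dnn.rename(columns={"emsms": "msms"})
assert set(xgb.columns) == set(dnn.columns)
```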

bfhealy commented 1 year ago

Thanks Mark, this is a very helpful visualization! I think the existing DR5 XGB predictions are the ones we should compare to the new DNN results.

AshishMahabal commented 1 year ago

Thanks, Mark. Can you clarify which objects are shown in the plots? For instance, in the vnv plot the XGB column is close to 1 while the DNN column is close to 0. Also, the binwidths for XGB and DNN seem to be different.

mkenne15 commented 1 year ago

Hey Ashish. Each panel plots the histogram of probabilities that the XGB or DNN classifier assigned to all of the objects it has seen, for each of the different classifications. The binwidths are indeed different; at the moment I'm just letting numpy automatically determine appropriate binwidths for each class.
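Concretely, "letting numpy determine the binwidths" looks like the following (a minimal sketch with a placeholder array):

```python
import numpy as np

# Placeholder array of per-class probabilities in [0, 1].
probs = np.random.default_rng(0).random(10_000)

# bins="auto" chooses between the Sturges and Freedman-Diaconis
# estimators, so bin edges (and widths) vary from class to class.
counts, edges = np.histogram(probs, bins="auto")
print(len(counts), "bins of width ~", edges[1] - edges[0])
```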

Also, as you highlighted, there are some peculiar behaviours occurring here: vnv is very different between the XGB and DNN classifiers (the XGB classifies just about everything as VNV, while the DNN classifies almost nothing as such). The DNN distribution makes sense to me: these results come from running the DNN on 20 fields, and we'd expect the majority of objects to be non-variable, assuming we classified absolutely every object in those 20 fields rather than pre-selecting variable objects (Brian, maybe you could clarify?). I need to think a bit more about the XGB distribution, but other panels suggest the XGB is struggling anyway (consider YSOs, where it assigns every object a minimum 33% chance of being a YSO!).

AshishMahabal commented 1 year ago

Density plots with variable binwidths can be an issue. Can you produce pure count histograms for one or two classes (vnv and yso would be great, since you named those)? That is, a numeric y-axis rather than a density. In fact, it could be worth plotting XGB and DNN in two separate panels so that a side-by-side comparison can be done. Something along the lines of the sketch below.
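A minimal sketch of that kind of plot (the probability arrays are placeholders):

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder probability arrays for one class (e.g. vnv).
xgb_p = np.random.default_rng(1).random(5_000)
dnn_p = np.random.default_rng(2).random(5_000)

# Shared, fixed bin edges and raw counts (the hist default) so the two
# classifiers can be compared directly, one panel each.
bins = np.linspace(0.0, 1.0, 21)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4), sharex=True)
ax1.hist(xgb_p, bins=bins)
ax1.set(title="XGB", xlabel="P(vnv)", ylabel="count")
ax2.hist(dnn_p, bins=bins)
ax2.set(title="DNN", xlabel="P(vnv)")
plt.tight_layout()
plt.show()
```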

mkenne15 commented 1 year ago

Hey Ashish. Do you mean something like this? (I haven't done side-by-side panels, but rather set up separate y-axes to allow for easier comparison.)

[Figure: VNV_probabilities]

mkenne15 commented 1 year ago

As a quick follow-up, I've recreated the plots but filtered the DNN results to remove any object with P(vnv) < 0.9, so that when an object is classified as variable, we can better see what probabilities it receives for the other classes. (This probably makes more sense to look at than the previous large panel plot, since the spikes at 0 for each class in the DNN results are due to non-variable sources.)

[Figure: XGB_vs_DNN_Distributions_vnv]
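The cut itself is just a boolean mask on the prediction table; a minimal sketch (the DataFrame and column names are assumptions, the 0.9 threshold is from above):

```python
import pandas as pd

# Placeholder DNN prediction table; the 0.9 threshold comes from the
# comment above, the column names are assumptions.
dnn = pd.DataFrame({"vnv": [0.95, 0.10, 0.92], "yso": [0.05, 0.40, 0.30]})

# Keep only objects the DNN confidently calls variable, then plot the
# remaining objects' probabilities for the other classes.
variable = dnn[dnn["vnv"] >= 0.9]
```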

bfhealy commented 1 year ago

Thanks Mark - I can confirm that the DNN 20-field classifications are performed on all sources in each field, not just those previously identified as variable. Thus it could make sense for many of the results to be labeled as non-variable, especially if the training set for the vnv classifier has any bias toward the brighter sources in a field:

While the mean Gaia G mag of variable sources in the current training set is ~16.9, my examination of a few fields finds a mean G mag ~18.3. The fainter variable sources in these fields may not be picked up by the vnv classifier due to added noise in their light curves.
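That check is a simple comparison of mean magnitudes; a minimal sketch with placeholder tables (the Gaia G column name is an assumption, and the quoted means come from this comment):

```python
import pandas as pd

# Placeholder tables standing in for the training set and one ZTF field;
# the Gaia G magnitude column name is an assumption.
training_set = pd.DataFrame({"Gaia_G": [16.2, 16.9, 17.5]})
field_sources = pd.DataFrame({"Gaia_G": [17.8, 18.3, 18.9]})

# The comment quotes means of ~16.9 (training) vs ~18.3 (field): the vnv
# classifier has seen few of the fainter, noisier light curves it must label.
print(f"training {training_set['Gaia_G'].mean():.1f} vs "
      f"field {field_sources['Gaia_G'].mean():.1f}")
```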

Relatedly, a challenge with AGN/YSO is their tendency to display stochastic variability that can be confused with noise for faint sources. It is possible that many faint variables that are not intrinsically AGN/YSO are being classified as such because the training doesn't establish a clear enough distinction between faint variables and true AGN/YSO.