aimalz / proclam

PRObabilistic CLAssification Metrics for PLAsTiCC

Comments from Saurabh #70

Closed reneehlozek closed 6 years ago

reneehlozek commented 6 years ago

These comments from Saurabh were received on September 12 on the metrics note.

I noticed the internal capitalization of PLASTICC was not consistent across the metrics paper versus the release note (and internally inconsistent in the release note). I suggest you decide on one method and use the same LaTeX macro in both papers.

@reneehlozek:

I've tried to make these changes, thanks!

Here are some brief comments on the metrics paper:

I found the abstract a little off: it dives into the idea of "probabilistic classifications" too quickly from generic "open challenges" and "astronomical data analysis". Also, claiming the combination that you end up with is "optimal" seems at odds with the idea in the abstract that you are "balancing a variety of goals" and the stuff in the text that emphasizes you can only optimize for a specific science goal and use case. Using "reassuring" may be a bit too much editorializing, but I was confused because it was not described whether the two metrics were similar (and so expected to give the same answer) or very different (in which case agreement would be reassuring). "log-loss" shows up with no definition and is not a familiar word/term for most astronomers.

Here's a suggested rewrite (that you will need to tweak further; this is just a suggestion)

Classification of astronomical sources is a key tool to develop physical understanding. Open data challenges can effectively promote the development of novel techniques in automated classification. Traditional metrics to judge classification performance, designed for deterministic classifications, are incompatible with the probabilistic classifications appropriate to upcoming astronomical surveys like the Large Synoptic Survey Telescope (LSST). Furthermore, science collaborations may use the products of these challenges for diverse science objectives, indicating a need for a classification metric that balances a variety of goals. Here we describe the development of a useful performance metric addressing both of these issues in the context of the Photometric \textsc{LSST} Astronomical Time-series Classification Challenge (\plasticc). \plasticc is an open competition aiming to identify promising methods for probabilistic classification of transient and variable objects in simulated LSST data by engaging a broader community both within and outside astronomy. Using mock classification probability submissions spanning characteristic performance archetypes anticipated in \plasticc, we compare the sensitivity of two classes of metrics, the log-loss (cross-entropy) and Brier score, to classifier systematics and find qualitatively consistent results. For \plasticc we choose a weighted modification of the log-loss metric that allows for more meaningful interpretation and superior sensitivity to the scientifically most-concerning potential failures of classification. We propose extensions of our procedure for more complex challenge goals and suggest some guiding principles for approaching the choice of a metric of probabilistic classifications.
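For readers unfamiliar with the two metric classes the abstract names, here is a minimal sketch (not the proclam implementation, and the function names are illustrative) of how the log-loss (cross-entropy) and Brier score are computed for a mock probabilistic classification submission:

```python
# Illustrative sketch only: the log-loss and Brier score for a mock
# submission of classification probabilities. Function names and the
# tiny example data are hypothetical, not from the proclam codebase.
import numpy as np

def log_loss(probs, truth, eps=1e-15):
    """Mean negative log of the probability assigned to the true class."""
    p = np.clip(probs[np.arange(len(truth)), truth], eps, 1.0)
    return -np.mean(np.log(p))

def brier_score(probs, truth):
    """Mean squared distance between probabilities and one-hot truth."""
    onehot = np.eye(probs.shape[1])[truth]
    return np.mean(np.sum((probs - onehot) ** 2, axis=1))

# Mock submission: 3 objects, 2 classes; rows sum to 1.
probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.5, 0.5]])
truth = np.array([0, 1, 0])

print(log_loss(probs, truth))     # penalizes low probability on the true class
print(brier_score(probs, truth))  # quadratic penalty on the whole probability vector
```

The weighted log-loss ultimately chosen for \plasticc would replace the plain mean over objects with a per-class weighted average, so that scientifically critical classes contribute more to the score.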

@aimalz and I both took a stab at rewriting, so please check again!

Other minor comments (I don't have the expertise to really provide much guidance for the text):

"The Large Synoptic Survey Telescope (LSST) has the potential to advance time-domain astronomy, with an- ticipated impacts on the study of transient and variable (T&V) objects within and beyond the Milky Way." How about some more optimism? "The Large Synoptic Survey Telescope (LSST) will revolutionize time-domain astronomy and the study of transient and variable (T&V) objects within and beyond the Milky Way."

@reneehlozek: > Done

"Notions of accuracy, purity, completeness, and others endemic to science" (not sure what is "endemic to science" here).. could add things like "false positive rate"? There are other terms used in other disciplines, right? Like specificity and sensitivity? @reneehlozek: > Done

Authorship -- it is totally fine with me if the author list is as is. I would then suggest just an acknowledgment to the "rest of the PLASTICC" team.

However, I think Rick in his comments was expressing surprise that he would not be a coauthor (i.e., he is listed in the acknowledgments), and I had imagined that all of the PLASTICC folks and also general DESC folks who contributed would be co-authors of this paper.

If you decide to go that route with an expanded author list (and that's your decision, either way is fine!), please include

Mi Dai
Saurabh W. Jha

both with affiliation Rutgers, the State University of New Jersey, 136 Frelinghuysen Road, Piscataway, NJ 08854 USA

@reneehlozek: > I've added the few PLAsTiCC team members to this and added "PLAsTiCC team member" to their affiliations.

and then please also add an acknowledgment for us: This research at Rutgers University is supported by US Department of Energy award DE-SC0011636.

@reneehlozek: > Done