NutchaW / covid19_forecast_similarity

A repository for forecast similarity analysis

comments to address in preprint #20

Open nickreich opened 8 months ago

nickreich commented 8 months ago

Comments from Johannes

nickreich commented 8 months ago

Addressing the comment about "Permutations were restricted to be on the model types only and the permuted model types were kept constant across locations and target end dates."

My sense is that another way of saying this is: to create one permuted dataset, we did NOT simply permute the entire column of model types across all locations, dates, etc. Rather, we took a table of models (one row per model, with columns for model name and model type) and permuted the model type column. This was then joined back into the large dataset so that all rows pertaining to a given model were assigned that model's permuted model-type value. I would like some validation of this interpretation before trying to clarify it in the main text.
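A minimal sketch of that interpretation in pandas, using made-up example data (the column names `model`, `model_type`, and `location` are assumptions for illustration, not the repository's actual schema):

```python
import numpy as np
import pandas as pd

# Hypothetical long-format data: one row per (model, location) combination.
forecasts = pd.DataFrame({
    "model": ["A", "A", "B", "B", "C", "C"],
    "model_type": ["stat", "stat", "mech", "mech", "ens", "ens"],
    "location": ["US", "MA", "US", "MA", "US", "MA"],
})

rng = np.random.default_rng(42)

# Step 1: collapse to one row per model with its model type.
model_table = forecasts[["model", "model_type"]].drop_duplicates()

# Step 2: permute the model_type column at the model level only.
model_table = model_table.assign(
    permuted_type=rng.permutation(model_table["model_type"].to_numpy())
)

# Step 3: join back so every row for a given model receives that model's
# permuted type, held constant across locations and target end dates.
permuted = forecasts.merge(model_table[["model", "permuted_type"]], on="model")
```

Under this reading, the permutation reassigns whole models to types, rather than shuffling type labels independently across rows.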

nickreich commented 8 months ago

Making a note regarding the question above about the notation on p5. I worked through this in detail (see attached whiteboard shot) and think it all checks out, although I didn't cross-check the approximation sum against what is coded up. Yes, it is complicated (perhaps overly so), but it does seem correct. Regarding the "counting q's twice" concern: I didn't see that as a problem in my working, as I think the {q^G - q^F} term takes care of it. Johannes' suggestion about the max would also work if you defined \tau_0 = 0 and looked at values of j between 0 and 2*K-1. That might be simpler, but I don't see a huge value in changing it: it wouldn't reduce the overall complexity of the set-up much, and I'm not convinced the current version is wrong.

[Attached image: PXL_20231221_212511620 (whiteboard working for the p5 notation)]