Include point estimate (mean) of bootstrap

flipz357 / smatchpp

A package for handy processing of semantic graphs such as AMR, with a special focus on standardized evaluation

GNU General Public License v3.0


Include point estimate (mean) of bootstrap #6

Closed BramVanroy closed 11 months ago

BramVanroy commented 1 year ago

If I understand the output correctly, when we are bootstrapping we get results like this:

        "result": 81.3,
        "ci": [
            80.67,
            81.89
        ]

I think that the result is calculated independently, on the full corpus, and ci is the 95% CI min/max. It would be useful to also include the estimated mean based on the bootstrap. As far as I can tell, this is common in research papers too, where you report "85 +- 1.2" where 85 is the estimated mean and 1.2 the CI range with 95% confidence.

flipz357 commented 1 year ago

Interesting, do you have a paper where the "85" really is the estimated mean and not just the sample statistic? I guess this discussion may be related, where the highest-voted answer says

... The bootstrapped mean value is not a better estimator for your population parameter....

(than the sample statistic "result")

Technically, I think the bootstrap mean could simply be calculated from the bootstrap_distribution attribute of the result object returned by scipy.stats.bootstrap, so it should be easy to implement.

Maybe it can look like that:

        "result": 81.3,
        "ci": [
            80.67,
            81.89
        ],
        "mean": x

but seeing this I think that it would maybe add more complexity in understanding/reporting the results than it may help?
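To make the idea concrete, here is a minimal sketch of where a bootstrap "mean" would come from next to "result" and "ci". It is not smatchpp's implementation and it does not call scipy: it uses a plain percentile bootstrap from the standard library, a corpus-level F1 as a stand-in statistic, and made-up per-sentence counts. All function names and the "stdev" key are hypothetical.

```python
import random
import statistics


def micro_f1(counts):
    """Corpus-level F1 from per-sentence (matched, predicted, gold) counts."""
    matched = sum(m for m, _, _ in counts)
    predicted = sum(p for _, p, _ in counts)
    gold = sum(g for _, _, g in counts)
    if matched == 0:
        return 0.0
    precision = matched / predicted
    recall = matched / gold
    return 2 * precision * recall / (precision + recall)


def percentile_bootstrap(counts, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap over sentences (an assumption, not scipy's default BCa)."""
    rng = random.Random(seed)
    n = len(counts)
    # Resample sentences with replacement and recompute the corpus-level score.
    dist = sorted(
        micro_f1([counts[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    alpha = (1.0 - confidence) / 2.0
    low = dist[int(alpha * (n_resamples - 1))]
    high = dist[int((1.0 - alpha) * (n_resamples - 1))]
    return {
        "result": micro_f1(counts),       # sample statistic on the full corpus
        "ci": [low, high],                # percentile confidence interval
        "mean": statistics.fmean(dist),   # mean of the bootstrap distribution
        "stdev": statistics.stdev(dist),  # spread of the bootstrap distribution
    }


# Made-up counts for 200 sentences, for illustration only.
data_rng = random.Random(1)
toy_counts = []
for _ in range(200):
    gold = data_rng.randint(5, 30)
    predicted = data_rng.randint(5, 30)
    matched = data_rng.randint(0, min(gold, predicted))
    toy_counts.append((matched, predicted, gold))

report = percentile_bootstrap(toy_counts)
print(report)
```

Because the score is a ratio over the whole corpus, "mean" and "result" will generally differ a little, which is exactly the source of the possible confusion.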

BramVanroy commented 1 year ago

I agree that it might be confusing, and to be honest I am not 100% sure my request makes sense from a statistical perspective. I was creating the following graph.

[Figure: AMR bar plot with confidence intervals]

These were created with the output of smatchpp, so "result" at the top and the ci within brackets. In terms of notation, it would be easier to be able to write mean +- 2*stdev. But that is not exactly the case. As an example:

for the rightmost plot, the result is 73.4. But the midpoint between 72.7 and 74.0 would be 73.35 (73.35+-0.65)

While these are very close, the result is not exactly in the middle, because the CI is calculated through bootstrapping and the result is just the single calculation for the whole population, if I understand correctly. So in terms of notation we can't rightly report it as 73.4+-0.65 (because this 73.4 is not the mean of the bootstrap but the independent calculation on the full population).
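The arithmetic behind that example can be checked with a short sketch. The helper names are made up for illustration; the numbers are the ones quoted above for the rightmost plot.

```python
def as_plus_minus(low, high):
    """Midpoint and half-width of an interval, i.e. what 'x +- y' would encode."""
    midpoint = (low + high) / 2
    half_width = (high - low) / 2
    return midpoint, half_width


def asymmetric_offsets(result, low, high):
    """Distance from the reported result down to the lower and up to the upper bound."""
    return result - low, high - result


midpoint, half_width = as_plus_minus(72.7, 74.0)
below, above = asymmetric_offsets(73.4, 72.7, 74.0)

print(f"interval as midpoint +- half-width: {midpoint:.2f} +- {half_width:.2f}")
print(f"result with asymmetric offsets: 73.4 -{below:.1f} / +{above:.1f}")
```

The two offsets around the result are unequal, so a single +- value cannot describe the interval around 73.4 without shifting either the center or one of the bounds.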

I am not sure whether it is clear what I am trying to say, sorry!

flipz357 commented 1 year ago

I think it is clear what you want to say :-) You want to use the +- notation, and then there are two problems:

- bootstrap mean != sample statistic, which would make +- weird
- confidence intervals can also be asymmetric, which would also make +- weird

While I think the standard deviation can also be easily obtained with scipy, the confidence interval is generally understood to be more informative. So maybe it can help to slightly change the notation?

What I found nice is the notation that my colleague used in a recent paper (I have seen others use it too). It looks like this:

The tiny number on the left is the lower bound of the confidence interval, the number in the middle is the basic sample statistic, and the tiny number on the right is the upper bound of the confidence interval.
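One possible way to produce such a notation for a LaTeX table is sketched below. The exact typesetting used in the paper is not shown here, so rendering the two bounds as subscripts is an assumption, and the function name is hypothetical.

```python
def format_with_ci(result, low, high, digits=1):
    """Return a LaTeX math string: small lower bound, result, small upper bound.

    Subscripts are used for the small numbers; this is an assumed rendering,
    not necessarily the one from the paper mentioned above.
    """
    return f"${{}}_{{{low:.{digits}f}}}{result:.{digits}f}_{{{high:.{digits}f}}}$"


# Values taken from the example output at the top of this thread.
cell = format_with_ci(81.3, 80.67, 81.89, digits=2)
print(cell)
```

This keeps the sample statistic as the headline number and shows both bounds as they are, so an asymmetric interval needs no special handling.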