mikeizbicki / cmc-csci040

Computing for the Web
37 stars 58 forks source link

Possible error in notes #318

Open buffeinstein opened 2 weeks ago

buffeinstein commented 2 weeks ago

In Chapter 1, the learning problem 2, there is a question: If you have a test set with 1000 samples, what bound does the Hoeffding inequality give on the probability that $|E{out}(g)) - E{test}(g)| < 0.01$?

Below are two solutions. I think the first one is right, but the notes indicate that the second one is right. I noticed that if you paste "2 \exp (-2(0.01)^2 (1000)" directly into wolframalpha, you get the second solution.

I believe the first one is correct because we're working with probability curves, so there should be no negative numbers. Would love some insight as to why we should get a negative answer!


Hoeffding: $$\mathbb{P} (|\nu - \mu| \ge \epsilon) \leq 2 \exp (-2\epsilon^2 N)$$

Using this, we get the probability that it is greater than or equal to than $\epsilon$ is equal to $$2 \exp (-2\epsilon^2 N)$$ $$2 \exp (-2(0.01)^2 (1000)$$ $$= 0.8705506.$$

We want to find the probability of the opposite, ie the probability it is less than than $\epsilon$. Thus we do $$1 -0.8705506 = 0.129$$.


Hoeffding: $$\mathbb{P} (|\nu - \mu| \ge \epsilon) \leq 2 \exp (-2\epsilon^2 N)$$

Using this, we get the probability that it is greater than or equal to than $\epsilon$ is equal to $$2 \exp (-2\epsilon^2 N)$$ $$2 \exp (-2(0.01)^2 (1000)$$ $$= 1.63746. $$

We want to find the probability of the opposite, ie the probability it is less than than $\epsilon$. Thus we do $$1 -1.63746 = -0.637$$.

mikeizbicki commented 2 weeks ago

I believe the notes are correct. The answer to the second part has a short explanation about the negative bound that might be helpful.

One of the main misconceptions I'm seeing in your question, however, is that you talk about the probability being "equal to [complex expression]" at several points. But Hoeffding never tells us the exact probability, it only tells us bounds on the probability. And you're correct that a bound that shows the probability is greater than a negative number is not very helpful.