Pin-Jiun / Machine-Learning-MIT


CH3-Perceptron #3

Open Pin-Jiun opened 1 year ago

Pin-Jiun commented 1 year ago

we can convert any problem involving a linear separator with offset into one with no offset (but of higher dimension)!

To see this, append a constant feature equal to 1 to every data point. If the original separator is θ·x + θ₀ = 0 in d dimensions, define

x_new = [x, 1] ∈ R^(d+1),  θ_new = [θ, θ₀]

so that θ_new · x_new = θ·x + θ₀, i.e. a separator through the origin in d+1 dimensions.
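The trick above can be sketched in code; `augment` is an illustrative helper name, not part of any assignment API:

```python
import numpy as np

# Append a constant-1 feature so the offset theta_0 becomes an ordinary
# weight of a separator through the origin (one dimension higher).
def augment(X):
    """X has shape (d, n): one column per data point."""
    return np.vstack([X, np.ones((1, X.shape[1]))])

X = np.array([[1.0, -2.0],
              [3.0,  0.5]])              # d = 2, n = 2
theta, theta0 = np.array([2.0, -1.0]), 0.5

X_aug = augment(X)                       # shape (3, 2)
theta_aug = np.append(theta, theta0)     # shape (3,)

# The augmented dot products reproduce theta.x + theta_0 for every point.
print(np.allclose(theta_aug @ X_aug, theta @ X + theta0))  # True
```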

Convergence theorem

If the training data Dn is linearly separable, then the perceptron algorithm is guaranteed to find a linear separator.

we’ll define the margin of a labeled point (x, y) with respect to the hyperplane (θ, θ₀) as

γ(x, y) = y · (θ·x + θ₀) / ||θ||

This quantity will be positive if and only if the point x is classified as y by the linear classifier represented by this hyperplane

the margin of a dataset Dn is the minimum over its points:

γ(Dn) = min_i γ(x^(i), y^(i))

The margin is positive if and only if all of the points in the data-set are classified correctly. In that case (only!) it represents the distance from the hyperplane to the closest point.
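The definitions above translate directly into a short sketch (helper names are my own):

```python
import numpy as np

# Margin of a labeled point (x, y) with respect to a hyperplane (theta, theta0).
def point_margin(x, y, theta, theta0):
    return y * (theta @ x + theta0) / np.linalg.norm(theta)

# Margin of a dataset: the minimum point margin. Positive iff every point
# is classified correctly.
def dataset_margin(X, ys, theta, theta0):
    """X: (d, n), one column per point; ys: (n,) labels in {-1, +1}."""
    return min(point_margin(X[:, i], ys[i], theta, theta0)
               for i in range(X.shape[1]))

# Toy example: separator x2 = 0, i.e. theta = (0, 1), theta0 = 0.
X = np.array([[1.0, -1.0],
              [2.0, -0.5]])
ys = np.array([1, -1])
print(dataset_margin(X, ys, np.array([0.0, 1.0]), 0.0))  # min(2.0, 0.5) = 0.5
```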

(image)

(image)

The margin γ is the distance from the data point closest to the line; the margin of the data set is the minimum of γ over all points. R is the largest distance of any data point from the origin.

Proof.

We initialize θ^(0) = 0, and let θ^(k) denote our hyperplane after the perceptron algorithm has made k mistakes. We are going to think about the angle between the hypothesis we have now, θ^(k), and the assumed good separator θ*. Since both go through the origin, if we can show that the angle between them decreases usefully on every mistake, then we will get close to that separator.

let’s think about the cosine of the angle between them; recall, by the definition of the dot product:

cos(θ^(k), θ*) = (θ^(k) · θ*) / (||θ^(k)|| · ||θ*||) = (θ^(k)·θ* / ||θ*||) · (1 / ||θ^(k)||)

Start by focusing on the first factor. If mistake k was made on example (x_i, y_i), then θ^(k) = θ^(k-1) + y_i x_i, so

θ^(k)·θ* / ||θ*|| = θ^(k-1)·θ* / ||θ*|| + y_i (x_i·θ*) / ||θ*|| ≥ θ^(k-1)·θ* / ||θ*|| + γ

using that θ* separates the data with margin γ, i.e. y_i (θ*·x_i) / ||θ*|| ≥ γ for every point.

θ^(k-1) can be unrolled the same way into θ^(k-2) plus another γ, and so on down to θ^(0) = 0, whose dot product with θ* is 0. Hence

θ^(k)·θ* / ||θ*|| ≥ kγ

Now, we’ll look at the second factor

The key point is that (x_i, y_i) was classified incorrectly, so y_i (θ^(k-1)·x_i) ≤ 0. Then

||θ^(k)||² = ||θ^(k-1) + y_i x_i||² = ||θ^(k-1)||² + 2 y_i (θ^(k-1)·x_i) + ||x_i||² ≤ ||θ^(k-1)||² + R²

Unrolling as before, ||θ^(k)||² ≤ k R², so ||θ^(k)|| ≤ √k · R.

Since the value of the cosine is at most 1, we have

1 ≥ cos(θ^(k), θ*) ≥ (kγ) / (√k · R) = √k · γ / R

and therefore k ≤ (R / γ)².

The margin γ is the distance from the data point closest to the line; R is the largest distance of any data point from the origin.

Pin-Jiun commented 1 year ago

here are some NumPy functions you should be familiar with for this assignment:

- `np.argmax`
- `np.array_split` (look at the `axis` argument)
- `np.concatenate` (look at the `axis` argument)
- `np.zeros`
- `numpy.ndarray.shape`
- NumPy logic functions, e.g. `np.array([[1, 2], [3, 4]]) == 3`
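A quick tour of those functions on a small array (outputs noted in comments):

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])

print(np.argmax(a))                          # index into the flattened array: 3
print(np.array_split(a, 2, axis=0))          # two (1, 2) row slices
print(np.concatenate([a, a], axis=1).shape)  # (2, 4): stacked side by side
print(np.zeros((2, 3)))                      # 2x3 array of 0.0
print(a.shape)                               # (2, 2)
print(a == 3)                                # elementwise boolean mask
```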

Pin-Jiun commented 1 year ago

(image)

Although γ itself has not changed, this affects the speed of the updates, because each mistake can only move θ a little at a time.

Pin-Jiun commented 1 year ago

(images)

γ is the distance from the data point closest to the line; R is the largest distance of any data point from the origin.

Pin-Jiun commented 1 year ago

Consider a linearly separable dataset with two features:

    import numpy as np

    data = np.array([[200, 800, 200, 800],
                     [0.2, 0.2, 0.8, 0.8]])
    labels = np.array([[-1, -1, 1, 1]])

(images: the dataset and the given separator)

γ is the distance from the data point closest to the line; the margin of the data set is the minimum of γ over all points. R is the largest distance of any data point from the origin.

What is the margin γ of this data set with respect to that separator (up to 3 decimal places)? (image)

What is the theoretical bound on the number of mistakes perceptron will make on this problem? (image)

How many mistakes does perceptron through origin have to make in order to find a perfect separator on the data provided above, in the order given? (Try it on your computer, not by hand!) Solution: 666696

(images)

How would the performance of the perceptron (as predicted by the mistake bound) change? (image)

If we multiplied just the first original feature (first row of the data) by .001, and used our original separator, what would the new margin be? (image)

What would the mistake bound be in this case? (image)

Run the perceptron algorithm on this data; how many mistakes does it make? Solution: 7. Explanation: The speedup in running time is very significant and indicates the benefits of choosing features that help you in finding a classifier for your data.
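A minimal sketch of the experiment (my own implementation, not the course code): perceptron through the origin on the data augmented with a constant-1 feature, counting mistakes until a full error-free pass. Exact counts depend on implementation details such as iteration order and the convention that y·(θ·x) ≤ 0 counts as a mistake; the solutions above report 666696 and 7.

```python
import numpy as np

def perceptron_mistakes(X, ys, max_mistakes=1_000_000):
    """Perceptron through the origin on X augmented with a constant-1 row.

    Cycles through the points in order; a mistake is y * (theta . x) <= 0,
    and each mistake triggers the update theta += y * x."""
    X_aug = np.vstack([X, np.ones((1, X.shape[1]))])
    theta = np.zeros(X_aug.shape[0])
    mistakes = 0
    while True:
        clean = True
        for i in range(X_aug.shape[1]):
            if ys[i] * (theta @ X_aug[:, i]) <= 0:
                theta += ys[i] * X_aug[:, i]
                mistakes += 1
                clean = False
                if mistakes > max_mistakes:
                    raise RuntimeError("did not converge")
        if clean:
            return mistakes

data = np.array([[200, 800, 200, 800],
                 [0.2, 0.2, 0.8, 0.8]])
labels = np.array([-1, -1, 1, 1])

m_orig = perceptron_mistakes(data, labels)
scaled = data * np.array([[0.001], [1.0]])   # first feature * 0.001
m_scaled = perceptron_mistakes(scaled, labels)
print(m_orig, m_scaled)                       # far fewer mistakes after scaling
```

The scaled run converges dramatically faster, which is the point of the exercise: rescaling the features balances R against γ and shrinks the (R/γ)² mistake bound.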