JinwoongKim / cs231n


assignment1-1 #1

Open JinwoongKim opened 6 years ago

JinwoongKim commented 6 years ago

Setting

After opening knn.ipynb in Jupyter Notebook, I ran into the following error:

[screenshot: matplotlib backend error]

I solved this problem by adding backend: TkAgg to ~/.matplotlib/matplotlibrc with the command below, and restarting Jupyter Notebook.

echo "backend: TkAgg" >> ~/.matplotlib/matplotlibrc
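To confirm the new backend is actually picked up (a quick check of my own; get_backend is standard matplotlib), you can run:

import matplotlib
print(matplotlib.get_backend())  # prints the active backend; 'TkAgg' after the fix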

JinwoongKim commented 6 years ago

def compute_distances_two_loops(self, X):

Following the Euclidean distance formula,

    for i in xrange(num_test):
      for j in xrange(num_train):
        #####################################################################
        # TODO:                                                             #
        # Compute the l2 distance between the ith test point and the jth    #
        # training point, and store the result in dists[i, j]. You should   #
        # not use a loop over dimension.                                    #
        #####################################################################
        dists[i,j] = np.sqrt(np.sum((X[i]-self.X_train[j])**2))
        #####################################################################
        #                       END OF YOUR CODE                            #
        #####################################################################
    return dists
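As a quick sanity check of the formula (toy arrays of my own, not part of the assignment), it matches np.linalg.norm:

import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 6.0, 3.0])

# square the element-wise differences, sum them, then take the square root
d = np.sqrt(np.sum((a - b)**2))           # 5.0
print(np.isclose(d, np.linalg.norm(a - b)))  # True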

Q. What in the data is the cause behind the distinctly bright rows?

Q. What causes the columns?

JinwoongKim commented 6 years ago
  def predict_labels(self, dists, k=1):
    """
    Given a matrix of distances between test points and training points,
    predict a label for each test point.

    Inputs:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      gives the distance between the ith test point and the jth training point.

    Returns:
    - y: A numpy array of shape (num_test,) containing predicted labels for the
      test data, where y[i] is the predicted label for the test point X[i].
    """
    num_test = dists.shape[0]
    y_pred = np.zeros(num_test)
    for i in xrange(num_test):
      # A list of length k storing the labels of the k nearest neighbors to
      # the ith test point.
      closest_y = []
      #########################################################################
      # TODO:                                                                 #
      # Use the distance matrix to find the k nearest neighbors of the ith    #
      # testing point, and use self.y_train to find the labels of these       #
      # neighbors. Store these labels in closest_y.                           #
      # Hint: Look up the function numpy.argsort.                             #
      closest_y.append(self.y_train[np.argmin(np.argsort(dists[i]))])
      #########################################################################
      #########################################################################
      # TODO:                                                                 #
      # Now that you have found the labels of the k nearest neighbors, you    #
      # need to find the most common label in the list closest_y of labels.   #
      # Store this label in y_pred[i]. Break ties by choosing the smaller     #
      # label.                                                                #
      #########################################################################
      y_pred[i] = closest_y[0]
      #########################################################################
      #                           END OF YOUR CODE                            #
      #########################################################################

    return y_pred

Got 38 / 500 correct => accuracy: 0.076000

Since something was clearly wrong, I checked my code. I had used argsort because of the hint, but it seemed that my approach didn't need it, so I removed it.
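For the record, here is why the first attempt was wrong (toy distances of my own): np.argsort(dists[i]) already gives the neighbor indices sorted by distance, so taking argmin of that array returns the position where training index 0 ranks, which is meaningless.

import numpy as np

d = np.array([3.0, 1.0, 2.0])    # toy distances
print(np.argsort(d))             # [1 2 0]: neighbor indices, nearest first
print(np.argmin(np.argsort(d)))  # 2: position of index 0 in that ranking, not a neighbor
print(np.argmin(d))              # 1: the actual nearest neighbor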

JinwoongKim commented 6 years ago
  def predict_labels(self, dists, k=1):
    """
    Given a matrix of distances between test points and training points,
    predict a label for each test point.

    Inputs:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      gives the distance between the ith test point and the jth training point.

    Returns:
    - y: A numpy array of shape (num_test,) containing predicted labels for the
      test data, where y[i] is the predicted label for the test point X[i].
    """
    num_test = dists.shape[0]
    y_pred = np.zeros(num_test)
    for i in xrange(num_test):
      # A list of length k storing the labels of the k nearest neighbors to
      # the ith test point.
      closest_y = []
      #########################################################################
      # TODO:                                                                 #
      # Use the distance matrix to find the k nearest neighbors of the ith    #
      # testing point, and use self.y_train to find the labels of these       #
      # neighbors. Store these labels in closest_y.                           #
      # Hint: Look up the function numpy.argsort.                             #
      closest_y = [self.y_train[np.argmin(dists[i])]]
      #########################################################################
      #########################################################################
      # TODO:                                                                 #
      # Now that you have found the labels of the k nearest neighbors, you    #
      # need to find the most common label in the list closest_y of labels.   #
      # Store this label in y_pred[i]. Break ties by choosing the smaller     #
      # label.                                                                #
      #########################################################################
      y_pred[i] = np.array(closest_y)

      #########################################################################
      #                           END OF YOUR CODE                            #
      #########################################################################

    return y_pred

Got 137 / 500 correct => accuracy: 0.274000

Now the accuracy comes out as expected.

Since dists[i] holds the distances between the ith test point and all the training points, e.g. [ 3803.92350081 4210.59603857 5504.0544147 ..., 4007.64756434 4203.28086142 4354.20256764], we have to choose the minimum among these values. To this end, I used argmin.

JinwoongKim commented 6 years ago

The next step requires me to predict labels with 5-NN, but my approach can't do that. So I googled "how to pick 5 minimum with argmin in numpy", and I found this,

[screenshot: Stack Overflow answer suggesting np.argsort]

Then I realized why they suggest using 'argsort', LOL.

Instead of argmin,

closest_y = [self.y_train[np.argmin(dists[i])]]

I now use argsort:

closest_y = [self.y_train[np.argsort(dists[i])[:k]]]
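A toy example of why this works (my own illustration): argsort returns all indices sorted by distance, and slicing off the first k keeps the k nearest:

import numpy as np

d = np.array([5.0, 1.0, 4.0, 2.0, 3.0])  # toy distances
print(np.argsort(d))      # [1 3 4 2 0]: indices from nearest to farthest
print(np.argsort(d)[:3])  # [1 3 4]: the 3 nearest neighbors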

JinwoongKim commented 6 years ago

To take the majority vote among the k nearest labels, I googled "most frequent number in numpy array" and found https://stackoverflow.com/questions/6252280/find-the-most-frequent-number-in-a-numpy-vector

So, I updated my code from

y_pred[i] = np.array(closest_y)

to

y_pred[i] = np.argmax(np.bincount(closest_y))

But I faced this error:

ValueError: object too deep for desired array

JinwoongKim commented 6 years ago

I removed the outer brackets, turning

closest_y = [self.y_train[np.argsort(dists[i])[:k]]]

into

closest_y = self.y_train[np.argsort(dists[i])[:k]]

and it works. With the extra brackets, closest_y was a list containing a length-k array, so np.bincount received a 2-D input, hence the "object too deep" error.

Accuracy increased slightly:

Got 139 / 500 correct => accuracy: 0.278000
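For reference, here is how the bincount/argmax vote behaves on toy labels (my own example); since np.argmax returns the first maximum, ties already break toward the smaller label, exactly as the TODO asks:

import numpy as np

votes = np.array([2, 5, 2, 5, 1])    # toy labels of the k nearest neighbors
print(np.bincount(votes))            # [0 1 2 0 0 2]: count per label
print(np.argmax(np.bincount(votes))) # 2: most common label, tie broken to the smaller one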

JinwoongKim commented 6 years ago

compute_distances_one_loop

First try:

dists[i,:] = np.sqrt(np.sum((X[i]-self.X_train)**2))

Difference was: 7906696.077041
Uh-oh! The distance matrices are different

Second try:

dists[i,:] = np.sqrt(np.sum((X[i]-self.X_train[:,:])**2))

Difference was: 551335735.545133
Uh-oh! The distance matrices are different

Then I started to think.

JinwoongKim commented 6 years ago

I googled "l2 distance in one loop" and found this, Numpy Broadcast to perform euclidean distance vectorized

The only difference is the axis argument.

First try

...
dists[i,j] = np.sqrt(np.sum((X[i]-self.X_train)**2))
...

Solution

...
dists[i] = np.sqrt(np.sum((X[i] - self.X_train)**2, axis=1))
...

According to the numpy docs, axis is the "Axis or axes along which a sum is performed." I didn't know what they were talking about, so let us see an example.

>>> np.sum([[0, 1], [0, 5]], axis=0)
array([0, 6])
>>> np.sum([[0, 1], [0, 5]], axis=1)
array([1, 5])

Okay, got it !!
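Applied to the one-loop version (shapes are illustrative; CIFAR-10 gives a 5000 x 3072 training matrix here): broadcasting subtracts one test row from every training row, and axis=1 sums over the 3072 features, leaving one squared distance per training point.

import numpy as np

X_train = np.random.randn(5000, 3072)  # stand-in for self.X_train
x = np.random.randn(3072)              # stand-in for X[i]

diff = x - X_train                      # broadcast: (3072,) vs (5000, 3072) -> (5000, 3072)
row = np.sqrt(np.sum(diff**2, axis=1))  # axis=1 sums over features -> shape (5000,)
print(row.shape)                        # (5000,)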

JinwoongKim commented 6 years ago

If we check the result of each calculation, it makes sense.

[screenshots: results of each calculation]

For def compute_distances_no_loops(self, X):, I found some blog posts and code, but I haven't fully understood it yet.

def compute_distances_no_loops(self, X):
...
dists = np.sqrt(np.sum(X**2,axis=1)[:,np.newaxis] + np.sum(self.X_train**2,axis=1) -2*np.dot(X, self.X_train.T))
...
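My own attempt at unpacking it (not from the blogs, just the standard identity): ||a - b||^2 = ||a||^2 + ||b||^2 - 2*a.b. Broadcasting the column of test norms against the row of training norms, then subtracting twice the inner-product matrix np.dot(X, self.X_train.T), computes every pairwise squared distance at once. A small check against an explicit loop:

import numpy as np

X = np.random.randn(4, 6)        # toy test set
X_train = np.random.randn(5, 6)  # toy training set

# ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b, broadcast over all test/train pairs
d_fast = np.sqrt(np.sum(X**2, axis=1)[:, np.newaxis]
                 + np.sum(X_train**2, axis=1)
                 - 2 * np.dot(X, X_train.T))

# reference: explicit pairwise distances
d_slow = np.array([[np.linalg.norm(a - b) for b in X_train] for a in X])

print(np.allclose(d_fast, d_slow))  # True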
JinwoongKim commented 6 years ago

Cross-validation

The first one is quite simple:

################################################################################
# TODO:                                                                        #
# Split up the training data into folds. After splitting, X_train_folds and    #
# y_train_folds should each be lists of length num_folds, where                #
# y_train_folds[i] is the label vector for the points in X_train_folds[i].     #
# Hint: Look up the numpy array_split function.                                #
################################################################################

X_train_folds = np.array_split(X_train, num_folds)
y_train_folds = np.array_split(y_train, num_folds)

array_split splits an array into multiple sub-arrays of (nearly) equal size.
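A toy illustration (my own): splitting 10 elements into 3 folds gives sub-arrays of size 4, 3, 3:

import numpy as np

folds = np.array_split(np.arange(10), 3)
print(folds)  # [array([0, 1, 2, 3]), array([4, 5, 6]), array([7, 8, 9])]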
JinwoongKim commented 6 years ago

################################################################################
# TODO:                                                                        #
# Perform k-fold cross validation to find the best value of k. For each        #
# possible value of k, run the k-nearest-neighbor algorithm num_folds times,   #
# where in each case you use all but one of the folds as training data and the #
# last fold as a validation set. Store the accuracies for all fold and all     #
# values of k in the k_to_accuracies dictionary.                               #
################################################################################

def get_folds_except_one(folds, n):
    # concatenate every fold except fold n
    return np.concatenate([folds[i] for i in range(len(folds)) if i != n])

for k in k_choices:
    for n in range(num_folds):
        # train on all folds except fold n
        current_x_train_set = get_folds_except_one(X_train_folds,n)
        current_y_train_set = get_folds_except_one(y_train_folds,n)

        classifier.train(current_x_train_set, current_y_train_set)

        dists = classifier.compute_distances_no_loops(X_train_folds[n])
        y_test_pred = classifier.predict_labels(dists, k=k)

        # Compute and print the fraction of correctly predicted examples
        num_correct = np.sum(y_test_pred == y_train_folds[n])
        accuracy = float(num_correct) / y_train_folds[n].shape[0]

        try:
            k_to_accuracies[k].append(accuracy)
        except KeyError:
            k_to_accuracies[k] = [accuracy]
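As a side note, the try/except at the end can be replaced with dict.setdefault (same behavior, just more compact; my preference, not part of the assignment):

k_to_accuracies.setdefault(k, []).append(accuracy)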
JinwoongKim commented 6 years ago

I found that I can replace these two lines,

dists = classifier.compute_distances_no_loops(X_train_folds[n])
y_test_pred = classifier.predict_labels(dists, k=k)

with this single line

y_test_pred = classifier.predict(X_train_folds[n], k=k)