ecr23xx / cs231n

My solutions and notes for Stanford CS231n, Spring 2018. Notes can be found in the Wiki pages.

Logs #1

Closed ecr23xx closed 5 years ago

ecr23xx commented 5 years ago

This issue is used to track my progress on the assignments, including problems I ran into, skills I learned, etc. Please note that discussions are not welcome in this issue. If you have anything to share with me, please open another issue.

ecr23xx commented 5 years ago

Introduction

Motivation

While working on another project based on yolov3.pytorch, I ran into a problem with Batch Normalization. My understanding of BN was limited to nn.BatchNorm, and when I wanted to do things like computing accumulated gradients, I got stuck because I wasn't familiar with the computation of the running mean/variance. This is my motivation for doing the CS231n assignments again. I hope it pushes me to revisit those classical algorithms and go beyond just knowing the interfaces of PyTorch or TensorFlow.

General plans

In general, I will finish every part of the assignments (again) and write down what I learn along the way. If you want to catch up with the latest updates, please click "Subscribe" on the right. Updates in this issue will be sent to your GitHub notifications.

ecr23xx commented 5 years ago

kNN classifier

Decision boundary

kNN is not a linear classifier. Its decision boundaries are composed of piecewise-linear segments. Stated more formally, different data points fall into different regions, and these regions cannot be separated by a single hyperplane.
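For reference, here is a minimal NumPy sketch of kNN prediction by majority vote. It is not the assignment's starter-code API; the function name and shapes are my own assumptions.

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=5):
    """Predict labels for X_test by majority vote among the k nearest
    training points under L2 distance.
    Shapes: X_train (N, D), y_train (N,) non-negative ints, X_test (M, D)."""
    # Pairwise squared L2 distances via (a - b)^2 = a^2 - 2ab + b^2
    dists = (np.sum(X_test ** 2, axis=1, keepdims=True)
             - 2 * X_test @ X_train.T
             + np.sum(X_train ** 2, axis=1))
    # Indices of the k closest training points for each test point
    nearest = np.argsort(dists, axis=1)[:, :k]
    # Majority vote over the neighbors' labels
    return np.array([np.bincount(y_train[idx]).argmax() for idx in nearest])
```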

L1 v.s. L2

In particular, the L2 distance is much more unforgiving than the L1 distance when it comes to differences between two vectors: the L2 distance prefers many medium disagreements to one big one. L1 and L2 distances (or equivalently the L1/L2 norms of the differences between a pair of images) are the most commonly used special cases of a p-norm.
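A tiny example (my own, not from the notes) makes this concrete: two difference vectors with the same L1 distance can have very different L2 distances.

```python
import numpy as np

a = np.zeros(4)
one_big = np.array([4.0, 0.0, 0.0, 0.0])  # one large disagreement
several = np.array([1.0, 1.0, 1.0, 1.0])  # several medium disagreements

for name, b in [("one big", one_big), ("several medium", several)]:
    l1 = np.sum(np.abs(a - b))          # L1 = 4.0 in both cases
    l2 = np.sqrt(np.sum((a - b) ** 2))  # L2 = 4.0 vs 2.0
    print(f"{name}: L1={l1:.1f}, L2={l2:.1f}")
```

Both vectors have L1 distance 4.0 from the origin, but the L2 distance of the "several medium" vector is only 2.0, so under L2 it is the closer of the two.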

Cross validation

For example, in 5-fold cross-validation, we would split the training data into 5 equal folds, use 4 of them for training, and 1 for validation. We would then iterate over which fold is the validation fold, evaluate the performance, and finally average the performance across the different folds.
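A short NumPy sketch of this loop, assuming `X_train` and `y_train` arrays exist and reusing the hypothetical `knn_predict` from the sketch above:

```python
import numpy as np

num_folds = 5
X_folds = np.array_split(X_train, num_folds)
y_folds = np.array_split(y_train, num_folds)

accuracies = []
for i in range(num_folds):
    # Fold i is the validation fold; the remaining folds form the train set
    X_val, y_val = X_folds[i], y_folds[i]
    X_tr = np.concatenate(X_folds[:i] + X_folds[i + 1:])
    y_tr = np.concatenate(y_folds[:i] + y_folds[i + 1:])
    y_pred = knn_predict(X_tr, y_tr, X_val, k=5)
    accuracies.append(np.mean(y_pred == y_val))

print("mean cross-validation accuracy:", np.mean(accuracies))
```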

Advice on applying kNN

  1. Use kNN as a baseline rather than in a real application.
  2. Preprocess your data: normalize the features in your data (e.g. each pixel in an image) to have zero mean and unit variance (see the sketch after this list).
  3. If your data is very high-dimensional, consider using a dimensionality reduction technique such as PCA.
  4. Split your training data randomly into train/val splits. As a rule of thumb, 70-90% of your data usually goes to the train split.
  5. If your kNN classifier is running too long, consider using an Approximate Nearest Neighbor library (e.g. FLANN) to accelerate the retrieval (at the cost of some accuracy).
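A minimal sketch of items 2 and 4, assuming `X` and `y` are hypothetical full data arrays; the statistics are computed on the train split only, so the validation split does not leak into the preprocessing:

```python
import numpy as np

# Item 4: random 80/20 train/val split
idx = np.random.permutation(len(X))
split = int(0.8 * len(X))
X_train, X_val = X[idx[:split]], X[idx[split:]]
y_train, y_val = y[idx[:split]], y[idx[split:]]

# Item 2: zero-mean, unit-variance normalization using train statistics
mean = X_train.mean(axis=0)
std = X_train.std(axis=0) + 1e-8  # epsilon guards against constant features
X_train = (X_train - mean) / std
X_val = (X_val - mean) / std      # apply the same transform to the val split
```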


ecr23xx commented 5 years ago

Linear Classifier - SVM and Softmax

Multiclass Support Vector Machine

Softmax

ecr23xx commented 5 years ago

Moved to wiki page