pangfengliu / programmingtasks

programming tasks from my courses
68 stars 17 forks source link

Linear Classifier #329

Open littlehug opened 7 years ago

littlehug commented 7 years ago

Write a simple machine learning program. In the first training stage, we train a model. In the second prediction stage we classify inputs according to the model we trained.

Task Description

An image classifier classifies input images into two categories -- accept and reject. An Image is an M by M array P of pixels, where each pixel p{i, j} is either zero or one. A classifier determines which category an image belongs to based on a function value h of all pixels of this image. There is a weight w{i,j} for every pixel p{i,j}, and the h function value of an image is the sum of products of all pixel values with their corresponding weights.(h(P) = \sum{i,j} p{i, j} w{i, j}) If the h value is no less than a threshold T, which is 2M^2, then the model accepts the image, otherwise it rejects the image.

The classifier starts with an initial weights of all 1's, then modify the them according to a set of N train inputs. Each input is an M by M array of 0 or 1's, and has a label of either accept or reject. If the model from the current set of weights gives the same result as the label of the input image, then we do nothing. Otherwise we have two cases. In the first case the classifier accept the input but the label is reject. Then we divide the weight of those non-zero pixel by 2, and set the weight to 1 if the the division gives 0. In the second case the classifier rejects the input but the label is accept. Then we multiply the weight of those non-zero pixel by 2.

After we train our model we use it to classify another set of Q test inputs. We evaluate the h function using the weights we obtained in the previous stage, and accept the input if the h value is no less than the threshold, and reject the input otherwise.

每一個人都有各自的喜好,喜好程度分成可接受與不可接受兩種。這一題專注於圖片辨識上,目標是判斷圖片屬於接受 1 與不接受 0。圖片由一個二維矩陣 P 構成,這裡化簡問題,每一個矩陣元素 p_{i,j} 只有 0 或 1。

我們已知 N 張 pf 可接受和不可接受的圖片,希望可以藉由程式預測新的圖片受不受 pf 喜愛,判斷的方法如下所述:

  1. 設計一個簡單的線性分類器,線性分類器判定函數 h(P) = \sum{i,j} p{i, j} w_{i, j}, 當 h(P) >= threshold 成立時,屬於可接受 1。反之,為不可接受 0。方便起見,threadhold = 2M^2。
  2. 為了找出 w{i, j},我們需要一套算法來完成訓練 a. 初始化 w{i, j} = 1 b. 根據輸入的 N 組訓練資料,依序調整 w_{i, j},當輸入圖片代入 h(P) 分成以下三種情況:
    1. 若函數 h(P) 判定與已知分類相同時,什麼都不做。
    2. 若函數 h(P) 判定與已知分類不同時且已知分類為可接受 1,將 p{i, j} = 1 對應的 w{i, j} 乘 2 。
    3. 若函數判定與已知分類不同時且已知分類為不可接受 0,將 p{i, j} = 1 對應的 w{i, j} 除 2。若發生 w{i, j} < 1,則 w{i, j} 應再次被修改成 1。
  3. 對於隨後輸入的 Q 組詢問資料,直接輸出 h(P) >= threshold 成立與否。

Input

There is one test case. Each test case has two parts.

In the first part, the training data set contains N images. The first line contains two integers N, M, representing that there are N images which size is M x M as follows: Each image contains M+1 lines. In M+1 lines, the first line contains an integer C, representing the result of each images. In following M lines, each line contains M integers.

In the second part, the query data set contains Q images. The first line contains an integer Q, representing Q images as follows: Each images contains M lines. Each line contains M integers.

只有一組測資。每一組測資分成兩個部份。

第一部份為訓練資料,第一行有兩個整數 N、M,表示接下來會有 N 張 M x M 的 1-bit color 圖片。接下來有 N 組圖片資訊,每一組圖片資訊有 M+1 行,每一組第一行為分類結果 C,接著的 M 行上分別 M 個 0/1 整數表示圖片內容。

第二部份為測試資料,第一行有一個整數 Q,表示接下來會有 Q 張圖片詢問,每一組詢問佔有 M 行,每 一行上有 M 個整數表示圖片內容。

Output

For each query, output one line, which contains an 0/1 integer.

Sample Input

4 2
0
1 1
0 1

1
0 0
1 1

0
1 1
0 0

1
0 0
1 1

2
0 1
0 0

0 0
1 1

Sample Output

0
1