yeeun987 / TensorFlow-Course

https://lab.github.com/everydeveloper/introduction-to-tensorflow

MakeModel #3

Open github-learning-lab[bot] opened 3 years ago

github-learning-lab[bot] commented 3 years ago

Preprocessing the dataset

The greyscale value assigned to each pixel within an image lies in the range 0-255. We will want to scale (normalize) this range down to 0-1. To achieve this scaling, we can exploit the data structure our images are stored in: arrays. Each image is stored as a 2-dimensional array in which each numerical value is the greyscale code of a particular pixel. Conveniently, dividing an entire array by a scalar produces a new array whose elements are the original elements divided by that scalar.

>>> train_images = train_images / 255.0
>>> test_images = test_images / 255.0
>>>

Two vital notes about the above.

  1. Use the value "255.0". This value is a floating-point number (float), and algebraic operations involving a float always return a float. In Python 3, the `/` division operator returns a float in any case to avoid rounding; that is not true of all programming languages, though, so it's a good habit to include the decimal point, which explicitly marks the number as a float.
  2. Do not rescale the train_labels or test_labels arrays; these values are already in the range 0-9, as they should be!
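The element-wise effect of dividing an array by a scalar can be seen with a minimal NumPy sketch (a toy 2x2 "image" is used here in place of the tutorial's real 28x28 images):

```python
import numpy as np

# A toy 2x2 "image" of greyscale codes in the 0-255 range.
pixels = np.array([[0, 51], [102, 255]])

# Dividing the whole array by the float 255.0 divides every element,
# producing a new float array scaled into the 0-1 range.
scaled = pixels / 255.0

print(scaled.min(), scaled.max())  # 0.0 1.0
```

Note that `pixels` holds integers, yet `scaled` is a float array — exactly the behavior note 1 above describes.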

Enter a comment (TRUE or FALSE) about the following statement:

We need to rescale both the images and labels, so they are on the same scale.

yeeun987 commented 3 years ago

FALSE

github-learning-lab[bot] commented 3 years ago

Nailed it!

Remember, the label arrays are only used to associate images with their labels.

Model Generation

Every neural network (NN) is constructed from a series of connected layers, each full of nodes. Only simple mathematical operations are performed at each node in each layer, yet through the sheer volume of connections and operations, these ML models can perform impressive and complex tasks.

Our model will be constructed from 3 layers. The first layer – often referred to as the Input Layer – takes in an image and formats the data for the subsequent layers. In our case, this first layer will be a Flatten layer, which takes a multi-dimensional array and produces an array of a single dimension; this places all the pixel data on an equal footing during input. The next two layers will be simple fully connected layers, referred to as Dense layers, with 128 and 10 nodes respectively. Fully connected layers are the simplest layers to understand, yet they allow for the greatest number of layer-to-layer connections and relationships.
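What the Flatten layer does per image can be sketched with plain NumPy: each (28, 28) array is reshaped into a single 784-element vector (the layer itself is a Keras building block; this is just an illustration of the shape change):

```python
import numpy as np

# A stand-in 28x28 "image": the values 0..783 arranged in a square grid.
image = np.arange(28 * 28).reshape(28, 28)

# Flattening turns the 2-D grid into one long 1-D vector, row by row.
flattened = image.reshape(-1)

print(flattened.shape)  # (784,)
```

The 128-node Dense layer that follows then receives all 784 pixel values at once, each on "equal depth."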

The final bit of hyper-technical knowledge you'll need is that each layer can have its own particular mathematical operation. These activation functions determine the form and relationship of the information the layer passes on. The first Dense layer will use a Rectified Linear Unit (ReLU) activation function, which outputs zero for negative inputs and passes positive inputs through unchanged; mathematically, it behaves like f(x)=max(0,x). The final layer uses the softmax activation function. This function produces values in the 0-1 range, BUT generates these values such that the sum of the outputs will be 1! This makes softmax a layer that is excellent at outputting probabilities.
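Both activation functions are simple enough to sketch by hand in NumPy (the `softmax` here subtracts the maximum before exponentiating, a standard numerical-stability trick; the math is otherwise the textbook definition):

```python
import numpy as np

def relu(x):
    # f(x) = max(0, x): negatives become 0, positives pass through unchanged.
    return np.maximum(0, x)

def softmax(x):
    # Exponentiate, then normalize so the outputs sum to 1.
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()

scores = np.array([-1.0, 0.0, 2.0])
print(relu(scores))  # [0. 0. 2.]

probs = softmax(scores)
print(probs.argmax())  # 2 -- the largest score gets the largest probability
```

Note that `relu` can return arbitrarily large values, while the `softmax` outputs always sum to 1 — which is why softmax, not ReLU, sits on the final 10-node classification layer.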

>>> model = keras.Sequential([
...     keras.layers.Flatten(input_shape=(28, 28)),
...     keras.layers.Dense(128, activation=tf.nn.relu),
...     keras.layers.Dense(10, activation=tf.nn.softmax)
... ])

WARNING: Logging before flag parsing goes to stderr.
W0824 22:50:02.551490  8392 deprecation.py:506] From C:\Users\ross.hoehn\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
>>>
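As a sanity check, the trainable parameter counts of these layers can be worked out by hand (they should match what Keras's `model.summary()` reports): Flatten has no parameters, and each Dense layer has one weight per input-node pair plus one bias per node.

```python
# Parameter counts for the three layers, computed by hand.
flatten_params = 0                 # Flatten only reshapes; nothing to learn
dense1_params = 784 * 128 + 128    # 784 inputs x 128 nodes, plus 128 biases
dense2_params = 128 * 10 + 10      # 128 inputs x 10 nodes, plus 10 biases

print(flatten_params + dense1_params + dense2_params)  # 101770
```

Roughly 100,000 numbers to learn — small by modern standards, yet enough to classify clothing images with near-90% accuracy, as we'll see below.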

Enter a comment (TRUE or FALSE) about the following statement:

The softmax activation function will generate values which will add up to 1

yeeun987 commented 3 years ago

TRUE

github-learning-lab[bot] commented 3 years ago

That's right! ✔️

Softmax activation not only squashes each value into the 0-1 range but also scales the outputs so they add up to 1.

Training the Model

Models must be both compiled and trained prior to use. When compiling, we must define a few more parameters that control how the model is updated during training (the optimizer), how the model's error is measured during training (the loss function), and which quantities are tracked to judge the model's performance (the metrics). The values below were selected for this project, but in general they depend on the model's intent and its expected input and output.

>>> model.compile( optimizer = 'adam', loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])
>>>

Now we can begin training our model! Having already generated and compiled the model, the code required to train it is a single line.

>>> model.fit(train_images, train_labels, epochs=5)

2019-08-24 22:56:32.884249: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
Epoch 1/5
60000/60000 [==============================] - 2s 40us/sample - loss: 0.4985 - acc: 0.8264
Epoch 2/5
60000/60000 [==============================] - 2s 36us/sample - loss: 0.3787 - acc: 0.8632
Epoch 3/5
60000/60000 [==============================] - 2s 36us/sample - loss: 0.3368 - acc: 0.8766
Epoch 4/5
60000/60000 [==============================] - 2s 35us/sample - loss: 0.3122 - acc: 0.8863
Epoch 5/5
60000/60000 [==============================] - 2s 35us/sample - loss: 0.2962 - acc: 0.8901
<tensorflow.python.keras.callbacks.History object at 0x00000133F219C470>
>>>

This single line completes the entire job of training our model, but let's take a brief look at the arguments provided to the model.fit command.

  1. The first argument is the input data; recall that our Flatten input layer takes a (28,28) array, matching the dimensionality of our images.
  2. Next we train the system by providing the correct classification for all the training examples.
  3. The final argument is the number of epochs undertaken during training; each epoch is one training cycle over all the training data. Setting the epoch value to 5 means that the model will be trained over all 60,000 training examples 5 times. After each epoch, we get both the value of the loss function and the model's accuracy (89.01% after epoch 5).
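The arithmetic behind item 3 is worth spelling out: one epoch is one full pass over the training set, so the total number of example presentations is simply the dataset size times the epoch count.

```python
# One epoch = one pass over every training example.
examples = 60_000
epochs = 5

# Total number of times an image is shown to the model during training.
print(examples * epochs)  # 300000
```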

Leave a comment with the answer to this question:

Which argument in the model.fit method is used to classify our data into categories (1, 2, or 3)?