llSourcell / linear_regression_live

This is the code for the "How to Do Linear Regression the Right Way" live session by Siraj Raval on YouTube
MIT License

After 1000 iterations b = nan, m = nan, error = nan #16

Open Saini-96 opened 6 years ago

Saini-96 commented 6 years ago

While running the code I get the following output:

Starting gradient descent at b = 0, m = 0, error = nan
Running...
After 1000 iterations b = nan, m = nan, error = nan
plottting..

Code:

import numpy as np
import pandas as pd
import math
import matplotlib
import matplotlib.pyplot as plt
from sklearn import datasets, linear_model
from pandas import DataFrame, Series
from sklearn.metrics import mean_squared_error

data = pd.read_csv("http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data-original", delim_whitespace = True, header=None, names = ['mpg', 'cylinders', 'displacement', 'horsepower', 'weight', 'acceleration', 'model', 'origin', 'car_name'])
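(Note: the "-original" variant of the auto-mpg file contains rows with missing values, which pandas reads as NaN, and any NaN row will turn the error into nan. A small cleanup sketch, assuming the same read_csv call as above:)

print(data.isna().sum())   # see which columns contain missing values
data = data.dropna()       # drop incomplete rows before fitting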

# The optimal values of m and b can actually be calculated in closed form, with far less effort than gradient descent;
# this is just to demonstrate gradient descent

from numpy import *

# y = mx + b
# m is slope, b is y-intercept
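As that comment says, the least-squares m and b have a closed-form solution. A minimal sketch of it (not from the original repo), assuming points is an N x 2 array of (x, y) rows with no NaN values:

def closed_form_fit(points):
    # Ordinary least squares for y = m*x + b via a degree-1 polynomial fit.
    x, y = points[:, 0], points[:, 1]
    m, b = np.polyfit(x, y, 1)
    return b, m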

def compute_error_for_line_given_points(b, m, points):
    totalError = 0
    for i in range(0, len(points)):
        x = points[i, 0]
        y = points[i, 1]
        totalError += (y - (m * x + b)) ** 2
    return totalError / float(len(points))

def step_gradient(b_current, m_current, points, learningRate):
    b_gradient = 0
    m_gradient = 0
    N = float(len(points))
    for i in range(0, len(points)):
        x = points[i, 0]
        y = points[i, 1]
        b_gradient += -(2/N) * (y - ((m_current * x) + b_current))
        m_gradient += -(2/N) * x * (y - ((m_current * x) + b_current))
    new_b = b_current - (learningRate * b_gradient)
    new_m = m_current - (learningRate * m_gradient)
    return [new_b, new_m]

def gradient_descent_runner(points, starting_b, starting_m, learning_rate, num_iterations):
    b = starting_b
    m = starting_m
    for i in range(num_iterations):
        b, m = step_gradient(b, m, array(points), learning_rate)
    return [b, m]
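For reference, the per-point loop in step_gradient can also be written with array operations; a vectorized sketch (my own, not part of the repo), assuming points is an N x 2 NumPy array:

def step_gradient_vectorized(b_current, m_current, points, learning_rate):
    # Same MSE gradient as step_gradient, computed without the Python loop.
    x, y = points[:, 0], points[:, 1]
    N = float(len(points))
    residual = y - (m_current * x + b_current)
    b_gradient = -(2 / N) * np.sum(residual)
    m_gradient = -(2 / N) * np.sum(x * residual)
    return [b_current - learning_rate * b_gradient,
            m_current - learning_rate * m_gradient]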

def plot_dp():
    # Load the data and build the design matrix [1, feature].
    Input_file = np.genfromtxt('auto-mpg1.csv', delimiter=',', skip_header=1)
    Num = np.shape(Input_file)[0]
    X = np.hstack((np.ones(Num).reshape(Num, 1), Input_file[:, 4].reshape(Num, 1)))
    Y = Input_file[:, 0]

    # Normalize the feature column to zero mean and unit variance.
    X[:, 1] = (X[:, 1] - np.mean(X[:, 1])) / np.std(X[:, 1])

    wght = np.array([0, 0])

    # Batch gradient descent on the squared error.
    max_iter = 1000
    eta = 1E-4
    for t in range(0, max_iter):
        grad_t = np.array([0., 0.])
        for i in range(0, Num):
            x_i = X[i, :]
            y_i = Y[i]
            h = np.dot(wght, x_i) - y_i
            grad_t += 2 * x_i * h
        wght = wght - eta * grad_t

    # Plot the data points and the fitted line.
    tt = np.linspace(np.min(X[:, 1]), np.max(X[:, 1]), 10)
    bf_line = wght[0] + wght[1] * tt

    plt.plot(X[:, 1], Y, 'kx', tt, bf_line, 'r-')
    plt.xlabel('displacement (Normalized)')
    plt.ylabel('MPG')
    plt.title('Linear Regression')
    plt.show()

def run():
    points = genfromtxt("data.csv", delimiter=",")
    learning_rate = 0.0001
    initial_b = 0  # initial y-intercept guess
    initial_m = 0  # initial slope guess
    num_iterations = 1000
    print("Starting gradient descent at b = {0}, m = {1}, error = {2}".format(initial_b, initial_m, compute_error_for_line_given_points(initial_b, initial_m, points)))
    print("Running...")
    [b, m] = gradient_descent_runner(points, initial_b, initial_m, learning_rate, num_iterations)
    print("After {0} iterations b = {1}, m = {2}, error = {3}".format(num_iterations, b, m, compute_error_for_line_given_points(b, m, points)))
    print('plottting..')
    plot_dp()

if __name__ == '__main__':
    run()
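One thing the output already shows: the error is nan at b = 0, m = 0, before any gradient step, so the points array loaded in run() must itself contain NaN; no learning rate fixes that. A quick filter sketch, assuming points comes from genfromtxt as above:

points = genfromtxt("data.csv", delimiter=",")
print(np.isnan(points).any())                    # True means some entries are missing or failed to parse
points = points[~np.isnan(points).any(axis=1)]   # keep only complete rows before gradient descent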

QasimWani commented 5 years ago

Same Error here... Any solution?

Hegabovic commented 4 years ago

You should tweak "learning_rate"; in my case I set "learning_rate = 0.00001". Increasing "num_iterations" also gives more accurate values for b and m, and the error decreases.
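A minimal sketch of that tweak, reusing the functions from the original post (the values are just a starting point):

points = genfromtxt("data.csv", delimiter=",")
points = points[~np.isnan(points).any(axis=1)]   # if the starting error is already nan, clean the data first
learning_rate = 0.00001                          # smaller step, as suggested above
num_iterations = 10000                           # more iterations to compensate for the smaller step
b, m = gradient_descent_runner(points, 0, 0, learning_rate, num_iterations)
print("After {0} iterations b = {1}, m = {2}, error = {3}".format(
    num_iterations, b, m, compute_error_for_line_given_points(b, m, points)))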

PATHAKABHISHEK commented 3 years ago

@Saini-96 I am also getting the same error.

talhabu commented 1 year ago

Is there any effective solution?