Open mohanbabu27 opened 5 years ago
Hi @mohanbabu27, could you please indicate the line that causes the exception? Also, I see a few differences with the code in the notebook:
- The delimiter for the gdp_per_capita.csv file is "," instead of "\t", so please make sure the file is indeed comma separated instead of tab separated.
- The path to the data files is different (the notebook reads them from the datasets/lifesat folder).
- The prepare_country_stats() function is different (but if I remember correctly I just got rid of a few countries that did not follow the trend, to illustrate that a model will end up biased if the data is biased).

Hope this helps.
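To illustrate the delimiter point, here is a minimal sketch of how a mismatched delimiter could lead to a missing "GDP per capita" column. The two-row sample and the helper missing_columns() are hypothetical and made up for illustration; they are not the real dataset or part of the notebook.

```python
import io
import pandas as pd

# Hypothetical tab-separated sample with placeholder values (not the real data).
sample = "Country\tGDP per capita\nA\t100.0\nB\t200.0\n"

# Wrong delimiter: each line is parsed as one single column,
# so the header becomes one name that still contains the tab.
wrong = pd.read_csv(io.StringIO(sample), delimiter=",")
print(list(wrong.columns))

# Matching delimiter: the two expected columns appear.
right = pd.read_csv(io.StringIO(sample), delimiter="\t")
print(list(right.columns))


def missing_columns(df, required):
    """Return the required column names that are absent from df."""
    return [name for name in required if name not in df.columns]


# With the wrong delimiter the column is reported as missing;
# selecting it would then raise a KeyError.
print(missing_columns(wrong, ["GDP per capita"]))
print(missing_columns(right, ["GDP per capita"]))
```

Printing the column names right after loading is a cheap way to confirm which delimiter the file actually uses before any column is selected.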
Try this ...
import pandas as pd
import numpy as np
import sklearn.linear_model
import sklearn.neighbors
import matplotlib.pyplot as plt
# Load the data
bli = pd.read_csv("BLI2015.csv", thousands=',')
gdp = pd.read_csv("GDP.csv", thousands=',', delimiter=',',
                  encoding='latin1', na_values="n/a")
# Prepare the data
bli = bli[bli["INEQUALITY"]=="TOT"]
bli = bli[bli["INDICATOR"]=="SW_LIFS"]
nbli = pd.DataFrame(columns=['Pais','Satisfaccion'])
nbli['Pais']=bli['Country']
nbli['Satisfaccion']=bli['Value']
nbli.set_index("Pais", inplace=True)
ngdp = pd.DataFrame(columns=['Pais','Renta'])
ngdp['Pais']=gdp['Country']
ngdp['Renta']=gdp['2015']
ngdp.set_index("Pais", inplace=True)
ngdp = ngdp.apply(lambda x: pd.to_numeric(x.astype(str).str.replace(',',''), errors='coerce'))
country_stats = pd.merge(left=nbli, right=ngdp,
left_index=True, right_index=True)
country_stats.sort_values(by="Renta", inplace=True)
# Visualize the data
country_stats.plot(kind='scatter', x="Renta", y='Satisfaccion')
plt.show()
X = np.c_[country_stats["Renta"]]
y = np.c_[country_stats["Satisfaccion"]]
# Select a linear model
model = sklearn.linear_model.LinearRegression()
# Optional: use k-nearest neighbors instead (uncomment to replace the linear model)
# model = sklearn.neighbors.KNeighborsRegressor(n_neighbors=3)
# Train the model
model.fit(X, y)
# Make a prediction for Cyprus
X_new = [[22587]] # Cyprus' GDP per capita
print(model.predict(X_new))
I have tried my level best to run the code from CH1. I copied it as is, but I get different errors every time. I fixed many of them, but I am stuck on this one. Can one of you help, please? KeyError: "['GDP per capita'] not in index"
Code below