janboone / applied-economics

course material for the course applied economics

Python adds an additional .0 to my Year column when I import it #54

Closed 211sk closed 2 years ago

211sk commented 2 years ago

[image: screenshot, not available]

I imported the dataset in the screenshot. In the source file, the year values do not contain ".0". I tried the code in the commented-out lines (marked with #), but none of it seems to work. I want to combine the Year and country variables into a column index4 that looks like AFG2018, not AFG2018.0.

Thank you

janboone commented 2 years ago

You need to tell pandas that Year is a date (not a real number): pd.to_datetime(sae.Year, format='%Y')
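A minimal sketch of where the ".0" may come from and one way to remove it. It assumes the Year column has at least one missing value, which makes pandas read the whole column as float64; the screenshot is not available, so this cause is a guess. The CSV text and the column names Country and Value are hypothetical stand-ins for the real dataset.

```python
import io

import pandas as pd

# Hypothetical stand-in for the dataset in the question.
csv_text = """Country,Year,Value
AFG,2018,1.0
AFG,2019,2.0
ALB,,3.0
"""
sae = pd.read_csv(io.StringIO(csv_text))

# The blank Year in the last row forces the column to float64,
# so 2018 is stored (and printed) as 2018.0.
year_dtype_raw = str(sae["Year"].dtype)

# Drop rows without a year, then store the years as integers.
clean = sae.dropna(subset=["Year"]).copy()
clean["Year"] = clean["Year"].astype(int)

# String key built from integer years: no trailing ".0".
clean["index4"] = clean["Country"] + clean["Year"].astype(str)

# Parsing the year as a date, in the spirit of the suggestion above.
# The years are passed as strings so that "%Y" matches exactly.
clean["Year_dt"] = pd.to_datetime(clean["Year"].astype(str), format="%Y")

print(year_dtype_raw)
print(clean[["Country", "Year", "index4", "Year_dt"]])
```

If rows with a missing year have to be kept, the nullable integer dtype (`astype("Int64")`) is an alternative to dropping them.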

There is no need to merge two columns into one to get an index. You can make the combination of two columns the index. See the example here: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.set_index.html

The example there uses df.set_index(['year', 'month']).
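A short sketch of the same idea applied to a country-year panel. The DataFrame and the column names Country, Year and Value are made up for illustration; they are not taken from the dataset in the question.

```python
import pandas as pd

# Hypothetical panel data: one row per country and year.
df = pd.DataFrame(
    {
        "Country": ["AFG", "AFG", "ALB"],
        "Year": [2018, 2019, 2018],
        "Value": [1.0, 2.0, 3.0],
    }
)

# Use the pair (Country, Year) as the index instead of a merged string.
indexed = df.set_index(["Country", "Year"])

# Look up a single observation by its (country, year) pair.
value = indexed.loc[("AFG", 2018), "Value"]

# Select all years for one country via the first index level.
afg_rows = indexed.loc["AFG"]

print(indexed)
print(value)
print(afg_rows)
```

With a two-level index the year stays an integer, so there is no string formatting involved and no ".0" to strip.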