Closed SophMC closed 8 years ago
Solution
Turn the groupby MultiIndex (consisting of [year, month]) into two ordinary columns ['year'] and ['month']:
wind_group = wind_group.reset_index()
Join year and month into one string column in the format yyyy/mm (the int64 columns must be cast to str first, or the join raises a TypeError) and apply to_datetime to each row:
wind_group['date'] = wind_group[['year', 'month']].apply(lambda x: pd.to_datetime('/'.join(x.astype(str))), axis=1)
Make the new datetime column ['date'] the DataFrame index:
wind_group.index = wind_group['date']
Actually a lambda is fine here; the function is simple. The main thing is readability: if the lambda isn't short and simple, use a separate def instead.
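Putting the three steps together, here is a minimal runnable sketch on made-up data (the column names year, month and ws are taken from the question; the values are invented):

```python
import pandas as pd

# Invented hourly wind-speed records spanning two months.
raw = pd.DataFrame({
    'year':  [1984, 1984, 1984, 1984],
    'month': [3, 3, 4, 4],
    'ws':    [4.2, 5.9, 5.0, 5.4],
})

# Group by year and month; the result has a MultiIndex [year, month].
wind_group = raw.groupby(['year', 'month'])['ws'].agg(['mean', 'std'])

# Step 1: turn the MultiIndex into ordinary columns.
wind_group = wind_group.reset_index()

# Step 2: build a yyyy/mm string per row (cast the ints to str) and parse it.
wind_group['date'] = wind_group[['year', 'month']].apply(
    lambda x: pd.to_datetime('/'.join(x.astype(str))), axis=1)

# Step 3: use the new datetime column as the index.
wind_group.index = wind_group['date']
```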
I have taken the mean and std of ws grouped by year and month. These are the first 5 lines of the new dataframe in which I stored the values:
   year  month      mean       std
0  1984      3  5.060840  2.280239
1  1984      4  5.219944  1.490117
2  1984      5  5.595610  1.396281
3  1984      6  6.520606  1.927716
4  1984      7  6.095756  1.958878

year     int64
month    int64
dtype: object
I want to create a datetime object from the year and month columns and make it the index of the dataframe, preferably using:
pd.to_datetime
and preferably using only one line of code! These attempts don't work:
df.index=pd.to_datetime(year + month)
df.index=pd.to_datetime(df.year + df.month)
df['date'] = df.apply(lambda row: datetime(row['Year'], row['Month'], axis=1))
#with the intention to change it to the index after
df['date']= df.apply(lambda x:datetime.strptime("{0} {1}".format(x['year'],x['month']), "%Y%m"),axis=1)
Though I've already done something similar using
pd.to_datetime
which does work, and I'm not sure why it does, as nothing above seems to. I suspect it is something to do with the new dataframe I created:
wind['date_time'] = pd.to_datetime(wind.date_time) + wind.hour.astype('timedelta64[h]')
wind.date_time
is already a datetime object created by read_csv when I read the file in and tell it to parse the first 3 columns together. I can do it if I use a loop:
for i in range(0, len(df)):
    year_n, month_n = df.index[i]
    year_new.append(datetime(year_n, month_n, 1))
But I guess that's slow and not the best use of pandas...
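For the record, the loop above can be replaced by a single vectorized call: pd.to_datetime accepts a DataFrame whose columns are named year, month and day, so the whole index can be assembled in one line (a sketch with invented values standing in for the grouped dataframe):

```python
import pandas as pd

# Invented stand-in for the grouped dataframe from the question.
df = pd.DataFrame({'year': [1984, 1984], 'month': [3, 4],
                   'mean': [5.06, 5.22]})

# pd.to_datetime assembles datetimes from year/month/day columns,
# so add a constant day and convert all rows at once.
df.index = pd.to_datetime(df[['year', 'month']].assign(day=1))
```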