Closed Lumik7 closed 6 years ago
@Lumik7 Are you on this already? Otherwise I'd do it tomorrow.
@rmitsch No, I did nothing except for the fix mentioned above. It would be great if you could systematically check the code for None values, either by introducing None checks or by replacing None entries with empty pd.DataFrames. The goal should be that we do not have to worry about pd.DataFrames being None after the preprocessing step.
@Lumik7 Alright, will do. I'll handle #12 as well if that's ok for you, since it's pretty much the same problem.
@rmitsch yes, that's fine, but I don't think that is an issue anymore because we do not use it in the preprocessing anymore --> it got replaced by paa
@Lumik7 That's true. I suggest closing #12 as wont-fix then.
yes, go ahead
Added replace_none_values_with_empty_dataframes(dataframe_dicts: list) in 326664c. Applied it after every preprocessing step. E.g.:
# 2. Remove trips less than 10 minutes long.
dfs = Preprocessor.replace_none_values_with_empty_dataframes(
    Preprocessor._remove_dataframes_by_duration_limit(dfs, 10 * 60)
)
Please check whether the results conform to your expectations. If so, I'll merge to master and close the issue.
I think this:
{
    key: pd.DataFrame() if df_dict[key] is None else df_dict[key]
    for key in df_dict
} for df_dict in dataframe_dicts
code snippet will introduce some problems, because an empty DataFrame created without column names raises a KeyError as soon as later steps select a column. To be safe, the correct column names should be set on the empty DataFrames.
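To illustrate the concern, here is a minimal sketch, not taken from the repository. The helper `can_select_time` and the two-column list are made up for the example; the point is only that column selection fails on `pd.DataFrame()` but succeeds on an empty DataFrame that was created with column names.

```python
import pandas as pd

# Empty DataFrame without any column names, as in the snippet above.
empty_without_columns = pd.DataFrame()

# Empty DataFrame that still carries the expected column names.
# The column list here is only an example.
empty_with_columns = pd.DataFrame(columns=["time", "marker"])


def can_select_time(df: pd.DataFrame) -> bool:
    """Hypothetical helper: True if df["time"] can be selected without a KeyError."""
    try:
        df["time"]
        return True
    except KeyError:
        return False


print(can_select_time(empty_without_columns))
print(can_select_time(empty_with_columns))
```

Both frames have zero rows, so the difference only shows up once a later preprocessing step accesses a column by name.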
Might be the case. Will do.
Replaced with
{
    key: pd.DataFrame(columns=Preprocessor.DATAFRAME_COLUMN_NAMES[key])
    if df_dict[key] is None else df_dict[key]
    for key in df_dict
} for df_dict in dataframe_dicts
where
DATAFRAME_COLUMN_NAMES = {
    "cell": ['time', 'cid', 'lac', 'asu'],
    "annotation": ['time', 'mode', 'notes'],
    "location": ['time', 'gpstime', 'provider', 'longitude', 'latitude', 'altitude', 'speed', 'bearing',
                 'accuracy'],
    "sensor": ['sensor', 'time', 'x', 'y', 'z', 'total'],
    "mac": ['time', 'ssid', 'level'],
    "marker": ['time', 'marker'],
    "event": ['time', 'event', 'state']
}
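Put together, the replacement could look roughly like the following self-contained sketch. It assumes a module-level function instead of the `Preprocessor` static method, wraps the comprehension in a list, and uses a shortened copy of the column mapping, so it is an illustration rather than the exact code in the repository.

```python
import pandas as pd

# Shortened copy of the column mapping, for illustration only.
DATAFRAME_COLUMN_NAMES = {
    "marker": ["time", "marker"],
    "event": ["time", "event", "state"],
}


def replace_none_values_with_empty_dataframes(dataframe_dicts: list) -> list:
    """Replace None entries with empty DataFrames that have the expected columns."""
    return [
        {
            key: pd.DataFrame(columns=DATAFRAME_COLUMN_NAMES[key])
            if df_dict[key] is None else df_dict[key]
            for key in df_dict
        }
        for df_dict in dataframe_dicts
    ]


# One trip where the "marker" table was not recorded.
trips = [
    {
        "marker": None,
        "event": pd.DataFrame({"time": [1], "event": ["start"], "state": ["on"]}),
    }
]
cleaned = replace_none_values_with_empty_dataframes(trips)
print(list(cleaned[0]["marker"].columns))
```

Existing DataFrames are passed through unchanged; only the None entries are replaced, and the input list itself is not modified.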
I can't reproduce an error, so I'm not sure whether this will solve the problem, but it ought to. If you agree, I'll merge into master.
Yes, I agree
Merged.
It seems some dataframes are set to None when the recording did not work, e.g. for token "KEY_LUKAS", trip number 17, table "location". I encountered this error when trying to call Preprocessor.convert_timestamps(df). I have already fixed this specific error in this commit, but I think it would be better to replace None dataframes with an empty DataFrame during preprocessing().