I am trying to combine two dataframes (~10 million rows) on the columns TIMESTAMP and ID. Because the two dataframes don't have the same timestamps for the same ID, I need to use a function called `map_RSRP` to combine them using conditions. However, the code below takes a very long time (a few days) to process the 10 million rows on my personal notebook.
Is there any method or code I can use to speed up the processing of such a big dataframe?
Dataframe RTT columns: ['TIMESTAMP', 'ID', 'RTT']
Dataframe RSRP columns: ['TIMESTAMP', 'ID', 'RSRP']
Result dataframe columns: ['TIMESTAMP', 'ID', 'RTT', 'RSRP']
Below is my code (the body of `map_RSRP` is truncated here):

```
def map_RSRP(timestamp, id):
    rsrp = np.nan
    time_diff = np.nan
    # ... (rest of the matching logic truncated)
    return rsrp

df_rtt['RSRP'] = df_rtt.apply(lambda x: pd.Series(map_RSRP(x['TIMESTAMP'], x['ID'])), axis=1)
```
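For context, I suspect something vectorized like `pandas.merge_asof` might be what I need instead of a row-wise `apply`. Below is a minimal sketch with made-up sample data (the real matching conditions inside `map_RSRP` may differ), assuming each RTT row should pick up the RSRP measurement with the same ID and the nearest timestamp:

```python
import pandas as pd

# Hypothetical sample data standing in for the real RTT/RSRP frames.
df_rtt = pd.DataFrame({
    "TIMESTAMP": pd.to_datetime(["2023-01-01 00:00:01", "2023-01-01 00:00:05"]),
    "ID": [1, 1],
    "RTT": [10.0, 12.0],
})
df_rsrp = pd.DataFrame({
    "TIMESTAMP": pd.to_datetime(["2023-01-01 00:00:00", "2023-01-01 00:00:04"]),
    "ID": [1, 1],
    "RSRP": [-90.0, -95.0],
})

# merge_asof requires both frames to be sorted by the merge key.
df_rtt = df_rtt.sort_values("TIMESTAMP")
df_rsrp = df_rsrp.sort_values("TIMESTAMP")

# For each RTT row, take the RSRP row with the same ID and the
# nearest TIMESTAMP; this replaces the per-row apply entirely.
result = pd.merge_asof(
    df_rtt, df_rsrp,
    on="TIMESTAMP", by="ID",
    direction="nearest",
)
print(result)
```

Would this kind of approach scale to ~10 million rows, or is there a better option?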