cmudig / AutoProfiler

Automatically profile dataframes in the Jupyter sidebar
BSD 3-Clause "New" or "Revised" License

Track dataframe manually through python expression #68

Open willeppy opened 1 year ago

willeppy commented 1 year ago

Right now, I can only profile dataframes that exist in memory as named variables. However, I would like to be able to track any expression that returns a dataframe.

Example:

import pandas as pd

list_of_dfs = []
for i in range(4):
    df_part_a = ...
    df_part_b = ...
    df_part_c = ...
    list_of_dfs.append(pd.concat([df_part_a, df_part_b, df_part_c]))

I want to track list_of_dfs[0], but this is currently not possible.

willeppy commented 1 year ago

One potential design is to expose an API in the notebook like the one below:

import digautoprofile

digautoprofile.track(list_of_dfs[0])

Then, at each timestep, it evaluates this expression and shows a profile.
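To make the idea concrete, here is a minimal sketch of the "re-evaluate at each timestep" behaviour, assuming the expression is passed as a string. The names `track`, `refresh`, and `_tracked` are hypothetical and are not part of digautoprofile; plain lists stand in for dataframes so the snippet runs without pandas.

```python
from typing import Any, Dict

# Hypothetical registry: expression string -> most recently evaluated value.
_tracked: Dict[str, Any] = {}


def track(expression: str) -> None:
    """Register an expression (given as a string) for later re-evaluation."""
    # Compile once so that a syntax error surfaces at registration time.
    compile(expression, "<tracked>", "eval")
    _tracked[expression] = None


def refresh(namespace: Dict[str, Any]) -> Dict[str, Any]:
    """Re-evaluate every tracked expression against the given namespace."""
    for expression in list(_tracked):
        try:
            _tracked[expression] = eval(expression, namespace)
        except Exception:
            # The expression may not be valid yet (e.g. the list is still empty).
            _tracked[expression] = None
    return dict(_tracked)


# Usage: the namespace plays the role of the notebook's variables.
notebook_ns = {"list_of_dfs": ["first", "second"]}
track("list_of_dfs[0]")
latest = refresh(notebook_ns)  # call this once per "timestep"
```

In a real notebook extension, the namespace would probably be the IPython user namespace and `refresh` would run after each cell execution, but that wiring is an assumption and is not shown here.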

A couple of tricky cases:

  1. Is it even possible to get the expression when calling a function in Python without having to pass it as a string?
  2. If we just pass the object, then any updates won't register (because the object id will likely change).
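Both cases can be illustrated with a small sketch. A Python function receives only the evaluated value of its argument, so one possible workaround that avoids strings is to pass a zero-argument callable that is re-run on every lookup. The helpers `track_object` and `track_callable` are hypothetical, and plain lists stand in for dataframes.

```python
from typing import Any, Callable


def track_object(obj: Any) -> Callable[[], Any]:
    """Capture the object itself; later replacements are invisible."""
    return lambda: obj


def track_callable(get_value: Callable[[], Any]) -> Callable[[], Any]:
    """Capture a zero-argument callable that is re-run on every lookup."""
    return get_value


list_of_dfs = [["a"]]  # stand-in for a list of dataframes

by_object = track_object(list_of_dfs[0])
by_callable = track_callable(lambda: list_of_dfs[0])

# Replace the element with a new object, as a non-in-place update would.
list_of_dfs[0] = list_of_dfs[0] + ["b"]

stale = by_object()    # still the original object: the update did not register
fresh = by_callable()  # the replacement object: the expression was re-evaluated
```

The trade-off is that a lambda gives no readable label for the sidebar, whereas a string expression can be displayed as-is.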