Brody-Lab / jbreda_animal_training

Repository for the ingestion, cleaning and visualization of behavioral experiments

Integrate SessionAgg Table #49

Closed jess-breda closed 2 months ago

jess-breda commented 2 months ago

Goal: integrate the DataJoint SessionAgg table into days_df rather than rebuilding days_df daily with inefficient merges.

Steps:

jess-breda commented 2 months ago

Had a good push today. Here are the next steps:

jess-breda commented 2 months ago

Current issue:

SessionAgg table does not update in real time like the other tables.

For example, C222 has session data from 2024-07-04 (because a session accidentally ran past midnight).


C222 has mass data from 2024-07-03


Yet the latest data in SessionAgg is from 2024-07-02.

This is a problem for the daily summary visuals, which need to include the most recent sessions, and for using the SessionAgg table to build the daily water plot in the session-based mega plot.
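
A quick way to spot this kind of lag is to compare the newest date per animal in SessionAgg against the newest date in a source table. The snippet below is only a sketch: the function name, the `animal_id` and `date` column names, and the toy rows are assumptions for illustration, not the real schema or real query results.

```python
import pandas as pd


def find_stale_animals(session_agg_df, source_df, id_col="animal_id", date_col="date"):
    """Return animals whose newest SessionAgg date lags the newest source date.

    Column names are hypothetical; adjust them to the fetched tables.
    """

    def latest_per_animal(df):
        # Parse dates so that max() and subtraction behave as dates, not strings.
        dates = df.assign(**{date_col: pd.to_datetime(df[date_col])})
        return dates.groupby(id_col)[date_col].max()

    report = pd.DataFrame(
        {
            "latest_source": latest_per_animal(source_df),
            "latest_agg": latest_per_animal(session_agg_df),
        }
    )
    report["days_behind"] = (report["latest_source"] - report["latest_agg"]).dt.days
    # Animals missing from SessionAgg entirely show up with NaN days_behind.
    return report[report["days_behind"].isna() | (report["days_behind"] > 0)]


# Toy data mirroring the situation described above (not real query output).
session_agg = pd.DataFrame(
    {"animal_id": ["C222", "C222", "R610"], "date": ["2024-07-01", "2024-07-02", "2024-07-04"]}
)
sessions = pd.DataFrame(
    {"animal_id": ["C222", "C222", "R610"], "date": ["2024-07-03", "2024-07-04", "2024-07-04"]}
)
stale = find_stale_animals(session_agg, sessions)
print(stale)
```

With the toy rows, only C222 is reported, since its SessionAgg rows stop two days before its latest session.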

I sent a message to Alvaro about this to discuss options.

Other than this issue, integration into the current code base with animal checks works well. I am not willing to make the full switch quite yet because I need up-to-date information in the plots.

Once this is complete, I can re-write the fetch_and_format_single_day_water to use the SessionAgg and get rid of the subfunctions:

def fetch_and_format_single_day_water(animal_id, date, verbose=False):
    """
    wrapper function to fetch mass and watering info for a single day
    and return the information in a dataframe for easy plotting
    wrt plot_trials_info/plot_watering_amounts()

    params
    ------
    animal_id: str, e.g. 'R610'
        id of the animal to fetch data for from the mass and water tables
    date: str, e.g. '2021-04-15'
        date to fetch data for from the mass and water tables
    verbose: bool, default False
        whether to print out information about the animal's mass and water
        consumption

    returns
    -------
    df: pd.DataFrame
        dataframe with columns 'date', 'rig_volume', 'pub_volume' to make
        a stacked bar chart with (this is the preferred format for plotting)
    volume_target: float
        target volume of water to be consumed by the animal on given day to
        mark on plot (i.e. the minimum threshold)
    """
    # TODO make this come from days_df once it is created up to date!!
    mass, _ = fetch_day_mass(animal_id, date)
    percent_target = fetch_day_restriction_target(animal_id, date)
    pub_volume = fetch_pub_volume(animal_id, date)
    rig_volume = fetch_rig_volume(animal_id, date, verbose=verbose)
    volume_target = fetch_day_water_target(mass, percent_target, date, verbose=verbose)

    df = pd.DataFrame(
        {"date": [date], "rig_volume": [rig_volume], "pub_volume": [float(pub_volume)]}
    )
    return df, volume_target
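
Once days_df is up to date, the rewrite could reduce to a single row lookup. The version below is a rough sketch, not the planned implementation: the function name and the `animal_id`, `date`, `rig_volume`, `pub_volume`, and `volume_target` columns of days_df are assumed here, and the numbers are toy values.

```python
import pandas as pd


def format_single_day_water_from_days_df(days_df, animal_id, date):
    """Build the plotting dataframe and target volume from one days_df row.

    Assumes (hypothetically) that days_df already holds one row per
    animal per day with the volume columns used below.
    """
    is_day = (days_df["animal_id"] == animal_id) & (days_df["date"].astype(str) == date)
    day = days_df[is_day]
    if len(day) != 1:
        raise ValueError(f"expected one row for {animal_id} on {date}, found {len(day)}")

    row = day.iloc[0]
    # Same output format as the current function: one row, ready for a stacked bar chart.
    df = pd.DataFrame(
        {
            "date": [date],
            "rig_volume": [float(row["rig_volume"])],
            "pub_volume": [float(row["pub_volume"])],
        }
    )
    return df, float(row["volume_target"])


# Toy days_df with made-up values, for illustration only.
days_df = pd.DataFrame(
    {
        "animal_id": ["C222"],
        "date": ["2024-07-02"],
        "rig_volume": [3.5],
        "pub_volume": [6.0],
        "volume_target": [8.0],
    }
)
water_df, volume_target = format_single_day_water_from_days_df(days_df, "C222", "2024-07-02")
```

Raising on a missing or duplicated row keeps a stale table from silently producing an empty plot.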

jess-breda commented 2 months ago

I forgot that Alvaro had already sent info and written code for handling day-of queries to this table. I integrated these functions into the DataJoint utils so that the table is built on the fly when today's data is queried.
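
The routing idea can be sketched roughly as follows. This is not Alvaro's code: `fetch_session_agg_rows`, `fetch_stored`, and `build_on_the_fly` are hypothetical names, and the two callables stand in for the real stored-table query and the on-the-fly aggregation.

```python
import datetime as dt

import pandas as pd


def fetch_session_agg_rows(animal_id, date, fetch_stored, build_on_the_fly, today=None):
    """Use the stored SessionAgg rows for past days and build today's rows on the fly.

    fetch_stored and build_on_the_fly are hypothetical callables taking
    (animal_id, date) and returning a DataFrame.
    """
    today = today or dt.date.today()
    query_date = dt.date.fromisoformat(date) if isinstance(date, str) else date
    if query_date == today:
        # The stored table has not been populated for today yet.
        return build_on_the_fly(animal_id, query_date)
    return fetch_stored(animal_id, query_date)


# Stub callables that only record which path was taken.
def stored_stub(animal_id, date):
    return pd.DataFrame({"animal_id": [animal_id], "date": [date], "source": ["stored"]})


def on_the_fly_stub(animal_id, date):
    return pd.DataFrame({"animal_id": [animal_id], "date": [date], "source": ["on_the_fly"]})


fixed_today = dt.date(2024, 7, 4)
past = fetch_session_agg_rows("C222", "2024-07-02", stored_stub, on_the_fly_stub, today=fixed_today)
current = fetch_session_agg_rows("C222", "2024-07-04", stored_stub, on_the_fly_stub, today=fixed_today)
```

Passing `today` explicitly makes the routing easy to test without depending on the clock.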

Tested with current check files and all seems to be working well 👍