datajoint-company / U19-Data-Viewer


Migration to `sci-viz` #10

Closed by kabilar 1 year ago

kabilar commented 2 years ago

Summary

Migrate Princeton BrainCoGS Data Viewer to the Sci-Viz platform.

Current Data Viewer

[Screenshot: data_viewer_overview]

[Screenshot: data_viewer_subject]

[Screenshot: data_viewer_session]

Requirements

  1. View plots and table on the same page.
  2. Add Princeton-specific plots.
  3. Filter entries based on attributes.
    • Is there an ability to save the filter settings/card?
  4. Conditional highlighting of entries based on a backend query.
  5. Select an entry and update the plots.
  6. Deploy with a docker compose and config file.
  7. Determine routing and certificates for website.
  8. Server requirements for SciViz (4GB RAM, 2 cores).

Time estimation (items numbered to match the requirements above)

  1. Approximately 4 hours to add this feature.
  2. Around 1 hour to configure SciViz to show Princeton's data, plus a few additional hours because the plots are not stored as Plotly JSON in the database, so the Plotly JSON needs to be generated.
    • For each plot (@GeetikaSi):
      • Display the image.
      • List the query used to generate each curve.
      • Convert the Bokeh command used to generate each curve to Plotly.
      • Plotly JSON objects are not stored in a table, but are generated as restrictions are updated.
  3. Complete
  4. Minimal time to add query.
  5. Complete
  6. If DataJoint is hosting, around 1 hour to deploy. Deployment on Princeton infrastructure would take longer.
  7. Speak with PNI help.
  8. Speak with PNI help.
GeetikaSi commented 2 years ago

List of tables used to fetch data for data viewer

kabilar commented 2 years ago

Hi @GeetikaSi, I have edited my comment above after discussion with Jeroen. Please see bullet point 2. Thanks.

GeetikaSi commented 2 years ago

Hi @kabilar, thank you for adding the screenshots of the plots. I'll add the queries by this evening.

GeetikaSi commented 2 years ago

I. Weight water plot

[Screenshot 2022-01-25 at 12:54:09 AM]

DataJoint query

def get_water_data(key):
    water_info = (action.WaterAdministration & key).fetch(format='frame').reset_index()
    data_water = pd.DataFrame({'water_dates': water_info['administration_date'],
                               'earned'     : water_info['earned'],
                               'supplement' : water_info['supplement']})
    data_water['water_dates'] = pd.to_datetime(data_water['water_dates'])
    data_water = data_water.fillna(0)
    return data_water

def get_weight_data(key):
    weight_info = (action.Weighing.proj('weight') &
                   key).fetch(format='frame').reset_index()
    data_weight = pd.DataFrame({'weighing_dates': weight_info['weighing_time'],
                                'weight'        : weight_info['weight']})
    data_weight['weighing_dates'] = pd.to_datetime(data_weight['weighing_dates'])
    return data_weight

Plotting code

p = figure(x_axis_type="datetime", plot_width=600, plot_height=300, title='Water and Weight',
           x_axis_label='Date',
           y_axis_label='Water Intake [mL]')

p.xaxis.formatter = DatetimeTickFormatter(days='%m/%d/%y')

p.y_range = Range1d(0, 5)

# water_methods (the stacked columns of data_water) and colors are
# defined outside this excerpt.
water_plot = p.vbar_stack(water_methods, x='water_dates',
                          width=datetime.timedelta(days=0.4),
                          color=colors, source=data_water,
                          legend_label=water_methods)
p.xgrid.grid_line_color = None
p.outline_line_color = None
p.legend.location = (20, 180)
p.legend.orientation = "horizontal"

p.extra_y_ranges['weight'] = Range1d(20, 35)
p.add_layout(LinearAxis(y_range_name="weight",
                        axis_label='Weight [g]'), 'right')

weight_plot = p.scatter(x='weighing_dates', y='weight',
                        y_range_name="weight",
                        source=data_weight, color='black')

return p, [(water_plot, get_water_data, None),
           (weight_plot, get_weight_data, None)]
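
As a possible starting point for the Plotly conversion of this plot, here is a minimal sketch that builds the figure as a plain dict in Plotly's figure JSON layout (`data` plus `layout`). The function name `water_weight_figure`, the row-dict inputs, and the sample values are assumptions made for illustration and are not part of the existing viewer. The axis ranges and labels are copied from the Bokeh code above.

```python
import json
from datetime import date


def water_weight_figure(water_rows, weight_rows):
    """Build a Plotly-style figure dict (hypothetical helper, not viewer code)."""
    water_dates = [r["water_dates"].isoformat() for r in water_rows]
    bars = [
        {
            "type": "bar",
            "name": method,
            "x": water_dates,
            # None becomes 0, similar in spirit to fillna(0) in get_water_data.
            "y": [r.get(method) or 0 for r in water_rows],
        }
        for method in ("earned", "supplement")
    ]
    weight = {
        "type": "scatter",
        "mode": "markers",
        "name": "weight",
        "x": [r["weighing_dates"].isoformat() for r in weight_rows],
        "y": [r["weight"] for r in weight_rows],
        "yaxis": "y2",  # plotted against the secondary, right-hand axis
        "marker": {"color": "black"},
    }
    layout = {
        "title": {"text": "Water and Weight"},
        "barmode": "stack",  # counterpart of Bokeh's vbar_stack
        "xaxis": {"title": {"text": "Date"}, "type": "date",
                  "tickformat": "%m/%d/%y"},
        "yaxis": {"title": {"text": "Water Intake [mL]"}, "range": [0, 5]},
        "yaxis2": {"title": {"text": "Weight [g]"}, "range": [20, 35],
                   "overlaying": "y", "side": "right"},
    }
    return {"data": bars + [weight], "layout": layout}


# Made-up sample rows, only to show the shape of the inputs.
water = [
    {"water_dates": date(2022, 1, 24), "earned": 1.2, "supplement": None},
    {"water_dates": date(2022, 1, 25), "earned": 0.8, "supplement": 0.5},
]
weight = [
    {"weighing_dates": date(2022, 1, 24), "weight": 24.1},
    {"weighing_dates": date(2022, 1, 25), "weight": 24.3},
]
fig = water_weight_figure(water, weight)
fig_json = json.dumps(fig)
```

Because the result is an ordinary dict, it can be regenerated each time the restriction changes and serialized on the fly, without storing anything in a table.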

II. Performance, trial counts, and task level

[Screenshot 2022-01-25 at 1:16:42 AM]

DataJoint query

def get_data(key):

    task = (dj.U('task') & (acquisition.Session & key)).fetch1('task')
    if task == 'AirPuffs':
        q = (acquisition.Session & key).aggr(puffs.PuffsSession.Trial.proj(), n_trials='count(*)') * acquisition.Session
    else:
        q = (acquisition.Session & key).proj(..., n_trials='num_trials')

    if len(q):
        performance_info = q.fetch(
            format='frame').reset_index()

        data_performance = pd.DataFrame({
            'session_dates': performance_info['session_date'],
            'performance'  : performance_info['session_performance'],
            'level'        : performance_info['level'],
            'n_trials'     : performance_info['n_trials']})
        data_performance['session_dates'] = pd.to_datetime(data_performance['session_dates'])
    else:
        data_performance = {
            'session_dates': [np.nan],
            'performance'  : [np.nan],
            'level'        : [np.nan],
            'n_trials'     : [np.nan]}

    return data_performance

Plotting code

p = figure(x_axis_type="datetime", plot_width=600, plot_height=300, title='Performance, trial counts, and task level', x_axis_label='Date', y_axis_label='Task level', y_axis_location='right')

p.xaxis.formatter = DatetimeTickFormatter(days='%m/%d/%y')

p.y_range = Range1d(0, max([max(data_performance['level']), 10]), min_interval=2)
level_plot = p.vbar(
    x='session_dates', top='level',
    source=data_performance, color='lightblue',
    legend_label='Level', width=datetime.timedelta(days=0.4))

p.extra_y_ranges['performance'] = Range1d(0, 200)
p.extra_y_ranges['n_trials'] = Range1d(0, max([400, max(data_performance['n_trials'])]))

p.add_layout(LinearAxis(y_range_name="performance",
                        axis_label='Performance [%]'), 'left')
p.add_layout(LinearAxis(y_range_name="n_trials",
                        axis_label='Trial counts'), 'left')

performance_plot_line = p.line(
    x='session_dates', y='performance', y_range_name='performance',
    source=data_performance, color='gray', legend_label='Performance')

performance_plot_dot = p.scatter(
    x='session_dates', y='performance', y_range_name='performance',
    source=data_performance, color='black', legend_label='Performance')

trial_counts_plot_line = p.line(
    x='session_dates', y='n_trials', y_range_name='n_trials',
    source=data_performance, color='pink', legend_label='Trial counts')

trial_counts_plot_dot = p.scatter(
    x='session_dates', y='n_trials', y_range_name='n_trials',
    source=data_performance, color='red', legend_label='Trial counts')

p.xgrid.grid_line_color = None
p.outline_line_color = None

p.legend.location = (290, 10)

return p, [(performance_plot_line, get_data, update_view),
           (performance_plot_dot, get_data, update_view),
           (trial_counts_plot_line, get_data, update_view),
           (trial_counts_plot_dot, get_data, update_view),
           (level_plot, get_data, update_view)]
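
A rough Plotly counterpart for this plot could look like the sketch below. It takes a dict shaped like `data_performance` and returns a figure dict with three y axes. The helper names (`finite_or_none`, `axis_top`, `performance_figure`) are hypothetical, and session dates are assumed to arrive as ISO strings. One deliberate difference from the Bokeh code: NaN placeholders are dropped before computing axis limits and written as `null`, since `max()` over NaN values is unreliable and NaN is not valid JSON.

```python
import json
import math


def finite_or_none(values):
    """Replace NaN with None so the figure serializes to valid JSON."""
    return [None if isinstance(v, float) and math.isnan(v) else v
            for v in values]


def axis_top(values, floor):
    """Largest non-missing value, but never below `floor`."""
    present = [v for v in finite_or_none(values) if v is not None]
    return max([floor] + present)


def performance_figure(data):
    """data: dict of equal-length lists keyed like data_performance."""
    x = finite_or_none(data["session_dates"])
    level = finite_or_none(data["level"])
    performance = finite_or_none(data["performance"])
    n_trials = finite_or_none(data["n_trials"])
    traces = [
        {"type": "bar", "name": "Level", "x": x, "y": level,
         "marker": {"color": "lightblue"}},
        # One lines+markers trace replaces the separate Bokeh line and dots.
        {"type": "scatter", "mode": "lines+markers", "name": "Performance",
         "x": x, "y": performance, "yaxis": "y2",
         "line": {"color": "gray"}, "marker": {"color": "black"}},
        {"type": "scatter", "mode": "lines+markers", "name": "Trial counts",
         "x": x, "y": n_trials, "yaxis": "y3",
         "line": {"color": "pink"}, "marker": {"color": "red"}},
    ]
    layout = {
        "title": {"text": "Performance, trial counts, and task level"},
        # Shrink the x domain to leave room for the free-floating third axis.
        "xaxis": {"title": {"text": "Date"}, "type": "date",
                  "tickformat": "%m/%d/%y", "domain": [0.12, 1]},
        "yaxis": {"title": {"text": "Task level"}, "side": "right",
                  "range": [0, axis_top(level, 10)]},
        "yaxis2": {"title": {"text": "Performance [%]"}, "overlaying": "y",
                   "side": "left", "range": [0, 200]},
        "yaxis3": {"title": {"text": "Trial counts"}, "overlaying": "y",
                   "side": "left", "anchor": "free", "position": 0,
                   "range": [0, axis_top(n_trials, 400)]},
    }
    return {"data": traces, "layout": layout}


# The empty-query placeholder from get_data, plus a made-up populated case.
nan = float("nan")
empty = {"session_dates": [nan], "performance": [nan],
         "level": [nan], "n_trials": [nan]}
empty_fig = performance_figure(empty)
empty_json = json.dumps(empty_fig, allow_nan=False)

sample = {"session_dates": ["2022-01-24", "2022-01-25"],
          "performance": [71.0, 80.5], "level": [11, 12],
          "n_trials": [250, 430]}
sample_fig = performance_figure(sample)
```

Passing `allow_nan=False` to `json.dumps` makes serialization fail loudly if a NaN slips through, which may be useful while the conversion is still being checked.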

III. Psychometric curve

[Screenshot 2022-01-25 at 1:22:41 AM]

DataJoint query

def get_psych_data(key, blocks_type=None):

    data = default_data
    q = create_query(key, blocks_type)
    if len(q):
        psych = q.fetch1()
        delta_data = psych['blocks_delta_data'] if blocks_type else psych['session_delta_data']
        pright_data = psych['blocks_pright_data'] if blocks_type else psych['session_pright_data']

        if delta_data is None or type(delta_data) is float:
            return data
        data = {'x': np.squeeze(delta_data).tolist(),
                'y': np.squeeze(pright_data).tolist()}
        if type(data['x']) is float:
            data['x'] = [data['x']]

        if type(data['y']) is float:
            data['y'] = [data['y']]

    return data

def get_psych_error(key, blocks_type=None):

    data = default_data
    q = create_query(key, blocks_type)
    if len(q):
        psych = q.fetch1()
        delta_error = psych['blocks_delta_error'] if blocks_type else psych['session_delta_error']
        pright_error = psych['blocks_pright_error'] if blocks_type else psych['session_pright_error']

        if delta_error is None or type(delta_error) is float:
            return data
        data = {'x': np.squeeze(delta_error).tolist(),
                'y': np.squeeze(pright_error).tolist()}

    return data

def get_psych_fit(key, blocks_type=None):

    data = default_data
    q = create_query(key, blocks_type)
    if len(q):
        psych = q.fetch1()
        delta_fit = psych['blocks_delta_fit'] if blocks_type else psych['session_delta_fit']
        pright_fit = psych['blocks_pright_fit'] if blocks_type else psych['session_pright_fit']
        if delta_fit is None or type(delta_fit) is float:
            return data
        data = {'x': np.squeeze(delta_fit).tolist(),
                'y': np.squeeze(pright_fit).tolist()}
    return data

Plotting code

def psych_curve(psych_data, psych_error, psych_fit, title, label=None):
    p = figure(plot_width=550, plot_height=300,
               title=title,
               x_axis_label='#R - #L',
               y_axis_label='% went R')

    p.y_range = Range1d(0, 100)

    fit_plot = p.line(x='x', y='y',
                      source=psych_fit, color='gray', legend_label='Fit')

    error_plot = p.line(x='x', y='y',
                        source=psych_error, color='gray', legend_label='Data')

    data_plot = p.scatter(x='x', y='y',
                          source=psych_data, color='black', legend_label='Data')

    if label:
        subject_label = Label(x_offset=-180, y_offset=200, text=label, text_font_size='9pt')
        p.add_layout(subject_label)

    p.xgrid.grid_line_color = None
    p.outline_line_color = None

    p.legend.location = (330, 10)

    if label:
        return p, [data_plot, error_plot, fit_plot, subject_label]
    else:
        return p, [data_plot, error_plot, fit_plot]
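
For the psychometric curve, a Plotly version might be sketched as follows. `psych_curve_figure` is a hypothetical name, and the sketch assumes each source is a dict whose `x` and `y` are already lists (a scalar left over from `np.squeeze(...).tolist()` would need wrapping first). The label position and font size are rough guesses rather than an exact match for the Bokeh offsets.

```python
import json


def psych_curve_figure(psych_data, psych_error, psych_fit, title, label=None):
    """Build a Plotly-style figure dict from three {'x': [...], 'y': [...]} sources."""
    def points(source):
        return {"x": list(source["x"]), "y": list(source["y"])}

    traces = [
        {"type": "scatter", "mode": "lines", "name": "Fit",
         "line": {"color": "gray"}, **points(psych_fit)},
        # The Bokeh code gives the error line and the dots the same 'Data'
        # legend label; a shared legendgroup with one visible entry mimics that.
        {"type": "scatter", "mode": "lines", "name": "Data",
         "legendgroup": "data", "showlegend": False,
         "line": {"color": "gray"}, **points(psych_error)},
        {"type": "scatter", "mode": "markers", "name": "Data",
         "legendgroup": "data",
         "marker": {"color": "black"}, **points(psych_data)},
    ]
    layout = {
        "title": {"text": title},
        "xaxis": {"title": {"text": "#R - #L"}},
        "yaxis": {"title": {"text": "% went R"}, "range": [0, 100]},
    }
    if label:
        # Paper coordinates pin the text to the plot area, not to the data.
        layout["annotations"] = [{
            "text": label, "xref": "paper", "yref": "paper",
            "x": 0.02, "y": 0.98, "showarrow": False,
            "font": {"size": 12},  # approximate stand-in for Bokeh's '9pt'
        }]
    return {"data": traces, "layout": layout}


# Made-up values, only to show the expected input shape.
points_data = {"x": [-10, 0, 10], "y": [12.0, 48.0, 91.0]}
points_error = {"x": [-10, 0, 10], "y": [15.0, 50.0, 88.0]}
points_fit = {"x": [-10, -5, 0, 5, 10], "y": [10.0, 25.0, 50.0, 75.0, 90.0]}
psych_fig = psych_curve_figure(points_data, points_error, points_fit,
                               "Psychometric curve", label="subject_1")
psych_json = json.dumps(psych_fig)
```

If the `plotly` package is available, a dict of this shape can be handed to `plotly.graph_objects.Figure(...)` to check the rendering before wiring it into SciViz.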
kabilar commented 2 years ago

Thank you @GeetikaSi!

kabilar commented 1 year ago

Closing as this migration is not currently a priority.