LorenFrankLab / spyglass

Neuroscience data analysis framework for reproducible research built by Loren Frank Lab at UCSF
https://lorenfranklab.github.io/spyglass/
MIT License

A more pythonic user interface #529

Open · khl02007 opened this issue 1 year ago

khl02007 commented 1 year ago

Is your feature request related to a problem? Please describe.

Most users who know Python do not know the DataJoint-specific syntax required to interact with the pipelines. A more pythonic, object-oriented interface could improve the user experience.
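
For contrast, the current interface relies on DataJoint restriction syntax. Here is a minimal sketch of today's equivalent, based on the existing tutorials (sgi.insert_sessions and the common tables exist in spyglass; details such as file-copy naming are glossed over):

import spyglass.common as sgc
import spyglass.data_import as sgi

# current interface: ingest the file, then restrict tables by a key dict
sgi.insert_sessions('test.nwb')
print(sgc.Session & {'nwb_file_name': 'test.nwb'})
print(sgc.Electrode & {'nwb_file_name': 'test.nwb'})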

Example

Ingesting an NWB file returns a spyglass session object:

import spyglass as sg
nwb_file = 'test.nwb'
session = sg.insert_session(nwb_file)

One can access everything about the session (e.g. information in the common tables) as attributes of the session object. Accessing one of these attributes returns the corresponding DataJoint table, so printing it shows the restricted query. For example, session.electrode prints spyglass.common.Electrode & {'nwb_file_name': 'test.nwb'}.
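
One way to get there is a thin wrapper whose properties return the corresponding tables restricted to the session's primary key. A minimal sketch; this Session wrapper class and its attribute names are hypothetical, not existing spyglass API:

import spyglass.common as sgc

class Session:
    """Hypothetical per-session wrapper."""

    def __init__(self, nwb_file_name):
        # every attribute below restricts on this primary key
        self.key = {'nwb_file_name': nwb_file_name}

    @property
    def electrode(self):
        # printing this shows the restricted DataJoint table,
        # i.e. spyglass.common.Electrode & {'nwb_file_name': ...}
        return sgc.Electrode & self.key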

To do analysis, set parameters by calling methods associated with the session object:

session.insert_electrode_group(electrode_group_name='tetrode1', electrode_ids=[1,2,3,4])
session.insert_interval(interval_name='epoch1', interval=[10,20])
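
These setters could likewise be thin wrappers over inserts into the existing common tables. A sketch continuing the hypothetical Session class above; the field names follow spyglass.common.IntervalList:

import numpy as np
import spyglass.common as sgc

def insert_interval(self, interval_name, interval):
    # hypothetical method: a single insert1 into IntervalList,
    # keyed by the session's nwb_file_name
    sgc.IntervalList.insert1(
        {**self.key,
         'interval_list_name': interval_name,
         'valid_times': np.array([interval])},
        skip_duplicates=True,
    )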

and then run the computation:

# this runs spyglass.lfp.LFP.populate() under the hood
session.compute_lfp(electrode_group_name='tetrode1', interval_name='epoch1', filter_param_name='default')
# see results
print(session.lfp)  # this prints spyglass.lfp.LFP & {'nwb_file_name': 'test.nwb'}
# select a computed LFP by index and create a visualization
session.lfp[0].generate_figurl()
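
compute_lfp could follow DataJoint's usual selection-then-populate pattern. A sketch: spyglass.lfp.LFP.populate() is named in the comment above, but the LFPSelection table and its field names here are assumptions:

import spyglass.lfp as lfp

def compute_lfp(self, electrode_group_name, interval_name, filter_param_name):
    # hypothetical: extend the session key into a selection entry
    key = {**self.key,
           'electrode_group_name': electrode_group_name,
           'interval_name': interval_name,
           'filter_param_name': filter_param_name}
    lfp.LFPSelection.insert1(key, skip_duplicates=True)
    # the populate() call mentioned in the example above
    lfp.LFP.populate(key)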

Later, the user can reinstantiate the session object from the NWB file.

session = sg.load_session('test.nwb')
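
load_session could then be little more than an existence check plus the constructor. A sketch, reusing the hypothetical Session class from the earlier sketch:

import spyglass.common as sgc

def load_session(nwb_file_name):
    # hypothetical: fail fast if the file was never ingested
    if len(sgc.Session & {'nwb_file_name': nwb_file_name}) == 0:
        raise ValueError(f'{nwb_file_name} has not been ingested')
    return Session(nwb_file_name)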

Advantages

Disadvantages

rly commented 1 year ago

I like this idea and think it would improve the experience for the average user who just wants to run code on their data, as with a toolbox.

Another disadvantage is that it is more code to document, test, and maintain.

Related: https://github.com/LorenFrankLab/spyglass/issues/518

edeno commented 1 year ago

@CBroz1 this seemed to be related to some of your goals in the long/medium term. Could you comment here?

CBroz1 commented 1 year ago

I think of 3 categories of users...

  1. Power users who design pipelines and are required to learn the nuances of DataJoint to do so. They likely won't need such additions.
  2. Analysis users who take advantage of an existing pipeline to run an analysis. Maybe they declare a few custom tables to do so. As mentioned above, novel queries/joins might be required here (see the sketch after this list).
  3. Entry users who want the easiest way possible (often GUIs) to add session data. Simplified functions would be helpful. But are they approaching the experience with a sufficient Python background that non-pythonic interfaces are the hang-up? In my experience, coding in general is the learning curve.
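
To make point 2 concrete, here is the kind of cross-table query an analysis user might face in plain DataJoint. A sketch: Session and ElectrodeGroup are real spyglass.common tables, but the join and restriction are illustrative:

import spyglass.common as sgc

# join every session to its electrode groups, then restrict --
# exactly the step a per-session wrapper would not cover
tetrodes = sgc.Session * sgc.ElectrodeGroup & {'description': 'tetrode'}
print(tetrodes)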

If my assumptions are correct, I would start with the goal of making 'everyday use' notebooks for analysis and entry users, covering data entry and queries, and work backwards to whatever new methods would reduce the complexity of those notebooks as much as possible.

khl02007 commented 1 year ago

I actually think users from categories 2 and 3 would benefit greatly from this. I can't remember an instance when I needed to do joins for my analysis, and I only do queries occasionally. From my perspective, the primary benefit of using a relational database backend is provenance rather than the ability to do fancy database operations.

Most users just want to run their NWB file through the workflows; queries only become useful once you start doing things like analyzing across multiple NWB files. I agree that for entry users, coding is the primary barrier, but maybe that's precisely why we should try to make the interface easier to pick up (yes, I think pythonic = more intuitive = easier). For them, a table is a rather obscure thing to work with.
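
For what it's worth, the across-file case is one query in plain DataJoint: restricting by a list acts as an OR. A sketch; restriction-by-list is standard DataJoint behavior, and the file names are made up:

import spyglass.lfp as lfp

# one query spanning several NWB files -- the case where queries pay off
files = ['animal1_day1.nwb', 'animal1_day2.nwb']
print(lfp.LFP & [{'nwb_file_name': f} for f in files])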