AllenInstitute / visual_behavior_analysis

Python package for analyzing behavioral data for Brain Observatory: Visual Behavior

Remove dependency on computer list in `devices` #509

Closed dougollerenshaw closed 5 years ago

dougollerenshaw commented 5 years ago

Currently, visual behavior relies on a hard-coded dictionary linking computer name to 'Rig ID'. The dictionary lives in 'devices': https://github.com/AllenInstitute/visual_behavior_analysis/blob/master/visual_behavior/devices.py

MPE is maintaining a list of computers and rig IDs in a network location. We should use that list instead. I'll follow up with a link to the MPE location in a comment.

nicain commented 5 years ago

Some of this data is modeled in LIMS, and I am accessing it directly with the code used to build the NWB files for Visual Behavior Ophys sessions. If there is a need for this information, I can try to build a robust access point (API endpoint) in allensdk. Let me know if this would be of use. My ideal scenario would be to review a PR onto allensdk by someone interested in implementing this (I can help if need be), but I want us to get out of the habit of scraping files for data/metadata, and start using our database system directly.

dougollerenshaw commented 5 years ago

Getting it from LIMS would be great if possible. I agree that relying on existing infrastructure is ideal.

dougollerenshaw commented 5 years ago

Just to get it in the same place, here's an email thread from January that died out:

That seems like an entirely reasonable approach. If it works for everyone else, please go ahead and submit a work request via the Instrumentation SharePoint page.

Colin Farrell Director, Manufacturing & Process Engineering

From: Nicholas Cain
Sent: Monday, January 21, 2019 12:37 PM
To: Colin Farrell colinf@alleninstitute.org
Cc: Wayne Wakeman waynew@alleninstitute.org; Doug Ollerenshaw dougo@alleninstitute.org; David Feng davidf@alleninstitute.org
Subject: Re: Accessing sharepoint programmatically as a dependency of a python repo:

One option is to make a foraging2 feature request to put aibs_rig_id and/or aibs_comp_id in the pickle file, and then I update visual_behavior_analysis to utilize this new feature. Then we don’t have to depend on any other databases.

Colin, how does this sound to you?

Best, -n

On Jan 21, 2019, at 12:33 PM, Colin Farrell colinf@alleninstitute.org wrote:

As I understand it, the only way to get to SharePoint information is with a username and password. I can work with the IT team to get a read-only user for whom we can make ‘public’ the password, however, it might be best if we got this information to you in another way. We always know from which rig we are uploading data. Let me know if we need to do something different at data-upload time.

Colin.

Colin Farrell Director, Manufacturing & Process Engineering

From: Wayne Wakeman
Sent: Monday, January 21, 2019 12:16 PM
To: Nicholas Cain nicholasc@alleninstitute.org; Colin Farrell colinf@alleninstitute.org
Cc: Doug Ollerenshaw dougo@alleninstitute.org; David Feng davidf@alleninstitute.org
Subject: Re: Accessing sharepoint programmatically as a dependency of a python repo:

Colin, do you know any programmatic way to access this table?

From: Nicholas Cain
Sent: Monday, January 21, 2019 12:09:47 PM
To: Wayne Wakeman; David Feng
Cc: Doug Ollerenshaw
Subject: Accessing sharepoint programmatically as a dependency of a python repo:

MPE has informed Doug and me about a SharePoint page that contains a table relating aibs_rig_id to ComputerName:

https://alleninstitute.sharepoint.com/sites/Instrumentation/Lists/AIBSManagedComputers/Allitems.aspx

Turns out that this is a relationship necessary for identifying where a given session was collected from in visual_behavior_analysis.

Right now this is hard-coded in a .py file in the repo, but it would be nice to read this programmatically. Are you aware of an open API for accessing SharePoint data? There is a Python package that can access the data, but it seems to require a username and password.

Best, -n

dougollerenshaw commented 5 years ago

Also, my motivation for pushing on this again is that MPE is about to update all pipeline computers to Windows 10. The computer IDs will all change at the same time. Any infrastructure depending on the hard-coded computer name dictionary in devices.py will break.

nicain commented 5 years ago

@dougollerenshaw I was incorrect; "rig" for a behavior session in LIMS corresponds to cluster, so I cannot access computer id from a LIMS call. For reference, this is what I was thinking:

SELECT equipment.name FROM behavior_sessions
LEFT JOIN equipment ON equipment.id = behavior_sessions.equipment_id

nicain commented 5 years ago

SharePoint work request that resolves aibs_rig_id and aibs_computer_id:

https://web.powerapps.com/webplayer/app?hidenavbar=true&RequestID=47&appId=%2fproviders%2fMicrosoft.PowerApps%2fapps%2fbe88d2e1-5c4c-41ba-9757-cf5d85ea99e0

dougollerenshaw commented 5 years ago

Passing off to @nickponvert.

nickponvert commented 5 years ago

I checked some recent pickle files and only ever saw the 'rig_id' field as 'unknown'. Here is an example from 2019-04-02 collected on rig 2P.4 (which should meet the definition of a production system):

\allen\programs\braintv\production\visualbehavior\prod0\specimen_814111935\ophys_session_844465368\844465368_stim.pkl

When I load this file and look at the 'platform_info' field, here is what I am getting:

In [108]: data['platform_info']
Out[108]: {'hardware': ('Intel64 Family 6 Model 63 Stepping 2, GenuineIntel', 'AMD64'), 'camstim': '0.5.1', 'pyglet': '1.2.4', 'computer_name': 'W7DT2P4STIM', 'opengl': '4.6.0 NVIDIA 411.63', 'python': '2.7.13', 'rig_id': 'unknown', 'os': ('Windows', '7', '6.1.7601'), 'psychopy': '1.82.01', 'camstim_git_hash': None}

Not totally sure I am looking in the right place in the pickle file for the rig_id. I added a comment on the resolved work request with this same information.

https://web.powerapps.com/webplayer/app?hidenavbar=true&RequestID=47&appId=%2fproviders%2fMicrosoft.PowerApps%2fapps%2fbe88d2e1-5c4c-41ba-9757-cf5d85ea99e0

nicain commented 5 years ago

@rhytnen Any idea if this data should be written by 2p4?

@nickponvert Try one from 2p5.

nickponvert commented 5 years ago

The ['platform_info']['rig_id'] field is 'unknown' for pkl files generated on 2019-04-02 from:

2P5 (\allen\programs\braintv\production\visualbehavior\prod0\specimen_803314654\ophys_session_844460028)

2P3 (\allen\programs\braintv\production\visualbehavior\prod0\specimen_813703544\ophys_session_844468346)

and also from one of the behavior rigs collected on 2019-04-03 (B2 on MouseSeeks, BEH.B on LIMS) (\allen\programs\braintv\production\neuralcoding\prod0\specimen_823836023\behavior_session_845588229)
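A check like the one above could be scripted roughly as follows. This is a minimal sketch, not code from visual_behavior_analysis: `platform_rig_id` is a hypothetical helper, and it assumes the files are plain pickles written under Python 2 (hence `encoding="latin1"` when loading under Python 3). The demo writes a small stand-in file with the fields quoted above rather than reading the production paths.

```python
import os
import pickle
import tempfile


def platform_rig_id(pkl_path):
    """Return platform_info['rig_id'] from a stimulus pickle, or None if absent.

    Hypothetical helper, for illustration only.
    """
    with open(pkl_path, "rb") as f:
        # encoding="latin1" lets Python 3 read pickles written by Python 2.
        data = pickle.load(f, encoding="latin1")
    return data.get("platform_info", {}).get("rig_id")


# Stand-in file mimicking the fields quoted above (not a real session file).
demo = {"platform_info": {"computer_name": "W7DT2P4STIM", "rig_id": "unknown"}}
with tempfile.NamedTemporaryFile(suffix=".pkl", delete=False) as tmp:
    pickle.dump(demo, tmp, protocol=2)
    demo_path = tmp.name

demo_rig_id = platform_rig_id(demo_path)
os.remove(demo_path)
print(demo_rig_id)  # unknown
```

Looping this over a list of session paths would show at a glance which rigs still report 'unknown'.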

nickponvert commented 5 years ago

Camstim version 0.5.1 is now saving rig_id and comp_id as top-level fields in the behavior pickle files, but not in the ophys pickle files. I didn't know about the behavior pickle files, so I was only looking at the ophys xxxxx_stim.pkl files. Both file types also contain the 'platform_info' field, which contains its own 'rig_id'. In both file types that field is currently reading the wrong environment variable, so its value is 'unknown'.
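Given that, code reading these files would probably want to prefer the top-level field and treat 'unknown' as missing. Here is a minimal sketch of that precedence. `resolve_rig_id` and `FALLBACK_RIGS` are hypothetical names, not part of visual_behavior_analysis. The single fallback entry only mirrors values quoted in this thread, and upper-casing the computer name before the lookup is an assumption.

```python
# Hypothetical fallback table keyed by upper-cased computer name.
FALLBACK_RIGS = {"W7DT2P3STIM": "2P3"}


def resolve_rig_id(data, fallback=FALLBACK_RIGS):
    """Pick a rig ID from an un-pickled session dict.

    Order: top-level 'rig_id', then platform_info['rig_id'], then a lookup
    by computer name. 'unknown' and missing values are skipped.
    """
    platform_info = data.get("platform_info", {})
    for candidate in (data.get("rig_id"), platform_info.get("rig_id")):
        if candidate and candidate != "unknown":
            return candidate
    computer_name = platform_info.get("computer_name", "")
    return fallback.get(computer_name.upper(), "unknown")


# Trimmed-down stand-ins for the behavior and ophys pickle contents.
behavior = {"rig_id": "CAM2P.3",
            "platform_info": {"rig_id": "unknown", "computer_name": "W7DT2P3STiM"}}
ophys = {"platform_info": {"rig_id": "unknown", "computer_name": "W7DT2P3STiM"}}

print(resolve_rig_id(behavior))  # CAM2P.3
print(resolve_rig_id(ophys))     # 2P3
```

Note that the two paths return differently formatted IDs for the same rig, so anything comparing rig IDs across file types would need to normalize them.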

- Ophys pickle file

In [5]: odata['start_time']
Out[5]: datetime.datetime(2019, 4, 10, 13, 34, 39, 264000)

In [6]: odata['platform_info']
Out[6]: {'hardware': ('Intel64 Family 6 Model 63 Stepping 2, GenuineIntel', 'AMD64'), 'camstim': '0.5.1', 'pyglet': '1.2.4', 'computer_name': 'W7DT2P3STiM', 'opengl': '4.6.0 NVIDIA 390.77', 'python': '2.7.13', 'rig_id': 'unknown', 'os': ('Windows', '7', '6.1.7601'), 'psychopy': '1.82.01', 'camstim_git_hash': None}


- Behavior pickle file (/allen/programs/braintv/production/visualbehavior/prod0/specimen_820878213/behavior_session_849147676/848894137.pkl)
```python
In [15]: data.keys()                                                                   
Out[15]: dict_keys(['comp_id', 'unpickleable', 'items', 'start_time', 'script', 'rig_id', 'threads', 'stop_time', 'session_uuid', 'platform_info'])

In [16]: data['start_time']                                                            
Out[16]: datetime.datetime(2019, 4, 10, 13, 34, 39, 264000)

In [17]: data['rig_id']                                                                
Out[17]: 'CAM2P.3'

In [18]: data['comp_id']                                                               
Out[18]: 'CAM2P.3-STIM'

In [19]: data['platform_info']                                                         
Out[19]: 
{'camstim': '0.5.1',
 'opengl': '4.6.0 NVIDIA 390.77',
 'python': '2.7.13',
 'rig_id': 'unknown',
 'hardware': ('Intel64 Family 6 Model 63 Stepping 2, GenuineIntel', 'AMD64'),
 'pyglet': '1.2.4',
 'computer_name': 'W7DT2P3STiM',
 'os': ('Windows', '7', '6.1.7601'),
 'psychopy': '1.82.01',
 'camstim_git_hash': None}
```

nickponvert commented 5 years ago

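We should be aware of the following inconsistencies between the new top-level fields and the existing lookup (this is after un-pickling a behavior PKL file): 'rig_id' is 'CAM2P.3' but devices.get_rig_id returns '2P3', and 'comp_id' is 'CAM2P.3-STIM' but 'computer_name' is 'W7DT2P3STiM'.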

In [17]: data['rig_id']                                                                
Out[17]: 'CAM2P.3'

In [18]: from visual_behavior import devices                                           

In [19]: data['comp_id']                                                               
Out[19]: 'CAM2P.3-STIM'

In [20]: data['platform_info']['computer_name']                                        
Out[20]: 'W7DT2P3STiM'

In [21]: comp = data['platform_info']['computer_name']                                 

In [22]: devices.get_rig_id(comp)                                                      
Out[22]: '2P3'

dougollerenshaw commented 5 years ago

fixed by PR #531