AllenInstitute / AllenSDK

code for reading and processing Allen Institute for Brain Science data
https://allensdk.readthedocs.io/en/latest/

add more information to ophys_cells_table cache/manifest table #2543

Closed DowntonCrabby closed 1 year ago

DowntonCrabby commented 1 year ago

Describe the use case that is addressed by this feature. Currently the ophys_cells_table (cache/manifest summary table) contains very little information; it only contains the following columns:

While this is a good place to start, it would be extremely useful to add more information so that external users have a summary table with enough information to filter data and get a sense of how many cells meet their analysis criteria.

Describe the solution you'd like: We would like the following information from the ophys_experiments_table (cache/manifest summary table) added to the ophys_cells_table.

We would like to add information from the cell_specimen_table (an ophys_experiment attribute) to the ophys_cells_table (cache/manifest summary table). Specifically we would like to add the following columns:

Additional context: For updating the documentation, here are all the column names, types, and descriptions for the newly added columns:

Additionally, I will be opening a ticket about imaging_depth shortly and will link to it in the comments once it is open. We would like to use the averaged depth for a container rather than individual ophys experiment depths.

Do you want to work on this issue? I will be out of office, so I am probably not the most reliable person to work on this.

DowntonCrabby commented 1 year ago

tagging @matchings so she can follow

morriscb commented 1 year ago

Hey @DowntonCrabby, I was able to add the ROI position/size information to the table. Pika's stance on adding the other columns is that, since they are already available in the other table and can be merged into this one by the user, we are hesitant to duplicate the information. Adding them would put data that is not specific to a given ROI into an ROI table, and we'd like to limit the number of times we put extra data into a table. When we update the notebooks for the new release, we can include an example of merging the data from the experiment pandas table into the ROI table.
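For illustration, a minimal sketch of the kind of merge described here, using small made-up pandas tables rather than the real cache. The column names (cell_roi_id, cell_specimen_id, ophys_experiment_id, targeted_structure, imaging_depth) and all values are assumptions chosen for the example, not a statement of what the released tables contain.

```python
import pandas as pd

# Toy stand-in for the ophys_cells_table: one row per ROI.
# Column names and values are assumptions for illustration only.
cells = pd.DataFrame(
    {
        "cell_roi_id": [1, 2, 3, 4],
        "cell_specimen_id": [101, 102, 101, 103],
        "ophys_experiment_id": [10, 10, 11, 11],
    }
).set_index("cell_roi_id")

# Toy stand-in for the experiment table: one row per experiment.
experiments = pd.DataFrame(
    {
        "ophys_experiment_id": [10, 11],
        "targeted_structure": ["VISp", "VISl"],
        "imaging_depth": [175, 375],
    }
).set_index("ophys_experiment_id")

# DataFrame.join with `on=` matches a column of the caller against the
# index of the other table. The default is a left join, so every ROI row
# is kept and the ROI index is preserved.
cells_with_metadata = cells.join(experiments, on="ophys_experiment_id")

print(cells_with_metadata)
```

Keeping the experiment-level columns in their own table and joining on demand avoids storing the same experiment metadata once per ROI.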

Additionally, doesn't the existence of the ophys_cells_table remove the need for issue #2544? The ROIs returned in this table are required to be valid, so the value you are asking for in that ticket is the number of ROIs per experiment in this table. If we put an example of this calculation in the notebooks, would that suffice?
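A rough sketch of what that calculation could look like, assuming the cells table has one row per valid ROI and an ophys_experiment_id column. The table below is made-up data, and the n_cells name is hypothetical.

```python
import pandas as pd

# Toy stand-in for the ophys_cells_table (assumed: one row per valid ROI).
cells = pd.DataFrame(
    {
        "cell_roi_id": [1, 2, 3, 4, 5],
        "cell_specimen_id": [101, 102, 103, 101, 104],
        "ophys_experiment_id": [10, 10, 10, 11, 11],
    }
).set_index("cell_roi_id")

# Count rows (ROIs) per experiment; "n_cells" is a hypothetical name.
cells_per_experiment = (
    cells.groupby("ophys_experiment_id").size().rename("n_cells")
)

print(cells_per_experiment)
```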

matchings commented 1 year ago

@morriscb I am fine with keeping the columns limited to ones that are unique to cell ROIs. I will say that @DowntonCrabby felt that including them was helpful for users and made things more accessible. But I think including a clearly documented example of merging it with the other tables to get the relevant information is a good compromise.

I don't think the existence of the ophys_cells_table invalidates #2544. It would be very useful for users to have quick access to that information as something they could sort or filter the ophys_experiment_table by. One example is selecting experiments for a population decoding analysis, which requires a certain number of neurons to be meaningful. There are additional use cases that I won't go into, but I think there are plenty of reasons to consider the number of neurons as a piece of metadata that is used for data selection, rather than as something you can compute from a table.
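To make the data-selection use case concrete, here is a small sketch of filtering an experiment table by cell count. It uses made-up tables; the column names, the n_cells column, and the min_cells threshold are all assumptions for illustration.

```python
import pandas as pd

# Toy stand-ins for the two summary tables (assumed column names).
cells = pd.DataFrame(
    {
        "cell_roi_id": [1, 2, 3, 4],
        "ophys_experiment_id": [10, 10, 10, 11],
    }
).set_index("cell_roi_id")

experiments = pd.DataFrame(
    {
        "ophys_experiment_id": [10, 11, 12],
        "targeted_structure": ["VISp", "VISl", "VISp"],
    }
).set_index("ophys_experiment_id")

# Number of ROIs per experiment; experiments with no rows in the cells
# table get a count of 0 instead of NaN.
n_cells = cells.groupby("ophys_experiment_id").size()
experiments["n_cells"] = n_cells.reindex(experiments.index, fill_value=0)

# Hypothetical threshold for a population decoding analysis.
min_cells = 3
selected = experiments[experiments["n_cells"] >= min_cells]

print(selected)
```

Whether the count ships as a column or is computed this way, the filtering step itself is the same.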

aamster commented 1 year ago

@matchings the number of cells per experiment is easily calculated from the ophys_cells_table. I am with @morriscb that the onus should be on the user to do this calculation if they need it for their analysis. We are giving them all the data necessary to calculate this.

morriscb commented 1 year ago

Hi folks, I added the x, y, width, and height columns and merged them into the release candidate branch. I'll post a summary of Adam's and my thoughts on #2544 and we can continue the discussion there.