AllenInstitute / coupledAE-patchseq

Multimodal data alignment and cell type analysis with coupled autoencoders.

Request for Guidance on Extracting Electrophysiological Data from NWB Files Related to CoupledAE-Patchseq #5

Open beilouer opened 1 month ago

beilouer commented 1 month ago

Dear Developers,

I came across several important files while reading the research literature on coupledAE-patchseq, including ipfx_features.csv, sparse_pca_components_mMET_revision_Apr2020_v2.csv, and spca_loadings_mMET_revision_Apr2020_v2.pkl. I believe these files are related to the 44 sparse principal components and 24 intrinsic physiological features mentioned in the literature.

I would like to know how to analyze the original electrophysiological data from the NWB files to obtain these data. The original text mentions that these data can be analyzed using the IPFX library. Could you provide more detailed corresponding code scripts to assist me in the analysis?

Thank you for your help and guidance!

rhngla commented 1 month ago

Hello - thanks for your interest in our work. Please see the comments in this related issue.

beilouer commented 1 month ago

@rhngla Thank you very much for your help. I now have a better understanding of how to analyze the SPCs. However, I still have one question regarding the 24 features mentioned in the article.

I ran the command python -m ipfx.bin.run_pipeline_from_nwb_file, which comes from the IPFX library, and it generated the output.json file. After reviewing the data_proc_E.ipynb file, I found that ipfx_features.csv contains the following 24 features:

['ap_1_threshold_v_short_square', 'ap_1_peak_v_short_square', 'ap_1_upstroke_short_square', 'ap_1_downstroke_short_square', 'ap_1_upstroke_downstroke_ratio_short_square', 'ap_1_width_short_square', 'ap_1_fast_trough_v_short_square', 'short_square_current', 'input_resistance', 'tau', 'v_baseline', 'sag_nearest_minus_100', 'sag_measured_at', 'rheobase_i', 'ap_1_threshold_v_0_long_square', 'ap_1_peak_v_0_long_square', 'ap_1_upstroke_0_long_square', 'ap_1_downstroke_0_long_square', 'ap_1_upstroke_downstroke_ratio_0_long_square', 'ap_1_width_0_long_square', 'ap_1_fast_trough_v_0_long_square', 'avg_rate_0_long_square', 'latency_0_long_square', 'stimulus_amplitude_0_long_square']

However, I am finding it difficult to match many of these features directly to entries in the output.json file generated by the IPFX pipeline. For instance, 'ap_1' does not appear anywhere in the file, and certain variables, such as 'width', appear under multiple sweeps. I am unsure which sweep I should select in these cases. Could you kindly offer guidance on how to resolve this? I would greatly appreciate your assistance. Thank you very much!
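One way to see where a name such as 'width' occurs in output.json is to walk the parsed JSON and list every key path that contains the substring. The sketch below assumes only that the file is valid JSON and makes no claim about how IPFX lays it out; the helper name find_key_paths is illustrative and not part of IPFX.

```python
import json


def find_key_paths(node, needle, path=()):
    """Yield (path, value) for every dict key whose name contains `needle`.

    `path` is a tuple of dict keys and list indices leading to the match.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            new_path = path + (key,)
            if needle in str(key):
                yield new_path, value
            # Keep descending so nested matches are reported too.
            yield from find_key_paths(value, needle, new_path)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_key_paths(item, needle, path + (index,))


# Example usage (file name taken from the comment above):
# with open("output.json") as f:
#     data = json.load(f)
# for key_path, value in find_key_paths(data, "width"):
#     print("/".join(str(part) for part in key_path))
```

Printing the paths rather than the values makes it easier to tell whether a match sits in a per-cell section or is repeated once per sweep.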

beilouer commented 1 month ago

@rhngla I apologize for disturbing you again. After carefully reviewing the cell_record section of the output.json file obtained from the IPFX package, I have made some guesses about how the 24 electrophysiological features mentioned in the paper correspond to the variables in the cell_record section:

ap_1_threshold_v_short_square corresponds to threshold_v_short_square
ap_1_peak_v_short_square corresponds to peak_v_short_square
ap_1_upstroke_short_square corresponds to upstroke_downstroke_ratio_short_square
ap_1_downstroke_short_square corresponds to trough_v_short_square
ap_1_upstroke_downstroke_ratio_short_square corresponds to upstroke_downstroke_ratio_short_square
ap_1_width_short_square doesn't have a direct match, but it can be calculated as: ap_1_width_short_square = peak_t_short_square − threshold_t_short_square
ap_1_fast_trough_v_short_square corresponds to fast_trough_v_short_square
ap_1_threshold_v_0_long_square corresponds to threshold_v_long_square
ap_1_peak_v_0_long_square corresponds to peak_v_long_square
ap_1_upstroke_0_long_square corresponds to upstroke_downstroke_ratio_long_square
ap_1_downstroke_0_long_square corresponds to trough_v_long_square
ap_1_upstroke_downstroke_ratio_0_long_square corresponds to upstroke_downstroke_ratio_long_square
ap_1_width_0_long_square doesn't have a direct match, but it can be calculated as: ap_1_width_0_long_square = peak_t_long_square − threshold_t_long_square
ap_1_fast_trough_v_0_long_square corresponds to fast_trough_v_long_square
avg_rate_0_long_square corresponds to multiple sweeps, and I am unsure how to calculate it
latency_0_long_square corresponds to latency
stimulus_amplitude_0_long_square corresponds to threshold_i_long_square
short_square_current corresponds to threshold_i_short_square
input_resistance corresponds to input_resistance_mohm
tau corresponds to tau
v_baseline corresponds to vrest
sag_nearest_minus_100 corresponds to sag
sag_measured_at corresponds to vm_for_sag
rheobase_i corresponds to threshold_i_long_square

I am not sure whether these matches are correct, and I would greatly appreciate your guidance in verifying them. For the calculations I am unsure of, could you please explain how to perform them? Thank you very much for your time and help.
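A guessed mapping like the one above can at least be checked mechanically against a real cell_record, to see which source keys exist in the file. The sketch below is only that kind of check: PROPOSED_MAPPING copies a subset of the unverified guesses from this comment, and find_first and apply_mapping are illustrative helpers, not IPFX functions. It assumes only that output.json is valid JSON containing a cell_record object somewhere.

```python
import json

# Subset of the correspondences guessed in the comment above.
# These are unverified guesses, NOT a confirmed mapping.
PROPOSED_MAPPING = {
    "ap_1_threshold_v_short_square": "threshold_v_short_square",
    "ap_1_peak_v_short_square": "peak_v_short_square",
    "ap_1_fast_trough_v_short_square": "fast_trough_v_short_square",
    "ap_1_threshold_v_0_long_square": "threshold_v_long_square",
    "ap_1_peak_v_0_long_square": "peak_v_long_square",
    "ap_1_fast_trough_v_0_long_square": "fast_trough_v_long_square",
    "short_square_current": "threshold_i_short_square",
    "input_resistance": "input_resistance_mohm",
    "tau": "tau",
    "v_baseline": "vrest",
    "sag_nearest_minus_100": "sag",
    "sag_measured_at": "vm_for_sag",
    "rheobase_i": "threshold_i_long_square",
}


def find_first(node, key):
    """Return the first value stored under `key` in nested JSON, or None."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_first(child, key)
        if found is not None:
            return found
    return None


def apply_mapping(cell_record, mapping):
    """Return (resolved, missing): values found, and targets whose source key is absent."""
    resolved, missing = {}, []
    for target, source in mapping.items():
        if source in cell_record:
            resolved[target] = cell_record[source]
        else:
            missing.append(target)
    return resolved, missing


# Example usage:
# with open("output.json") as f:
#     cell_record = find_first(json.load(f), "cell_record") or {}
# resolved, missing = apply_mapping(cell_record, PROPOSED_MAPPING)
# print("missing source keys for:", missing)
```

A key being present only shows that the name exists in the file. Whether its value matches the corresponding column in ipfx_features.csv would still need to be confirmed against the data shipped with the repository.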

rhngla commented 1 month ago

Hello - the data for our study is included in this repository (.mat file); a re-analysis of the data from .nwb files is out of scope for this repository.

You could try directing your questions to the ipfx package maintainers. You may also find the discussion in this other related issue useful.

beilouer commented 2 weeks ago

@rhngla Thank you for your prompt response and for clarifying the scope of the repository. I appreciate your suggestion and will follow up with the ipfx package maintainers for further assistance. I will also check the related discussion you mentioned.

I also have another question I would like your advice on, if that is all right. You can find the details in issue #6. Thank you!