Error in FeatureExtraction for maxtwo data and few other questions.

mandarmp commented 8 months ago

You can assign this a label, "question", also it would be great to have a discussion page on the repo

Iam trying to understand the modules , but while trying to run featureextaraction on a sorted data from a maxtwo recording I faced this error, any insights on this:

BasicFeatureExtractionTutorial
Generated 1 sorting paths
Loaded recording information
Imported custom parameters
Found 1 units with bad power
Found 0 units with bad amplitudes
Found 13 units with bad firing rates
Found 15 units with bad RPV
Identified 31 units as axonal
Identified 65 units as noise
Found 680/785 good units
Index in position 1 is invalid. Array indices must be positive integers or logical values.

Error in MEArecording/inferWaveformFeatures (line 280)
            peak_1_cutout = interp_wf_matrix(unit_trough_idx - ms_conversion:unit_trough_idx,:); %Limit detection range for peaks

Error in MEArecording/generateUnits (line 613)
               waveform_features = obj.inferWaveformFeatures(good_amplitudes, good_wf_matrix);

Error in MEArecording/performAnalyses (line 130)
                obj.generateUnits();

Error in MEArecording (line 37)
                mearec.performAnalyses();

Error in generate_MEArecordings_from_sorting_list (line 50)
                    rec_array{iPath} = MEArecording(metadata, params);

Error in BasicFeatureExtractionTutorial (line 57)
rec_array = generate_MEArecordings_from_sorting_list(sorting_path_list, lookup_path, path_part_idx, params.QC.N_Units, params, parallel);

I initially suspected it was due to the maxtwo sampling but few other examples worked fine. any insights on these be great.

also I have a few more questions,

1) do you sort on the activity assays ( as in Silvia Ronchis paper)?

2) Why use KS 2.5, i was using KS2 since the cells are stationary.

3) When I went through your paper, I understood for a culture representative unit and waveforrm features were calculated, since we use mouse primary neurons, there is a mix of different cells, i want to segregate them into excitatory and inhibitory. Can I use your module to this, any thoughts on this. Thank you for your time.

mandarmp commented 8 months ago

I want to run it on a single recording and get the different cell type

unitFeatTable = getUnitFeatures(rec_array.obj,["ReferenceWaveform","ActivityFeatures"]);

numFeatures = size(unitFeatTable, 2); % 35 features
normalizedUnitFeatTable = unitFeatTable; % Initialize with original data

for i = 1:numFeatures
    meanVal = mean(unitFeatTable{:, i});
    stdVal = std(unitFeatTable{:, i});
    normalizedUnitFeatTable{:, i} = (unitFeatTable{:, i} - meanVal) / stdVal;
end
% Identify numeric columns
numericCols = varfun(@isnumeric, normalizedUnitFeatTable, 'OutputFormat', 'uniform');
% Convert only numeric columns to an array
numericDataMatrix = table2array(normalizedUnitFeatTable(:, numericCols));

[reduction, umap, clusterIdentifiers, extras] = run_umap(numericDataMatrix,'n_components',2,'n_neighbors',100,'min_dist',0.1,'cluster_detail','adaptive','spread',1,'sgd_tasks',20,...
                        'verbose','none');
figure;
gscatter(reduction(:,1), reduction(:,2), clusterIdentifiers);
title('UMAP Reduction with Cluster Identifiers');
xlabel('UMAP 1');
ylabel('UMAP 2');
grid on;

is this the right approach, I dont want to reivent the wheel. I suppose your routine , aggregates many cultures?

hornauerp commented 7 months ago

Regarding your error, I think I ran into similar problems before. Could you check your template matrix if any of the templates consist only of NaNs?

Regarding your questions:

No, I do not work with the activity scan. I either use the network recordings or the axon tracking assays.
Yes, for the normal use case 2.0 and 2.5 should be identical. However, there is the option to concatenate recordings from different time points to track units across hours/days (e.g. for pharmacological perturbations). Depending on the time scale, you might see some movement of the cells.
Ideally, this would be possible. However, to my knowledge, nobody has found robust enough differences between excitatory and inhibitory neurons to get a clear clustering result. But in general, yes, this is one of the applications of the single-cell module.

hornauerp commented 7 months ago

So for your second post, there are already some functions implemented that do what you are doing in your code. You can take a look at the RecordingGroup functions (you can also initialize it with only one recording):

reduceDimensionality: implements PCA, tSNE and UMAP on the feature groups selected by you with optional normalization
clusterByFeatures: allows clustering of dimensionality reduction results through various methods (I would recommend using the "louvain" algorithm on UMAP results for single-cell analysis)
plot_dimensionality_reductionand plot_true_clusters: Plots the results, depending on whether you want provided labels (inferred from the metadata) or labels inferred from the clustering
plot_cluster_waveforms: plots the representative waveforms of the units associated with each cluster. This gives a pretty intuitive impression on whether the clusters really represent distinct action potential shapes or not (which we would assume in the case of interneurons).

The documentation for the RecordingGroup functions is still pretty bad, but they should do what you are trying to do.

As a general remark, I am currently working on building an EXC/INH classifier that hopefully generalizes to other (rodent) cultures as well. I've been working on this question quite a bit and I think that without any sort of ground truth the differences are to gradual to get clear cluster separations. We will probably put out a preprint in 1-2 months.

mandarmp commented 7 months ago

Thank you for patiently answering my questions. I will try to analyse the templates to see whether I have those NAN values.

And may I know which kind of network scan configurations did you use,. the normal 1024 highest FR/SA electrodes or neuronal units. I have been advised, the resolution of the recordings are very important for spikesorting.

hornauerp commented 7 months ago

I always use neuronal units. If the neurons are nicely distributed on the chip, 3x3 should be enough, for more "clumpy" cultures I even go up to 4x4 electrode squares.

mandarmp commented 7 months ago

Thanks for all the insights,

Regarding the error again, it fails while computing the inferWaveformFeatures, the norm_wf_matrix sent to this function doesnt contain any NAN values.

`
    function waveform_features = inferWaveformFeatures(obj,max_amplitudes, norm_wf_matrix)
        interpolation_factor = 10;
        ms_conversion = 10 * interpolation_factor; %Need to find a way to automate that, maybe 10/ms is standard for Ph?(obj.RecordingInfo.SamplingRate / 1000) * interpolation_factor;
        x = 1:size(norm_wf_matrix,1);
        xq = 1:(1/interpolation_factor):size(norm_wf_matrix,1);
        interp_wf_matrix = double(interp1(x,norm_wf_matrix,xq,'pchip'));
        zci = @(v) find(v(:).*circshift(v(:), [-1 0]) <= 0); %Function to detect zero crossings
        tx = zci(interp_wf_matrix); %Apply to interpolated waveform matrix
        [sample_idx,electrode_idx] = ind2sub(size(interp_wf_matrix),tx);
        unit_zero_crossings = splitapply(@(x) {x},sample_idx,electrode_idx);
        [unit_trough_value,unit_trough_idx] = min(interp_wf_matrix);
        peak_1_cutout = interp_wf_matrix(unit_trough_idx - ms_conversion:unit_trough_idx,:); 
        peak_2_cutout = interp_wf_matrix(unit_trough_idx:unit_trough_idx + ms_conversion,:);
        `

The code fails at peak_1_cutout as the interp_wf_matrix the minimum values for this example occur at unit_through_idx of 91 , so subtracting it by ms_conversion with value 100 leads to this error. May be if we use only 50 samples to the left and right as peak cutouts. What is your say on this?

hornauerp / DeePhys

Error in FeatureExtraction for maxtwo data and few other questions. #3