Open mandarmp opened 8 months ago
I want to run it on a single recording and identify the different cell types:
unitFeatTable = getUnitFeatures(rec_array.obj,["ReferenceWaveform","ActivityFeatures"]);
% Identify numeric columns first (the table has 35 features), so that
% non-numeric columns are never passed to mean/std
numericCols = varfun(@isnumeric, unitFeatTable, 'OutputFormat', 'uniform');
% Convert only numeric columns to an array
numericDataMatrix = table2array(unitFeatTable(:, numericCols));
% z-score each feature column (zero mean, unit standard deviation)
numericDataMatrix = normalize(numericDataMatrix);
[reduction, umap, clusterIdentifiers, extras] = run_umap(numericDataMatrix,'n_components',2,'n_neighbors',100,'min_dist',0.1,'cluster_detail','adaptive','spread',1,'sgd_tasks',20,...
'verbose','none');
figure;
gscatter(reduction(:,1), reduction(:,2), clusterIdentifiers);
title('UMAP Reduction with Cluster Identifiers');
xlabel('UMAP 1');
ylabel('UMAP 2');
grid on;
Is this the right approach? I don't want to reinvent the wheel. I suppose your routine aggregates many cultures?
Regarding your error, I think I ran into similar problems before. Could you check whether any of the templates in your template matrix consist only of NaNs?
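A quick way to run that check, as a minimal sketch: it assumes the templates are stored as a samples-by-units matrix. The layout and the variable name `template_matrix` are assumptions, and the toy data stands in for the real templates.

```matlab
% Toy template matrix: rows are samples, columns are units (assumed layout).
template_matrix = [ 0.1  NaN  0.3;
                   -1.0  NaN -0.8;
                    0.2  NaN  NaN];

nan_mask = isnan(template_matrix);

% A unit is suspicious if every sample of its template is NaN.
all_nan_units = find(all(nan_mask, 1));

% Units with only some NaN samples may also be worth a look.
partial_nan_units = find(any(nan_mask, 1) & ~all(nan_mask, 1));

fprintf('All-NaN templates: %s\n', mat2str(all_nan_units));
fprintf('Partially NaN templates: %s\n', mat2str(partial_nan_units));
```

If the real templates are stored with units along a different dimension, the dimension argument of `all` and `any` would need to change accordingly.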
Regarding your questions:
So for your second post, there are already some functions implemented that do what you are doing in your code. You can take a look at the RecordingGroup functions (you can also initialize it with only one recording):
- reduceDimensionality: implements PCA, tSNE and UMAP on the feature groups selected by you, with optional normalization
- clusterByFeatures: allows clustering of dimensionality reduction results through various methods (I would recommend using the "louvain" algorithm on UMAP results for single-cell analysis)
- plot_dimensionality_reduction and plot_true_clusters: plot the results, depending on whether you want provided labels (inferred from the metadata) or labels inferred from the clustering
- plot_cluster_waveforms: plots the representative waveforms of the units associated with each cluster. This gives a pretty intuitive impression of whether the clusters really represent distinct action potential shapes or not (which we would assume in the case of interneurons).

The documentation for the RecordingGroup functions is still pretty bad, but they should do what you are trying to do.
As a general remark, I am currently working on building an EXC/INH classifier that hopefully generalizes to other (rodent) cultures as well. I've been working on this question quite a bit, and I think that without any sort of ground truth the differences are too gradual to get clear cluster separations. We will probably put out a preprint in 1-2 months.
Thank you for patiently answering my questions. I will analyse the templates to see whether they contain any NaN values.
May I also ask which network scan configuration you used: the normal 1024 electrodes with the highest FR/SA, or neuronal units? I have been advised that the resolution of the recordings is very important for spike sorting.
I always use neuronal units. If the neurons are nicely distributed on the chip, 3x3 should be enough; for more "clumpy" cultures I even go up to 4x4 electrode squares.
Thanks for all the insights.
Regarding the error again: it fails while computing inferWaveformFeatures, and the norm_wf_matrix passed to this function doesn't contain any NaN values.
```matlab
function waveform_features = inferWaveformFeatures(obj,max_amplitudes, norm_wf_matrix)
interpolation_factor = 10;
ms_conversion = 10 * interpolation_factor; %Need to find a way to automate that, maybe 10/ms is standard for Ph?(obj.RecordingInfo.SamplingRate / 1000) * interpolation_factor;
x = 1:size(norm_wf_matrix,1);
xq = 1:(1/interpolation_factor):size(norm_wf_matrix,1);
interp_wf_matrix = double(interp1(x,norm_wf_matrix,xq,'pchip'));
zci = @(v) find(v(:).*circshift(v(:), [-1 0]) <= 0); %Function to detect zero crossings
tx = zci(interp_wf_matrix); %Apply to interpolated waveform matrix
[sample_idx,electrode_idx] = ind2sub(size(interp_wf_matrix),tx);
unit_zero_crossings = splitapply(@(x) {x},sample_idx,electrode_idx);
[unit_trough_value,unit_trough_idx] = min(interp_wf_matrix);
peak_1_cutout = interp_wf_matrix(unit_trough_idx - ms_conversion:unit_trough_idx,:);
peak_2_cutout = interp_wf_matrix(unit_trough_idx:unit_trough_idx + ms_conversion,:);
```
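The comment on the ms_conversion line hints at deriving the value from the sampling rate instead of hard-coding it. A small sketch of that arithmetic follows. The 10 kHz and 20 kHz rates are assumed example values, not confirmed settings for any particular system.

```matlab
% Assumed sampling rate in Hz; in the toolbox this would presumably come from
% obj.RecordingInfo.SamplingRate (name taken from the comment in the function).
sampling_rate = 10000;
interpolation_factor = 10;

% Interpolated samples per millisecond.
ms_conversion = (sampling_rate / 1000) * interpolation_factor;

% A 20 kHz recording would need twice as many samples for the same 1 ms window.
ms_conversion_20k = (20000 / 1000) * interpolation_factor;

fprintf('10 kHz: %d samples/ms, 20 kHz: %d samples/ms\n', ms_conversion, ms_conversion_20k);
```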
The code fails at peak_1_cutout: for this example, the minimum of interp_wf_matrix occurs at a unit_trough_idx of 91, so subtracting ms_conversion (100) gives a start index of -9, which is not a valid index and leads to this error. Maybe we could use only 50 samples to the left and right for the peak cutouts? What do you think?
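One possible guard, sketched here rather than taken from the toolbox, is to clamp each cutout window to the bounds of the interpolated waveform instead of shrinking the window for every unit. The toy matrix below stands in for interp_wf_matrix, and the loop handles one unit (column) at a time because the trough index can differ per unit.

```matlab
% Toy stand-in for interp_wf_matrix: 300 samples x 2 units (assumed layout).
n_samples = 300;
interp_wf_matrix = zeros(n_samples, 2);
interp_wf_matrix(91, 1)  = -1;   % early trough, as in the failing example
interp_wf_matrix(150, 2) = -1;   % trough with enough samples on both sides
ms_conversion = 100;

[~, unit_trough_idx] = min(interp_wf_matrix);   % one trough index per column

n_units = size(interp_wf_matrix, 2);
peak_1_cutout = cell(1, n_units);
peak_2_cutout = cell(1, n_units);
for u = 1:n_units
    % Clamp the window so indices never drop below 1 or exceed n_samples.
    start_idx = max(1, unit_trough_idx(u) - ms_conversion);
    stop_idx  = min(n_samples, unit_trough_idx(u) + ms_conversion);
    peak_1_cutout{u} = interp_wf_matrix(start_idx:unit_trough_idx(u), u);
    peak_2_cutout{u} = interp_wf_matrix(unit_trough_idx(u):stop_idx, u);
end

disp(cellfun(@numel, peak_1_cutout));   % pre-trough cutout length per unit
```

With clamping, the cutouts can have different lengths per unit, so any downstream code that expects fixed-size windows would need adjusting, for example by padding with NaN.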
You can assign this the label "question". It would also be great to have a discussion page on the repo.
I am trying to understand the modules, but while running feature extraction on sorted data from a MaxTwo recording I ran into this error. Any insights on this:
I initially suspected it was due to the MaxTwo sampling, but a few other examples worked fine. Any insights on this would be great.
I also have a few more questions:
1) Do you sort on the activity assays (as in Silvia Ronchi's paper)?
2) Why use KS 2.5? I was using KS2 since the cells are stationary.
3) When I went through your paper, I understood that, for each culture, representative unit and waveform features were calculated. Since we use mouse primary neurons, there is a mix of different cell types, and I want to segregate them into excitatory and inhibitory. Can I use your module for this? Any thoughts? Thank you for your time.