denisecailab / minian

miniscope analysis pipeline with interactive visualizations
GNU General Public License v3.0

Extracting Time Series Data and Calcium Traces CSV file #265

Status: Closed (closed by skisurfer13 6 months ago)

skisurfer13 commented 6 months ago

Greetings!

I am currently trying to extract the time series data from the Minian pipeline. I was able to extract the A, C, and S data using the following block of code:

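from minian.utilities import open_minian

# minian_ds_path is the folder where the pipeline saved its results
minian_ds = open_minian(minian_ds_path)
# index=False keeps the pandas row index out of the CSV files
minian_ds["C"].rename("C").to_series().reset_index().to_csv("C.csv", index=False)
minian_ds["A"].rename("A").to_series().reset_index().to_csv("A.csv", index=False)
minian_ds["S"].rename("S").to_series().reset_index().to_csv("S.csv", index=False)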

However, the CSV file I obtain contains only a single column of calcium trace values. As far as I understand, this represents the calcium trace of only one cell. My goal is to get the calcium traces of several cells as time series data. The CSV file I obtained from Minian is attached below (C.csv), along with an example of the file I would like to obtain (C_traces.csv). Please let me know how I can obtain the calcium traces of multiple cells in a file similar to C_traces.csv.

Thank you!

C.csv C_traces.csv
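
For context, `to_series().reset_index()` produces a long-format table with one row per (unit_id, frame) pair, so the single value column holds every cell's trace stacked one after another. A minimal sketch of how to check this, assuming made-up data: a plain pandas Series stands in for the Minian array, and the unit ids, frame count, and values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for C: 3 cells (unit_id) x 4 frames, values made up.
index = pd.MultiIndex.from_product(
    [[0, 1, 2], [0, 1, 2, 3]], names=["unit_id", "frame"]
)
c_series = pd.Series([float(i) for i in range(12)], index=index, name="C")

# Same kind of table that to_series().reset_index() gives:
# one row per (unit_id, frame) pair and a single value column "C".
long_df = c_series.reset_index()

n_units = long_df["unit_id"].nunique()             # distinct cells in the table
rows_per_unit = long_df.groupby("unit_id").size()  # frames stored per cell

print(long_df.head())
print(n_units, rows_per_unit.tolist())
```

If `nunique()` on the real C.csv reports more than one `unit_id`, the file already contains several cells and only needs reshaping.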

austinbaggetta commented 6 months ago

Hi! I've been using the following code to save the final variables to .csv files:

%%time
import csv
import os

# S, C, A, and max_proj are the variables already in memory from the pipeline notebook
os.makedirs(csv_path, exist_ok=True)

outputs = {'S': S, 'C': C, 'A': A, 'max_proj': max_proj}
for minian_output, variable in outputs.items():
    # csv.writer expects rows of scalars, i.e. 2-D data; A normally has three
    # dimensions (unit_id, height, width) and may need reshaping before it is written
    data = variable.values

    result_path = os.path.join(csv_path, f'{minian_output}.csv')
    # newline='' stops the csv module from writing blank lines between rows on Windows
    with open(result_path, 'w', newline='') as my_csv:
        csvWriter = csv.writer(my_csv, delimiter=',')
        csvWriter.writerows(data)

I hope this helps!
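
One thing to keep in mind with this approach is that `csv.writer` stores only the raw values, so the unit and frame labels are not written. A minimal sketch of adding a header row with the standard library follows. The `unit_ids` and `traces` values are made up, and `write_wide_csv` is a hypothetical helper. With real data they would presumably come from `C.coords["unit_id"].values` and `C.values`.

```python
import csv
import io

# Hypothetical data: 2 cells x 3 frames, one row per cell (as in C.values).
unit_ids = [0, 5]
traces = [[0.1, 0.2, 0.3], [1.0, 1.1, 1.2]]


def write_wide_csv(handle, unit_ids, traces):
    """Write one row per frame and one labelled column per cell."""
    writer = csv.writer(handle)
    writer.writerow(["frame"] + [f"C_{uid}" for uid in unit_ids])
    # zip(*traces) transposes cell-major rows into frame-major rows.
    # Assumption: frames are numbered consecutively from 0.
    for frame, values in enumerate(zip(*traces)):
        writer.writerow([frame, *values])


buffer = io.StringIO()  # for a real file, use open(path, "w", newline="")
write_wide_csv(buffer, unit_ids, traces)
csv_text = buffer.getvalue()
print(csv_text)
```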

skisurfer13 commented 6 months ago

Hey! Thank you for your response. I consulted Dr. Phil Dong, and it turns out the data is written to the CSV file in long (narrow) format. I used the following code to convert it to wide format, where each column C_i holds the calcium trace of unit_id i:

import pandas as pd

# Read the long-format CSV file into a DataFrame
df = pd.read_csv("C_data.csv")

# Drop the 'unit_labels' column if it is present
df = df.drop(columns=['unit_labels'], errors='ignore')

# Convert to wide format: one row per frame, one column per unit_id.
# values='C' keeps any other column (such as a saved row index) out of the result
# and yields a flat column index, so no droplevel() step is needed.
wide_df = df.pivot_table(index=['frame', 'animal', 'session'], columns='unit_id', values='C').reset_index()

# Rename the key columns and prefix each unit_id column with 'C_'
wide_df.columns = ['Time', 'Animal', 'Session'] + [f'C_{col}' for col in wide_df.columns[3:]]

# Export to CSV
wide_df.to_csv("wide_data.csv", index=False)
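
The reshaping step can be tried on a small synthetic table before running it on real data. The sketch below assumes the long-format columns are named `frame`, `unit_id`, and `C` (as produced by the export code at the top of the thread) and uses made-up values. `animal` and `session` are left out for brevity.

```python
import pandas as pd

# Hypothetical long-format data: 2 cells (unit_id 3 and 7) x 2 frames.
long_df = pd.DataFrame({
    "frame":   [0, 1, 0, 1],
    "unit_id": [3, 3, 7, 7],
    "C":       [0.5, 0.6, 1.5, 1.6],
})

# pivot (unlike pivot_table) does not aggregate: it raises an error if a
# (frame, unit_id) pair appears more than once, which makes duplicates visible.
wide_df = long_df.pivot(index="frame", columns="unit_id", values="C")
wide_df.columns = [f"C_{uid}" for uid in wide_df.columns]
wide_df = wide_df.reset_index()

print(wide_df)
```

Using `pivot` here is a design choice for the check only. `pivot_table` averages duplicated entries silently, which is convenient but can hide a problem in the input file.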