Open bpyenson opened 5 months ago
Hello! The matrix needs to contain the imputed, normalized counts, preferably imputed with MAGIC. The number of genes must equal the number of targets you want to use (for example, 1000) plus the number of TFs to use as heads of the regulons (for example, 100), so the matrix covers 1100 genes by Ncells cells.
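As a rough illustration of the shape described above, the sketch below ranks candidate TFs and targets by median absolute deviation (MAD) on an imputed table and keeps the top ones, TFs first. It assumes that the `top_MAD_*` names used in the R code refer to MAD ranking and that cells are rows and genes are columns; the helper `top_mad_genes`, the toy data, and the gene names are hypothetical and not part of SimiC.

```python
import numpy as np
import pandas as pd


def top_mad_genes(expr: pd.DataFrame, candidates, n: int) -> list:
    """Return the n candidate genes with the largest median absolute deviation.

    expr is assumed to be cells (rows) x genes (columns).
    """
    sub = expr[[g for g in candidates if g in expr.columns]]
    mad = (sub - sub.median(axis=0)).abs().median(axis=0)
    return list(mad.sort_values(ascending=False).index[:n])


# Toy imputed matrix: 50 cells x 8 genes (hypothetical names and values).
rng = np.random.default_rng(0)
genes = [f"TF{i}" for i in range(3)] + [f"G{i}" for i in range(5)]
scales = np.arange(1, 9)
expr = pd.DataFrame(rng.normal(size=(50, 8)) * scales, columns=genes)

tf_candidates = ["TF0", "TF1", "TF2"]
target_candidates = [g for g in genes if g not in tf_candidates]

top_tfs = top_mad_genes(expr, tf_candidates, n=2)
top_targets = top_mad_genes(expr, target_candidates, n=3)

# TFs first, then targets, mirroring c(top_MAD_tfs, top_MAD_targets) in R.
selected = expr[top_tfs + top_targets]
print(selected.shape)  # (n_cells, n_tfs + n_targets)
```

With the real data, `n` would be the 100 TFs and 1000 targets from the example, giving 1100 gene columns.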
Because we use reticulate, the pickles must be created with the same Python version that will later read them; otherwise compatibility errors may occur. Here are two options:
1. Create the pickles directly from R with reticulate:
library(reticulate)
reticulate::use_python("/usr/bin/python3")
py_discover_config("magic")
library(Rmagic)
data_MAGIC_df <- data_MAGIC_df[,c(top_MAD_tfs,top_MAD_targets)] # this is the matrix of 1100xNcells
reticulate::py_save_object(as.data.frame(data_MAGIC_df), filename = paste0('./Data/SimiC/Organoids_RUN1',MAX_NUM_TARGETS, "_DF.pickle"))
reticulate::py_save_object(TFs, filename = paste0('./Data/SimiC/Organoids_RUN1',MAX_NUM_TARGETS, "_TF.pickle")) # this is the list of the TFs
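One way to check the version concern is to print the interpreter and pandas versions in both places: once in the Python that reticulate is bound to, and once in the environment that will load the pickles. This is only a minimal sketch; `environment_report` is a hypothetical helper, not a SimiC or reticulate function.

```python
import pickle
import platform
import sys

import pandas as pd


def environment_report() -> dict:
    """Collect the versions that matter when pickles move between environments."""
    return {
        "python_executable": sys.executable,
        "python_version": platform.python_version(),
        "pandas_version": pd.__version__,
        "highest_pickle_protocol": pickle.HIGHEST_PROTOCOL,
    }


report = environment_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

If the two reports differ, writing plain text files from R and pickling them in the target Python environment may be the safer route.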
2. Save the matrices as CSV files and create the pickles in Python:
library(reticulate)
reticulate::use_python("/usr/bin/python3")
py_discover_config("magic")
library(Rmagic)
data_MAGIC_df <- data_MAGIC_df[,c(top_MAD_tfs,top_MAD_targets)] # this is the matrix of 1100xNcells
write.table(data_MAGIC_df, file= paste0('./Data/SimiC/Organoids_RUN1',MAX_NUM_TARGETS, "_DF.csv"), sep='\t', row.names= TRUE, col.names=TRUE, quote=FALSE)
write.table(TFs, file= paste0('./Data/SimiC/Organoids_RUN1',MAX_NUM_TARGETS, "_TF.csv"), sep='\t', row.names= FALSE, col.names=FALSE, quote=FALSE)
# Python: build the pickles from the tab-separated files written by R
import os
import pickle

import pandas as pd

print('Creating pickles!')
analysis_dir = '/root/SimiC/Organoids'
sample = 'Organoids_RUN11000'
DF_p = os.path.join(analysis_dir, sample + '_DF.csv')
TF_p = os.path.join(analysis_dir, sample + '_TF.csv')

# Expression matrix: read the tab-separated table and pickle the DataFrame
if os.path.exists(DF_p):
    DF = pd.read_csv(DF_p, header=0, delimiter="\t")
    DF_pickle = os.path.join(analysis_dir, sample + '.DF.pickle')
    with open(DF_pickle, 'wb') as mypickle:
        pickle.dump(DF, mypickle)
else:
    print("The file " + DF_p + " does not exist")

# TF list: one name per line, pickled as a plain Python list
if os.path.exists(TF_p):
    TF = pd.read_csv(TF_p, header=None)
    TF = list(TF.iloc[:, 0])
    TF_pickle = os.path.join(analysis_dir, sample + '.TF.pickle')
    with open(TF_pickle, 'wb') as mypickle:
        pickle.dump(TF, mypickle)
else:
    print("The file " + TF_p + " does not exist")

print('Done pickles!')
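Before passing the files on, it may be worth loading them back and confirming that every TF appears among the matrix columns. The sketch below does this on a toy round trip in a temporary directory; it assumes cells as rows and genes as columns, as in the question. `check_simic_inputs`, the toy data, and the file names are hypothetical and not part of SimiC.

```python
import os
import pickle
import tempfile

import pandas as pd


def check_simic_inputs(df_pickle: str, tf_pickle: str) -> tuple:
    """Load both pickles and confirm every TF is a column of the expression table."""
    with open(df_pickle, "rb") as handle:
        df = pickle.load(handle)
    with open(tf_pickle, "rb") as handle:
        tfs = pickle.load(handle)
    missing = [tf for tf in tfs if tf not in df.columns]
    if missing:
        raise ValueError(f"TFs missing from the matrix columns: {missing}")
    return df.shape, len(tfs)


# Toy round trip in a temporary directory (hypothetical data and file names).
with tempfile.TemporaryDirectory() as tmp:
    df_path = os.path.join(tmp, "toy.DF.pickle")
    tf_path = os.path.join(tmp, "toy.TF.pickle")
    toy = pd.DataFrame(
        [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
        index=["cell1", "cell2"],
        columns=["TF_A", "GENE_B", "GENE_C"],
    )
    with open(df_path, "wb") as handle:
        pickle.dump(toy, handle)
    with open(tf_path, "wb") as handle:
        pickle.dump(["TF_A"], handle)
    shape, n_tfs = check_simic_inputs(df_path, tf_path)

print(shape, n_tfs)
```

For the real files, the same function can be pointed at the `.DF.pickle` and `.TF.pickle` paths produced above.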
I hope that helps!
Hi,
I am having a hard time preparing my Seurat object for input to SimiC's pipeline. Is there any code you can share about how you created the .pickle file for the DGE matrix that has cell barcodes as rows and the genes (TF and targets) as the columns?
Thanks,