Hello! The matrix should contain the imputed, normalized counts, preferably imputed with MAGIC. The number of rows should be the number of targets you want to use (for example, 1000) plus the number of TFs to use as heads of the regulons (for example, 100). So we have a matrix of size 1100xNcells.
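To make the gene selection concrete, here is a minimal Python sketch on toy data. It assumes that "MAD" in the variable names `top_MAD_tfs` and `top_MAD_targets` means median absolute deviation, and it keeps genes as columns because that is how the R snippets in this post index the matrix. The helper `top_mad` and the lists `candidate_tfs` and `candidate_targets` are hypothetical names used only for illustration, not part of SimiC.

```python
import numpy as np
import pandas as pd


def top_mad(expr, genes, n):
    """Return the n genes (columns of expr) with the largest median absolute deviation."""
    sub = expr[genes]
    mad = (sub - sub.median()).abs().median()
    return mad.sort_values(ascending=False).index[:n].tolist()


# Toy stand-in for the imputed matrix: 50 cells (rows) x 30 genes (columns).
rng = np.random.default_rng(0)
genes = [f"gene{i}" for i in range(30)]
expr = pd.DataFrame(rng.random((50, 30)), columns=genes)

candidate_tfs = genes[:10]      # hypothetical list of known TFs
candidate_targets = genes[10:]  # every other gene

n_tfs, n_targets = 3, 8         # the example in this post uses 100 and 1000
top_tfs = top_mad(expr, candidate_tfs, n_tfs)
top_targets = top_mad(expr, candidate_targets, n_targets)

# TFs first, then targets, mirroring c(top_MAD_tfs, top_MAD_targets) in R.
selected = expr[top_tfs + top_targets]
print(selected.shape)  # (50, 11)
```

With the real data, the number of selected genes would be 100 + 1000 = 1100.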

Since we use reticulate, we need to make sure the pickles are created with the same Python version that will read them; otherwise compatibility errors may be raised. So I share two options:

1. Create the pickles directly from R:


library(reticulate)
reticulate::use_python("/usr/bin/python3")
py_discover_config("magic")
library(Rmagic)

data_MAGIC_df <- data_MAGIC_df[,c(top_MAD_tfs,top_MAD_targets)] # this is the matrix of 1100xNcells
reticulate::py_save_object(as.data.frame(data_MAGIC_df), filename = paste0('./Data/SimiC/Organoids_RUN1',MAX_NUM_TARGETS, "_DF.pickle"))
reticulate::py_save_object(TFs, filename = paste0('./Data/SimiC/Organoids_RUN1',MAX_NUM_TARGETS, "_TF.pickle")) # this is the list of the TFs

2. Save the matrices as CSV, and create the pickles in Python:

R

library(reticulate)
reticulate::use_python("/usr/bin/python3")
py_discover_config("magic")
library(Rmagic)

data_MAGIC_df <- data_MAGIC_df[,c(top_MAD_tfs,top_MAD_targets)] # this is the matrix of 1100xNcells
write.table(data_MAGIC_df, file= paste0('./Data/SimiC/Organoids_RUN1',MAX_NUM_TARGETS, "_DF.csv"), sep='\t', row.names= TRUE, col.names=TRUE, quote=FALSE)
write.table(TFs, file= paste0('./Data/SimiC/Organoids_RUN1',MAX_NUM_TARGETS, "_TF.csv"), sep='\t', row.names= FALSE, col.names=FALSE, quote=FALSE)

Python

import os
import pickle

import pandas as pd

print('Creating pickles!')
analysis_dir = '/root/SimiC/Organoids'

sample = 'Organoids_RUN11000'
DF_p = os.path.join(analysis_dir, sample + '_DF.csv')
TF_p = os.path.join(analysis_dir, sample + '_TF.csv')

if os.path.exists(DF_p):
    # Tab-separated file written by write.table in R
    DF = pd.read_csv(DF_p, header=0, delimiter="\t")
    DF_pickle = os.path.join(analysis_dir, sample + '.DF.pickle')
    # DF.to_pickle(DF_pickle)
    with open(DF_pickle, 'wb') as mypickle:
        pickle.dump(DF, mypickle)
else:
    print("The file " + DF_p + " does not exist")

if os.path.exists(TF_p):
    TF = pd.read_csv(TF_p, header=None)
    TF = list(TF.iloc[:, 0])  # the first (only) column holds the TF names
    TF_pickle = os.path.join(analysis_dir, sample + '.TF.pickle')
    with open(TF_pickle, 'wb') as mypickle:
        pickle.dump(TF, mypickle)
else:
    print("The file " + TF_p + " does not exist")

print ('Done pickles!')
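Before running SimiC it may be worth checking that the pickles load back as expected. The sketch below is only an illustration on toy data: the gene and cell names are made up, and the files are written to a temporary directory rather than the real `analysis_dir`. It also shows why `pd.read_csv` needs no extra arguments here: `write.table` with `row.names=TRUE` produces a header with one field fewer than the data rows, and pandas then uses the first column as the index.

```python
import io
import os
import pickle
import tempfile

import pandas as pd

# Toy TSV in the layout produced by write.table(..., row.names=TRUE, col.names=TRUE):
# the header row has one field fewer than the data rows.
tsv_text = "TF1\tGeneA\tGeneB\ncell1\t0.1\t0.2\t0.3\ncell2\t0.4\t0.5\t0.6\n"
DF = pd.read_csv(io.StringIO(tsv_text), header=0, delimiter="\t")
TF = ["TF1"]  # made-up TF list for the toy matrix

with tempfile.TemporaryDirectory() as tmp_dir:
    DF_pickle = os.path.join(tmp_dir, "toy.DF.pickle")
    TF_pickle = os.path.join(tmp_dir, "toy.TF.pickle")
    with open(DF_pickle, "wb") as fh:
        pickle.dump(DF, fh)
    with open(TF_pickle, "wb") as fh:
        pickle.dump(TF, fh)

    # Reload both pickles the way a downstream script would.
    with open(DF_pickle, "rb") as fh:
        DF_loaded = pickle.load(fh)
    with open(TF_pickle, "rb") as fh:
        TF_loaded = pickle.load(fh)

# Every TF should appear among the gene columns of the matrix.
missing_tfs = [tf for tf in TF_loaded if tf not in DF_loaded.columns]
print(DF_loaded.shape, list(DF_loaded.index), missing_tfs)
```

If `missing_tfs` is not empty with your real files, the TF list and the matrix columns are out of sync and the pickles should be regenerated.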



I hope that helps!

jianhao2016 / SimiC

Preparing DGE matrix for input to SimiC #8
