phbradley / conga

Clonotype Neighbor Graph Analysis
MIT License

Writing out all_nbrs dictionary for reruns. #38

Closed sschattgen closed 2 years ago

sschattgen commented 2 years ago

Currently, the TCR and GEX nbr fractions (all_nbrs) are stored in a dictionary separate from the AnnData object. That dictionary is never written to disk, so it has to be rebuilt before rerunning or replotting results from a previous analysis, which can take a long time on large datasets. I've had success storing the dictionary in pickle format and reusing it for plotting. It might be worth writing it out in the run_conga.py script.

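import os
import pickle

import conga
import scanpy as sc

# open() does not expand '~', so expand it explicitly
out_pre = os.path.expanduser('~/test/some_study')
adata_file = out_pre + '.h5ad'
pickle_file = out_pre + "nbrs.pkl"

# assumes the AnnData object from the earlier run was saved to adata_file
adata = sc.read_h5ad(adata_file)

nbr_fracs = [0.01, 0.1]
all_nbrs = conga.preprocess.calc_nbrs(adata, nbr_fracs)

# write the neighbor dictionary to disk
with open(pickle_file, "wb") as f:
    pickle.dump(all_nbrs, f)

# later, reload it instead of recomputing
with open(pickle_file, "rb") as f:
    all_nbrs = pickle.load(f)

sschattgen commented 2 years ago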

Added as the --reuse_nbrs option in run_conga.py, currently in the bcr branch.
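
For anyone who wants the same behavior in their own scripts, here is a minimal sketch of the reuse-or-rebuild pattern. It is not the run_conga.py implementation of --reuse_nbrs. The helper name load_or_compute, its reuse argument, and the stand-in fake_calc_nbrs function are all hypothetical and only there so the example runs without conga installed. In practice you would pass something like lambda: conga.preprocess.calc_nbrs(adata, nbr_fracs) as compute_fn.

```python
import os
import pickle
import tempfile


def load_or_compute(pickle_file, compute_fn, reuse=True):
    """Hypothetical helper: load a pickled result if allowed and present,
    otherwise call compute_fn() and write its result to pickle_file."""
    if reuse and os.path.exists(pickle_file):
        with open(pickle_file, "rb") as f:
            return pickle.load(f)
    result = compute_fn()
    with open(pickle_file, "wb") as f:
        pickle.dump(result, f)
    return result


# Stand-in for the slow neighbor calculation (an assumption for illustration;
# it only returns a small dictionary keyed by nbr fraction).
calls = []


def fake_calc_nbrs():
    calls.append(1)  # record each time the "slow" step actually runs
    return {0.01: [[1], [0]], 0.1: [[1, 2], [0, 2]]}


with tempfile.TemporaryDirectory() as tmpdir:
    pickle_file = os.path.join(tmpdir, "some_study_nbrs.pkl")
    first = load_or_compute(pickle_file, fake_calc_nbrs)   # computes and saves
    second = load_or_compute(pickle_file, fake_calc_nbrs)  # loads from disk
    print("slow step ran", len(calls), "time(s)")
```

Passing reuse=False forces a rebuild, which is what you want if the AnnData object or nbr_fracs changed since the pickle was written. As with any pickle, only load files you created yourself.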