Closed TingTingShao closed 8 months ago
Update:
The AnnData (from snapATAC2) was generated after QC. Although my first run with the earlier version of pycisTopic filtered out more cells in its QC step, I think it should be fine to create the cisTopic object from the cells that passed filtering in snapATAC2.
So I tried:
rule create_cisobject:
    input:
        # adata="microglia_1.h5ad",
        frag_paths_pkl="030results/frag_paths.pkl",
        tsv="030results/cell_data.tsv",
        consensus_regions="033results_consensus_peak_calling/consensus_regions.bed"
    output:
        obj_pkl="cistopic_obj1.pkl"
    run:
        cell_data = pd.read_csv(input.tsv, sep='\t')
        cell_data['barcode'] = cell_data['sample'] + ':' + cell_data['barcode']
        fragments_dict = pickle.load(open(input.frag_paths_pkl, 'rb'))
        unique_samples = set(cell_data['sample'])
        path_to_regions = {sample: input.consensus_regions for sample in unique_samples}
        cistopic_obj_list = [
            create_cistopic_object_from_fragments(
                path_to_fragments=fragments_dict[key],
                path_to_regions=path_to_regions[key],
                path_to_blacklist=config['path_to_blacklist'],
                # metrics=metadata_bc[key],
                valid_bc=cell_data['barcode'],
                n_cpu=1,
                project=key)
            for key in fragments_dict.keys()
        ]
        cistopic_obj = merge(cistopic_obj_list)
        cistopic_obj.add_cell_data(cell_data[['sample']])
        pickle.dump(cistopic_obj, open(output.obj_pkl, 'wb'))
However, this raised the following error:
Length mismatch: Expected axis has 9373 elements, new values have 9265 elements
File "/vsc-hard-mounts/leuven-data/351/vsc35107/master_thesis/pycistopic_pipeline/Snakefile", line 244, in __rule_create_cisobject
File "/data/leuven/351/vsc35107/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/cistopic_class.py", line 140, in add_cell_data
File "/data/leuven/351/vsc35107/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pandas/core/generic.py", line 5920, in __setattr__
File "pandas/_libs/properties.pyx", line 69, in pandas._libs.properties.AxisProperty.__set__
File "/data/leuven/351/vsc35107/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pandas/core/generic.py", line 822, in _set_axis
File "/data/leuven/351/vsc35107/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pandas/core/internals/managers.py", line 228, in set_axis
File "/data/leuven/351/vsc35107/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pandas/core/internals/base.py", line 70, in _validate_set_axis
Any idea on this?
Supplementary info:
AnnData object with n_obs × n_vars = 8458 × 6062095
obs: 'sample', 'region', 'subject', 'ad', 'tsse', 'leiden_mnc_1'
var: 'count', 'selected'
uns: 'AnnDataSet', 'reference_sequences', 'spectral_eigenvalue'
obsm: 'X_spectral', 'X_spectral_mnc_sample', 'X_umap', 'X_umap_mnc_sample', 'fragment_paired'
obsp: 'distances'
Thanks, tingting
Solved by adding split_pattern=":" to the create_cistopic_object_from_fragments call:
run:
    cell_data = pd.read_csv(input.tsv, sep='\t')
    cell_data['barcode'] = cell_data['sample'] + ':' + cell_data['barcode']
    fragments_dict = pickle.load(open(input.frag_paths_pkl, 'rb'))
    unique_samples = set(cell_data['sample'])
    path_to_regions = {sample: input.consensus_regions for sample in unique_samples}
    cistopic_obj_list = [
        create_cistopic_object_from_fragments(
            path_to_fragments=fragments_dict[key],
            path_to_regions=path_to_regions[key],
            path_to_blacklist=config['path_to_blacklist'],
            # metrics=metadata_bc[key],
            valid_bc=cell_data['barcode'],
            n_cpu=1,
            split_pattern=":",
            project=key)
        for key in fragments_dict.keys()
    ]
    cistopic_obj = merge(cistopic_obj_list)
    cistopic_obj.add_cell_data(cell_data[['sample']])
    pickle.dump(cistopic_obj, open(output.obj_pkl, 'wb'))
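For anyone hitting a similar "Length mismatch" error, one way to narrow it down is to compare the cell names stored in the object with the barcodes in the metadata table before calling add_cell_data. The following is only a minimal sketch in plain Python: `compare_barcodes` is a hypothetical helper (not part of pycisTopic), the toy names are made up, and how you obtain the object's cell names (for example from a `cell_names` attribute) is an assumption to check against your pycisTopic version.

```python
def compare_barcodes(object_cells, metadata_cells):
    """Summarise how two collections of cell names overlap.

    Hypothetical diagnostic helper; it only uses Python sets and
    knows nothing about pycisTopic internals.
    """
    object_list = list(object_cells)
    metadata_list = list(metadata_cells)
    object_set = set(object_list)
    metadata_set = set(metadata_list)
    return {
        # names present in both collections
        "shared": len(object_set & metadata_set),
        # names that appear on one side only
        "only_in_object": sorted(object_set - metadata_set),
        "only_in_metadata": sorted(metadata_set - object_set),
        # repeated names on either side can also cause length mismatches
        "duplicates_in_object": len(object_list) - len(object_set),
        "duplicates_in_metadata": len(metadata_list) - len(metadata_set),
    }


# Toy example (made-up names): the two sides use different naming schemes,
# so nothing matches even though the same cells are meant.
object_cells = ["AAAC-1___s1", "AAAG-1___s1", "AAAT-1___s2"]
metadata_cells = ["s1:AAAC-1", "s1:AAAG-1"]

report = compare_barcodes(object_cells, metadata_cells)
print(report["shared"])
print(report["only_in_metadata"])
```

If "shared" is much smaller than the number of cells in either collection, the naming scheme (separator, order of sample and barcode) is the first thing to check.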
I have also run snapATAC2 on my dataset (1 healthy and 2 patient samples, concatenated). Could you please explain how I can use the AnnData from snapATAC2 with pycisTopic?
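The approach earlier in this thread reads a cell_data.tsv with 'sample' and 'barcode' columns and passes the combined barcodes as valid_bc. A rough sketch of producing such a table from observation names is given below. It assumes, purely for illustration, that each name has the form "sample:barcode"; `write_cell_table` is a hypothetical helper and the sample names are made up. With an AnnData object you would pass `adata.obs_names`, after checking how your names are actually formatted.

```python
import csv
import io


def write_cell_table(obs_names, handle, sep=":"):
    """Write a tab-separated table with 'sample' and 'barcode' columns.

    Assumes every name looks like '<sample><sep><barcode>' (an assumption,
    not something guaranteed by snapATAC2). Returns the number of rows written.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["sample", "barcode"])
    n_rows = 0
    for name in obs_names:
        # split only on the first separator so barcodes may contain it
        sample, barcode = name.split(sep, 1)
        writer.writerow([sample, barcode])
        n_rows += 1
    return n_rows


# Toy example written to an in-memory buffer; use open(path, "w", newline="")
# to write a real file instead.
buffer = io.StringIO()
n_rows = write_cell_table(["healthy1:AAAC-1", "patient1:AAAG-1"], buffer)
print(buffer.getvalue())
```

The resulting file can then be read with pd.read_csv(..., sep='\t') as in the rule above.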
Hi,
With an earlier version of pycisTopic, I was able to run:
But with version 2.0a0 I was not able to run this command; I guess it now follows
But this command returns different data formats than before. I am following the tutorial https://scenicplus.readthedocs.io/en/latest/mix_melanoma_cell_lines.html
Could I please ask
Previously you suggested two solutions for my case: https://github.com/aertslab/pycisTopic/discussions/116
But with the solution
the error persists.
I tried
error
I followed the thread here https://github.com/aertslab/pycisTopic/issues/40 but it consumes a lot of memory.
Looking forward to your reply. Many thanks, tingting