aertslab / pycisTopic

pycisTopic is a Python module to simultaneously identify cell states and cis-regulatory topics from single cell epigenomics data.

Errors in reading fragments by export_pseudobulk [BUG] #70

Closed ruan2ruan closed 1 year ago

ruan2ruan commented 1 year ago

Hi, I'm having trouble using the export_pseudobulk function:

bw_paths, bed_paths = export_pseudobulk(
    input_data = cell_data,
    variable = 'celltype',
    sample_id_col = 'sample_id',
    chromsizes = chromsizes,
    bed_path = os.path.join(work_dir, 'scATAC/consensus_peak_calling/pseudobulk_bed_files/'),
    bigwig_path = os.path.join(work_dir, 'scATAC/consensus_peak_calling/pseudobulk_bw_files/'),
    path_to_fragments = fragments_dict,
    n_cpu=8,
    normalize_bigwig = True,
    remove_duplicates = True,
    split_pattern = '-')

2023-02-17 16:22:04,428 cisTopic INFO Reading fragments from AML_Multiomics/data/AML323_Ctrl_atac_fragments.tsv.gz
Traceback (most recent call last):
  File "pyarrow/_csv.pyx", line 1226, in pyarrow._csv.read_csv
  File "pyarrow/error.pxi", line 144, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 133, in pyarrow.lib.check_status
pyarrow.lib.ArrowCancelled: Operation cancelled. Detail: received signal 2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "", line 1, in
  File "/home/xruan/.conda/envs/scenicplus/lib/python3.8/site-packages/pycisTopic/pseudobulk_peak_calling.py", line 116, in export_pseudobulk
    fragments_df = read_fragments_from_file(path_to_fragments[sample_id], use_polars=use_polars).df
  File "/home/xruan/.conda/envs/scenicplus/lib/python3.8/site-packages/pycisTopic/utils.py", line 395, in read_fragments_from_file
    pl.read_csv(
  File "/home/xruan/.conda/envs/scenicplus/lib/python3.8/site-packages/polars/io.py", line 252, in read_csv
    tbl = pa.csv.read_csv(
  File "pyarrow/_csv.pyx", line 1217, in pyarrow._csv.read_csv
  File "pyarrow/error.pxi", line 248, in pyarrow.lib.SignalStopHandler.__exit__

Here is my code:

import os
import pycisTopic
work_dir='AML_Multiomics'
fragments_dict = {'AML323_Ctrl': os.path.join(work_dir,'data/AML323_Ctrl_atac_fragments.tsv.gz')}

import scanpy as sc
adata = sc.read_h5ad(os.path.join(work_dir,'scRNA/adata.h5ad'))
cell_data=adata.obs
cell_data['sample_id']='AML323_Ctrl'
cell_data['celltype']=cell_data['celltype'].astype(str)
del(adata)

import requests
import pyranges as pr
import pandas as pd
chromsizes=pd.read_csv(os.path.join(work_dir,'hg38.chrom.sizes'),sep='\t',header=None)
chromsizes.columns=['Chromosome','End']
chromsizes['Start']=[0]*chromsizes.shape[0]
chromsizes=chromsizes.loc[:,['Chromosome','Start','End']]
chromsizes['Chromosome']=[chromsizes['Chromosome'][x].replace('v','.') for x in range(len(chromsizes['Chromosome']))]
chromsizes['Chromosome']=[chromsizes['Chromosome'][x].split('_')[1] if len(chromsizes['Chromosome'][x].split('_')) >1 else chromsizes['Chromosome'][x] for x in range(len(chromsizes['Chromosome']))]
chromsizes=pr.PyRanges(chromsizes)

from pycisTopic.pseudobulk_peak_calling import export_pseudobulk

Here is my version of some tools:

pycisTopic.__version__ '1.0.2.dev21+g219225d'
scanpy.__version__ '1.8.2'
polars.__version__ '0.16.4'
pandas.__version__ '1.5.3'

I also tried reading the fragments with pandas instead (use_polars=False), but that did not work either:

bw_paths, bed_paths = export_pseudobulk(
    input_data = cell_data,
    variable = 'celltype',
    sample_id_col = 'sample_id',
    chromsizes = chromsizes,
    bed_path = os.path.join(work_dir, 'scATAC/consensus_peak_calling/pseudobulk_bed_files/'),
    bigwig_path = os.path.join(work_dir, 'scATAC/consensus_peak_calling/pseudobulk_bw_files/'),
    path_to_fragments = fragments_dict,
    use_polars=False,
    n_cpu=8,
    normalize_bigwig = True,
    remove_duplicates = True,
    split_pattern = '-')

2023-02-17 13:54:53,368 cisTopic INFO Reading fragments from AML_Multiomics/data/AML323_Ctrl_atac_fragments.tsv.gz
Traceback (most recent call last):
  File "", line 1, in
  File "/home/xruan/.conda/envs/scenicplus/lib/python3.8/site-packages/pycisTopic/pseudobulk_peak_calling.py", line 116, in export_pseudobulk
    fragments_df = read_fragments_from_file(path_to_fragments[sample_id], use_polars=use_polars).df
  File "/home/xruan/.conda/envs/scenicplus/lib/python3.8/site-packages/pycisTopic/utils.py", line 418, in read_fragments_from_file
    df = pd.read_table(
  File "/home/xruan/.conda/envs/scenicplus/lib/python3.8/site-packages/pandas/util/_decorators.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "/home/xruan/.conda/envs/scenicplus/lib/python3.8/site-packages/pandas/util/_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "/home/xruan/.conda/envs/scenicplus/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 1289, in read_table
    return _read(filepath_or_buffer, kwds)
  File "/home/xruan/.conda/envs/scenicplus/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 611, in _read
    return parser.read(nrows)
  File "/home/xruan/.conda/envs/scenicplus/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 1778, in read
    ) = self._engine.read(  # type: ignore[attr-defined]
  File "/home/xruan/.conda/envs/scenicplus/lib/python3.8/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 232, in read
    data = _concatenate_chunks(chunks)
  File "/home/xruan/.conda/envs/scenicplus/lib/python3.8/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 402, in _concatenate_chunks
    result[name] = np.concatenate(arrs)  # type: ignore[arg-type]
  File "<__array_function__ internals>", line 180, in concatenate

Could you help me to solve this problem?

SeppeDeWinter commented 1 year ago

Hi @ruan2ruan

Can you show me the head of your fragments file please?

Best,

Seppe
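For anyone following along, one way to produce the requested head of a gzip-compressed fragments file without loading it fully into memory is sketched below. The helper name fragments_head is hypothetical (it is not part of pycisTopic), and the path in the usage comment is simply the one from the report above.

```python
import gzip
from itertools import islice


def fragments_head(path, n=5):
    """Return the first n lines of a gzip-compressed text file.

    Hypothetical helper for illustration; not a pycisTopic function.
    Lines are read lazily, so a large file is not loaded in full.
    """
    with gzip.open(path, "rt") as handle:
        return [line.rstrip("\n") for line in islice(handle, n)]


# Example usage (path taken from the report above; adjust to your setup):
# for line in fragments_head("AML_Multiomics/data/AML323_Ctrl_atac_fragments.tsv.gz"):
#     print(line)
```

Because only the first few lines are decompressed, this should also work on a machine where reading the whole file is interrupted or runs out of memory.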

ruan2ruan commented 1 year ago

Hi @SeppeDeWinter, thanks for your reply! Sorry to bother you. I found that the problem was caused by my server, and I have solved it. Thanks again.
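A side note for readers who hit a similar traceback: the first error above ends with "received signal 2". A small sketch for decoding such a number with Python's standard signal module follows; the helper name describe_signal is hypothetical. It only names the signal and does not say what sent it.

```python
import signal


def describe_signal(signum):
    """Return the symbolic name of a signal number, or a fallback string.

    Hypothetical helper for illustration only.
    """
    try:
        return signal.Signals(signum).name
    except ValueError:
        # The number does not correspond to a signal known on this platform.
        return f"unknown signal {signum}"


print(describe_signal(2))  # prints "SIGINT"
```

SIGINT is the interrupt signal (the one Ctrl+C sends), so the read was interrupted from outside rather than failing on the file contents, which fits a cause on the server side.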