aertslab / pycisTopic

pycisTopic is a Python module to simultaneously identify cell states and cis-regulatory topics from single cell epigenomics data.

UnicodeDecodeError (QC step) [BUG] #154

Closed jklvrt closed 3 months ago

jklvrt commented 3 months ago

Cannot get output for QC on snATAC part

Occurred in this step (QC per sample):

    import os

    # out_dir and fragments_dict come from the earlier steps.
    regions_bed_filename = os.path.join(out_dir, "consensus_peak_calling/consensus_regions.bed")
    tss_bed_filename = os.path.join(out_dir, "qc", "tss.bed")

    pycistopic_qc_commands_filename = "pycistopic_qc_commands.txt"

    # Create text file with all pycistopic qc command lines.
    with open(pycistopic_qc_commands_filename, "w") as fh:
        for sample, fragment_filename in fragments_dict.items():
            print(
                "pycistopic qc",
                f"--fragments {fragment_filename}",
                f"--regions {regions_bed_filename}",
                f"--tss {tss_bed_filename}",
                # Single quotes inside the f-string keep this valid before Python 3.12.
                f"--output {os.path.join(out_dir, 'qc')}/{sample}",
                sep=" ",
                file=fh,
            )

Followed by this from the terminal (with binaries on PATH):

    cat pycistopic_qc_commands.txt | parallel -j 4 {}

Error output:

    Traceback (most recent call last):
      File "/home/jonathan/DATA2/home/jonathan/miniconda3/envs/scenicplus/bin/pycistopic", line 8, in <module>
        sys.exit(main())
                 ^^^^^^
      File "/home/jonathan/DATA2/home/jonathan/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/cli/pycistopic.py", line 26, in main
        args.func(args)
      File "/home/jonathan/DATA2/home/jonathan/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/cli/subcommand/qc.py", line 233, in run_qc
        qc(
      File "/home/jonathan/DATA2/home/jonathan/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/cli/subcommand/qc.py", line 133, in qc
        fragments_df_pl = read_fragments_to_polars_df(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/jonathan/DATA2/home/jonathan/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/fragments.py", line 353, in read_fragments_to_polars_df
        fragments_df_pl = read_bed_to_polars_df(
                          ^^^^^^^^^^^^^^^^^^^^^^
      File "/home/jonathan/DATA2/home/jonathan/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/fragments.py", line 232, in read_bed_to_polars_df
        for line in bed_fh:
      File "<frozen codecs>", line 322, in decode
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

(The same traceback is printed a second time in the output, cut off after the final decode frame.)

Expected behavior: the fragment files look fine (although the extensions vary slightly: fragments are now in .bgz and the index in .bgz.tbi), and running tabix -p {fragment_file} chr1:1-10000000 shows the expected output. Previous steps were all fine too (export pseudobulk, peak calling, etc.).
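
A note on the error itself: byte 0x8b at position 1 matches the second byte of the gzip magic number (0x1f 0x8b), which BGZF files written by bgzip also start with. That suggests the .bgz file was being read as plain UTF-8 text rather than decompressed. Below is a minimal sketch to check this on a file. The helpers `looks_gzipped` and `decode_error_for` and the demo file name are illustrative only and are not part of pycisTopic.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip (and BGZF) stream


def looks_gzipped(path):
    """Return True if the file starts with the gzip magic number."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def decode_error_for(path):
    """Read the file as UTF-8 text; return the error message, or None if it decodes."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            for _ in fh:
                pass
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


# Hypothetical stand-in for a fragments file: one BED-like line,
# gzip-compressed and saved under a ".bgz" name as in the report.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo.fragments.tsv.bgz")
with gzip.open(demo_path, "wb") as fh:
    fh.write(b"chr1\t100\t200\tAAAC-1\t1\n")

is_gz = looks_gzipped(demo_path)
message = decode_error_for(demo_path)
print(is_gz)    # True
print(message)  # 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
```

If `looks_gzipped` returns True for a file that triggers this traceback, the data itself is probably fine and the problem is in how the file is being opened.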

Screenshots

# packages in environment at /home/jonathan/DATA2/home/jonathan/miniconda3/envs/scenicplus:
#
# Name Version Build Channel
_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
adjusttext 1.0.4 pypi_0 pypi
aiohttp 3.9.3 pypi_0 pypi
aiosignal 1.3.1 pypi_0 pypi
anndata 0.10.5.post1 pypi_0 pypi
annoy 1.17.3 pypi_0 pypi
appdirs 1.4.4 pypi_0 pypi
arboreto 0.1.6 pypi_0 pypi
argparse-dataclass 2.0.0 pypi_0 pypi
array-api-compat 1.5.1 pypi_0 pypi
asttokens 2.4.1 pypi_0 pypi
attr 0.3.2 pypi_0 pypi
attrs 23.2.0 pypi_0 pypi
backcall 0.2.0 pyhd3eb1b0_0 anaconda
bbknn 1.6.0 pypi_0 pypi
beautifulsoup4 4.12.3 pypi_0 pypi
bidict 0.23.1 pypi_0 pypi
bioservices 1.11.2 pypi_0 pypi
blosc2 2.5.1 pypi_0 pypi
bokeh 3.4.0 pypi_0 pypi
boltons 23.1.1 pypi_0 pypi
bs4 0.0.2 pypi_0 pypi
bzip2 1.0.8 h5eee18b_6
ca-certificates 2024.3.11 h06a4308_0
cattrs 23.2.3 pypi_0 pypi
certifi 2024.2.2 pypi_0 pypi
charset-normalizer 3.3.2 pypi_0 pypi
click 8.1.7 pypi_0 pypi
cloudpickle 3.0.0 pypi_0 pypi
colorama 0.4.6 pypi_0 pypi
colorlog 6.8.2 pypi_0 pypi
comm 0.1.2 py311h06a4308_0 anaconda
conda-inject 1.3.1 pypi_0 pypi
configargparse 1.7 pypi_0 pypi
connection-pool 0.0.3 pypi_0 pypi
contourpy 1.2.0 pypi_0 pypi
ctxcore 0.2.0 pypi_0 pypi
cycler 0.12.1 pypi_0 pypi
cython 0.29.37 pypi_0 pypi
cytoolz 0.12.3 pypi_0 pypi
dask 2024.2.1 pypi_0 pypi
dataclasses-json 0.6.4 pypi_0 pypi
datrie 0.8.2 pypi_0 pypi
debugpy 1.6.7 py311h6a678d5_0 anaconda
decorator 5.1.1 pyhd3eb1b0_0 anaconda
dill 0.3.8 pypi_0 pypi
distributed 2024.2.1 pypi_0 pypi
docutils 0.20.1 pypi_0 pypi
dpath 2.1.6 pypi_0 pypi
easydev 0.13.1 pypi_0 pypi
et-xmlfile 1.1.0 pypi_0 pypi
executing 2.0.1 pypi_0 pypi
fastjsonschema 2.19.1 pypi_0 pypi
fbpca 1.0 pypi_0 pypi
filelock 3.13.1 pypi_0 pypi
fonttools 4.50.0 pypi_0 pypi
frozendict 2.4.0 pypi_0 pypi
frozenlist 1.4.1 pypi_0 pypi
fsspec 2024.3.1 pypi_0 pypi
future 1.0.0 pypi_0 pypi
gensim 4.3.2 pypi_0 pypi
geosketch 1.2 pypi_0 pypi
gevent 24.2.1 pypi_0 pypi
gitdb 4.0.11 pypi_0 pypi
gitpython 3.1.42 pypi_0 pypi
globre 0.1.5 pypi_0 pypi
greenlet 3.0.3 pypi_0 pypi
grequests 0.7.0 pypi_0 pypi
gseapy 0.10.8 pypi_0 pypi
h5py 3.10.0 pypi_0 pypi
harmonypy 0.0.9 pypi_0 pypi
humanfriendly 10.0 pypi_0 pypi
idna 3.6 pypi_0 pypi
igraph 0.11.4 pypi_0 pypi
imageio 2.34.0 pypi_0 pypi
immutables 0.20 pypi_0 pypi
importlib-metadata 7.0.1 pypi_0 pypi
importlib-resources 6.1.2 pypi_0 pypi
interlap 0.2.7 pypi_0 pypi
intervaltree 3.1.0 pypi_0 pypi
ipykernel 6.25.0 py311h92b7b1e_0 anaconda
ipython 8.22.2 pypi_0 pypi
jedi 0.19.1 pypi_0 pypi
jinja2 3.1.3 pypi_0 pypi
joblib 1.3.2 pypi_0 pypi
jsonpickle 3.0.3 pypi_0 pypi
jsonschema 4.21.1 pypi_0 pypi
jsonschema-specifications 2023.12.1 pypi_0 pypi
jupyter-core 5.7.2 pypi_0 pypi
jupyter_client 8.6.0 py311h06a4308_0 anaconda
jupyter_core 5.5.0 py311h06a4308_0 anaconda
kaleido 0.2.1 pypi_0 pypi
kiwisolver 1.4.5 pypi_0 pypi
lazy-loader 0.3 pypi_0 pypi
ld_impl_linux-64 2.38 h1181459_1
lda 3.0.0 pypi_0 pypi
leidenalg 0.10.2 pypi_0 pypi
libffi 3.4.4 h6a678d5_1
libgcc-ng 11.2.0 h1234567_1
libgomp 11.2.0 h1234567_1
libsodium 1.0.18 h7b6447c_0 anaconda
libstdcxx-ng 11.2.0 h1234567_1
libuuid 1.41.5 h5eee18b_0
line-profiler 4.1.2 pypi_0 pypi
llvmlite 0.42.0 pypi_0 pypi
locket 1.0.0 pypi_0 pypi
loompy 3.0.7 pypi_0 pypi
loomxpy 0.4.2 pypi_0 pypi
lxml 5.1.0 pypi_0 pypi
lz4 4.3.3 pypi_0 pypi
macs2 2.2.9.1 pypi_0 pypi
markdown-it-py 3.0.0 pypi_0 pypi
markupsafe 2.1.5 pypi_0 pypi
marshmallow 3.21.1 pypi_0 pypi
matplotlib 3.6.3 pypi_0 pypi
matplotlib-inline 0.1.6 py311h06a4308_0 anaconda
mdurl 0.1.2 pypi_0 pypi
mizani 0.9.3 pypi_0 pypi
msgpack 1.0.8 pypi_0 pypi
mudata 0.2.3 pypi_0 pypi
multidict 6.0.5 pypi_0 pypi
multiprocessing-on-dill 3.5.0a4 pypi_0 pypi
mypy-extensions 1.0.0 pypi_0 pypi
natsort 8.4.0 pypi_0 pypi
nbformat 5.10.3 pypi_0 pypi
ncls 0.0.68 pypi_0 pypi
ncurses 6.4 h6a678d5_0
ndindex 1.8 pypi_0 pypi
nest-asyncio 1.5.6 py311h06a4308_0 anaconda
networkx 3.2.1 pypi_0 pypi
numba 0.59.0 pypi_0 pypi
numexpr 2.9.0 pypi_0 pypi
numpy 1.26.4 pypi_0 pypi
numpy-groupies 0.10.2 pypi_0 pypi
openpyxl 3.1.2 pypi_0 pypi
openssl 3.0.14 h5eee18b_0
packaging 24.0 pypi_0 pypi
pandas 1.5.0 pypi_0 pypi
parso 0.8.3 pyhd3eb1b0_0 anaconda
partd 1.4.1 pypi_0 pypi
patsy 0.5.6 pypi_0 pypi
pexpect 4.9.0 pypi_0 pypi
pickleshare 0.7.5 pyhd3eb1b0_1003 anaconda
pillow 10.2.0 pypi_0 pypi
pip 24.0 py311h06a4308_0
plac 1.4.3 pypi_0 pypi
platformdirs 4.2.0 pypi_0 pypi
plotly 5.19.0 pypi_0 pypi
plotnine 0.12.4 pypi_0 pypi
polars 0.20.13 pypi_0 pypi
progressbar2 4.4.2 pypi_0 pypi
prompt-toolkit 3.0.43 pypi_0 pypi
protobuf 5.26.0 pypi_0 pypi
psutil 5.9.8 pypi_0 pypi
ptyprocess 0.7.0 pyhd3eb1b0_2 anaconda
pulp 2.8.0 pypi_0 pypi
pure_eval 0.2.2 pyhd3eb1b0_0 anaconda
py-cpuinfo 9.0.0 pypi_0 pypi
pyarrow 15.0.0 pypi_0 pypi
pyarrow-hotfix 0.6 pypi_0 pypi
pybedtools 0.9.1 pypi_0 pypi
pybigtools 0.1.2 pypi_0 pypi
pybigwig 0.3.22 pypi_0 pypi
pybiomart 0.2.0 pypi_0 pypi
pycistarget 1.0a2 pypi_0 pypi
pycistopic 2.0a0 pypi_0 pypi
pyfasta 0.5.2 pypi_0 pypi
pygam 0.9.0 pypi_0 pypi
pygments 2.17.2 pypi_0 pypi
pynndescent 0.5.11 pypi_0 pypi
pyparsing 3.1.2 pypi_0 pypi
pyranges 0.0.111 pypi_0 pypi
pyrle 0.0.39 pypi_0 pypi
pysam 0.22.0 pypi_0 pypi
pyscenic 0.12.1+8.gd2309fe pypi_0 pypi
python 3.11.9 h955ad1f_0
python-dateutil 2.9.0.post0 pypi_0 pypi
python-utils 3.8.2 pypi_0 pypi
pytz 2024.1 pypi_0 pypi
pyvis 0.3.2 pypi_0 pypi
pyyaml 6.0.1 pypi_0 pypi
pyzmq 25.1.0 py311h6a678d5_0 anaconda
ray 2.9.3 pypi_0 pypi
readline 8.2 h5eee18b_0
referencing 0.34.0 pypi_0 pypi
requests 2.31.0 pypi_0 pypi
requests-cache 1.2.0 pypi_0 pypi
reretry 0.11.8 pypi_0 pypi
rich 13.7.1 pypi_0 pypi
rich-argparse 1.4.0 pypi_0 pypi
rpds-py 0.18.0 pypi_0 pypi
scanorama 1.7.4 pypi_0 pypi
scanpy 1.8.2 pypi_0 pypi
scatac-fragment-tools 0.1.0 pypi_0 pypi
scenicplus 1.0a1 pypi_0 pypi
scikit-image 0.22.0 pypi_0 pypi
scikit-learn 1.3.2 pypi_0 pypi
scipy 1.12.0 pypi_0 pypi
scrublet 0.2.3 pypi_0 pypi
seaborn 0.13.2 pypi_0 pypi
setuptools 69.5.1 py311h06a4308_0
sinfo 0.3.4 pypi_0 pypi
six 1.16.0 pyhd3eb1b0_1 anaconda
smart-open 6.4.0 pypi_0 pypi
smmap 5.0.1 pypi_0 pypi
snakemake 8.5.5 pypi_0 pypi
snakemake-interface-common 1.17.1 pypi_0 pypi
snakemake-interface-executor-plugins 8.2.0 pypi_0 pypi
snakemake-interface-report-plugins 1.0.0 pypi_0 pypi
snakemake-interface-storage-plugins 3.1.1 pypi_0 pypi
sorted-nearest 0.0.39 pypi_0 pypi
sortedcontainers 2.4.0 pypi_0 pypi
soupsieve 2.5 pypi_0 pypi
sqlite 3.45.3 h5eee18b_0
stack-data 0.6.3 pypi_0 pypi
stack_data 0.2.0 pyhd3eb1b0_0 anaconda
statistics 1.0.3.5 pypi_0 pypi
statsmodels 0.14.1 pypi_0 pypi
stdlib-list 0.10.0 pypi_0 pypi
stopit 1.1.2 pypi_0 pypi
suds-community 1.1.2 pypi_0 pypi
tables 3.9.2 pypi_0 pypi
tabulate 0.9.0 pypi_0 pypi
tblib 3.0.0 pypi_0 pypi
tenacity 8.2.3 pypi_0 pypi
texttable 1.7.0 pypi_0 pypi
threadpoolctl 3.4.0 pypi_0 pypi
throttler 1.2.2 pypi_0 pypi
tifffile 2024.2.12 pypi_0 pypi
tk 8.6.14 h39e8969_0
tmtoolkit 0.12.0 pypi_0 pypi
toolz 0.12.1 pypi_0 pypi
toposort 1.10 pypi_0 pypi
tornado 6.4 pypi_0 pypi
tqdm 4.66.2 pypi_0 pypi
traitlets 5.14.2 pypi_0 pypi
tspex 0.6.3 pypi_0 pypi
typing 3.7.4.3 pypi_0 pypi
typing-extensions 4.10.0 pypi_0 pypi
typing-inspect 0.9.0 pypi_0 pypi
tzdata 2024a h04d1e81_0
umap-learn 0.5.5 pypi_0 pypi
url-normalize 1.4.3 pypi_0 pypi
urllib3 2.2.1 pypi_0 pypi
wcwidth 0.2.13 pypi_0 pypi
wheel 0.43.0 py311h06a4308_0
wrapt 1.16.0 pypi_0 pypi
xlrd 2.0.1 pypi_0 pypi
xmltodict 0.13.0 pypi_0 pypi
xyzservices 2023.10.1 pypi_0 pypi
xz 5.4.6 h5eee18b_1
yarl 1.9.4 pypi_0 pypi
yte 1.5.4 pypi_0 pypi
zeromq 4.3.4 h2531618_0 anaconda
zict 3.0.0 pypi_0 pypi
zipp 3.18.1 pypi_0 pypi
zlib 1.2.13 h5eee18b_1
zope-event 5.0 pypi_0 pypi
zope-interface 6.2 pypi_0 pypi


jklvrt commented 3 months ago

Never mind, this issue is solved (it was indeed a problem with the bgzip/tabix formatting)!
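
For anyone landing here with the same traceback: the thread does not record what exactly was changed. One possible workaround, assuming (not confirmed in this thread) that the reader chooses between gzip and plain text based on a `.gz` suffix, is to expose the same files under `.gz` names. tabix looks for the index at the data file name plus `.tbi`, so the index gets the matching name too. `gz_alias`, `link_with_gz_names` and the sample file names below are hypothetical, a sketch rather than anything shipped with pycisTopic.

```python
import os
import tempfile


def gz_alias(path, old_suffix=".bgz", new_suffix=".gz"):
    """Swap a trailing .bgz (or .bgz.tbi) for .gz (or .gz.tbi); leave other names alone."""
    for tail in ("", ".tbi"):
        if path.endswith(old_suffix + tail):
            return path[: -len(old_suffix + tail)] + new_suffix + tail
    return path


def link_with_gz_names(paths, link_dir):
    """Symlink each file into link_dir under its .gz-style name; return the link paths."""
    os.makedirs(link_dir, exist_ok=True)
    linked = []
    for path in paths:
        target = os.path.join(link_dir, os.path.basename(gz_alias(path)))
        if not os.path.lexists(target):
            os.symlink(os.path.abspath(path), target)
        linked.append(target)
    return linked


# Demo with empty placeholder files standing in for a fragments file and its index.
src_dir = tempfile.mkdtemp()
originals = []
for name in ("s1.fragments.tsv.bgz", "s1.fragments.tsv.bgz.tbi"):
    path = os.path.join(src_dir, name)
    open(path, "wb").close()
    originals.append(path)

linked = link_with_gz_names(originals, os.path.join(src_dir, "gz_names"))
print([os.path.basename(p) for p in linked])
# ['s1.fragments.tsv.gz', 's1.fragments.tsv.gz.tbi']
```

Symlinks leave the original files untouched, so earlier steps that already point at the `.bgz` paths keep working, and `fragments_dict` can be pointed at the linked paths before the QC commands are generated.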