JiekaiLab / scTE


gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated #43

Open alexlenail opened 1 year ago

alexlenail commented 1 year ago

I filtered the BAM file using this awk command, keeping only the header lines and the reads that carry both CB and UB tags:

samtools view possorted_genome_bam.bam -h | awk '/^@/ || /CB:/ && /UB:/' | samtools view -h -b > possorted_genome_bam.filtered.bam
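
Note that the awk patterns `/CB:/` and `/UB:/` match anywhere in the line, not only in the optional tag fields. As a rough sketch of a stricter check, the same filter could be written in Python. This assumes standard SAM text with optional fields starting at column 12; the function names `keep_sam_line` and `filter_sam` are made up for illustration and are not part of samtools or scTE.

```python
def keep_sam_line(line, tags=("CB", "UB")):
    """Return True for header lines and for alignments carrying every tag in `tags`."""
    if line.startswith("@"):
        return True  # SAM header line, always kept
    # Columns 1-11 are mandatory; optional fields (TAG:TYPE:VALUE) start at column 12.
    optional = line.rstrip("\n").split("\t")[11:]
    present = {field[:2] for field in optional}
    return all(tag in present for tag in tags)


def filter_sam(lines, out):
    """Write only the lines accepted by keep_sam_line to the file-like object `out`."""
    for line in lines:
        if keep_sam_line(line):
            out.write(line)
```

Calling `filter_sam(sys.stdin, sys.stdout)` from a small script placed between the two `samtools view` commands would take the place of the awk step.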

Then I ran:

scTE -i possorted_genome_bam.filtered.bam -o out -x /home/lenail/scTE/hg38.exclusive.idx --hdf5 True -CB CB -UMI UB --thread 4

But I get this error:

DEBUG   : Creating converter from 7 to 5
DEBUG   : Creating converter from 5 to 7
DEBUG   : Creating converter from 7 to 5
DEBUG   : Creating converter from 5 to 7
INFO    : Parameter list:
Sample = /net/bmc-lab5/data/kellis/users/lenail/PFC_aging/scTE/D19-4296/out
Reference annotation index = /home/lenail/scTE/hg38.exclusive.idx
Minimum number of genes required = 200
Minimum number of counts required = None
Number of threads = 4

INFO    : Loading the genome annotation index... 2022-08-25 21:40:48
INFO    : Loaded '/home/lenail/scTE/hg38.exclusive.idx' binary file with 4778929 items
INFO    : Finished loading the genome annotation index... 2022-08-25 21:41:42

INFO    : Processing BAM/SAM files ...2022-08-25 21:41:42
INFO    : Input SAM/BAM file appears to be valid
INFO    : Done BAM/SAM files processing ...2022-08-25 23:58:40

INFO    : Splitting ...2022-08-25 23:58:40
INFO    : Executing multiple thread path with 4 threads
['1', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '2', '20', '21', '22', '3', '4', '5', '6', '7', '8', '9', 'M', 'X', 'Y']
CB UB good

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated

gzip: out_scTEtmp/o1/out.bed.gz: invalid compressed data--format violated
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/lenail/.conda/envs/py39/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/lenail/.conda/envs/py39/lib/python3.9/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/home/lenail/.conda/envs/py39/lib/python3.9/site-packages/scTE-1.0-py3.9.egg/scTE/base.py", line 366, in splitChr
    CRs[t[3]] += 1
IndexError: list index out of range
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/lenail/.conda/envs/py39/bin/scTE", line 4, in <module>
    __import__('pkg_resources').run_script('scTE==1.0', 'scTE')
  File "/home/lenail/.conda/envs/py39/lib/python3.9/site-packages/pkg_resources/__init__.py", line 672, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/home/lenail/.conda/envs/py39/lib/python3.9/site-packages/pkg_resources/__init__.py", line 1472, in run_script
    exec(code, namespace, namespace)
  File "/home/lenail/.conda/envs/py39/lib/python3.9/site-packages/scTE-1.0-py3.9.egg/EGG-INFO/scripts/scTE", line 169, in <module>
    main()
  File "/home/lenail/.conda/envs/py39/lib/python3.9/site-packages/scTE-1.0-py3.9.egg/EGG-INFO/scripts/scTE", line 134, in main
    pool.map(partial_work, chr_list)
  File "/home/lenail/.conda/envs/py39/lib/python3.9/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/lenail/.conda/envs/py39/lib/python3.9/multiprocessing/pool.py", line 771, in get
    raise self._value
IndexError: list index out of range
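
One way to narrow this down might be to check the intermediate file directly. The sketch below is only a diagnostic under stated assumptions: it assumes `out_scTEtmp/o1/out.bed.gz` is a gzipped, tab-separated file and that `t[3]` in `splitChr` refers to the fourth column. `scan_bed_gz` is a hypothetical helper, not part of scTE.

```python
import gzip
import zlib


def scan_bed_gz(path, min_fields=4):
    """Read a gzipped tab-separated file and return (lines_read, short_lines, error).

    `short_lines` counts lines with fewer than `min_fields` columns.
    `error` is None if the whole file decompressed cleanly, otherwise a
    short description of the exception raised while reading.
    """
    lines_read = 0
    short_lines = 0
    error = None
    try:
        # Binary mode avoids decoding errors if the data is damaged.
        with gzip.open(path, "rb") as handle:
            for line in handle:
                lines_read += 1
                if len(line.rstrip(b"\r\n").split(b"\t")) < min_fields:
                    short_lines += 1
    except (OSError, EOFError, zlib.error) as exc:
        # gzip.BadGzipFile is a subclass of OSError; truncated files raise EOFError.
        error = f"{type(exc).__name__}: {exc}"
    return lines_read, short_lines, error


if __name__ == "__main__":
    # Path taken from the error message above; adjust to the actual run directory.
    print(scan_bed_gz("out_scTEtmp/o1/out.bed.gz"))
```

A non-`None` error would suggest the file itself is truncated or corrupted (for example by a full disk or an interrupted write), while a non-zero `short_lines` count with no error would point to malformed records instead. Running `gzip -t` on the same file is a quicker integrity check, but it does not report short lines.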

Any ideas?