morris-lab / CellOracle

This is the alpha version of the CellOracle package

Scan motifs error! tfi.scan(fpr=0.02,motifs=motifs, verbose=True) #216

Open zjprookie opened 2 months ago

zjprookie commented 2 months ago

I am using Python 3.10.14 and celloracle 0.18.0. I ran into a problem when running the following code to process my own base GRN data: tfi.scan(fpr=0.02, motifs=motifs, verbose=True). I have 15,874 peaks after filtering, but the scan runs for days at a time and finally fails with the error below.

2024-09-26 19:56:06,277 - DEBUG - using background: genome mm10 with size 200
Peaks before filtering: 15876
Peaks with invalid chr_name: 0
Peaks with invalid length: 2
Peaks after filtering: 15874
Checking your motifs... Motifs format looks good.

Initiating scanner...

Calculating FPR-based threshold. This step may take substantial time when you load a new ref-genome. It will be done quicker on the second time.

Traceback (most recent call last):
  File "/media/hdd/zjp_workplace/my-celloracle/mytest/test.py", line 34, in <module>
    tfi.scan(fpr=0.02,
  File "/media/ssd/anaconda3/envs/LINGER_1/lib/python3.10/site-packages/celloracle/motif_analysis/tfinfo_core.py", line 411, in scan
    target_sequences = peak2fasta(peak_ids=self.all_peaks, ref_genome=self.ref_genome, genomes_dir=self.genomes_dir)
  File "/media/ssd/anaconda3/envs/LINGER_1/lib/python3.10/site-packages/celloracle/motif_analysis/process_bed_file.py", line 207, in peak2fasta
    name, seq = peak2seq(peak_id)
  File "/media/ssd/anaconda3/envs/LINGER_1/lib/python3.10/site-packages/celloracle/motif_analysis/process_bed_file.py", line 196, in peak2seq
    tmp = genome_data[chromosome_name][locus[0]:locus[1]]
  File "/media/ssd/anaconda3/envs/LINGER_1/lib/python3.10/site-packages/pyfaidx/__init__.py", line 914, in __getitem__
    return self._fa.get_seq(self.name, start + 1, stop)[::step]
  File "/media/ssd/anaconda3/envs/LINGER_1/lib/python3.10/site-packages/pyfaidx/__init__.py", line 1143, in get_seq
    seq = self.faidx.fetch(name, start, end)
  File "/media/ssd/anaconda3/envs/LINGER_1/lib/python3.10/site-packages/pyfaidx/__init__.py", line 721, in fetch
    seq = self.from_file(name, start, end)
  File "/media/ssd/anaconda3/envs/LINGER_1/lib/python3.10/site-packages/pyfaidx/__init__.py", line 765, in from_file
    chunk_seq = self.file.read(chunk).decode()
  File "/media/ssd/anaconda3/envs/LINGER_1/lib/python3.10/site-packages/Bio/bgzf.py", line 721, in read
    self._load_block()  # will reset offsets
  File "/media/ssd/anaconda3/envs/LINGER_1/lib/python3.10/site-packages/Bio/bgzf.py", line 644, in _load_block
    block_size, self._buffer = _load_bgzf_block(handle, self._text)
  File "/media/ssd/anaconda3/envs/LINGER_1/lib/python3.10/site-packages/Bio/bgzf.py", line 477, in _load_bgzf_block
    raise RuntimeError("Decompressed to %i, not %i" % (len(data), expected_size))
RuntimeError: Decompressed to 65283, not 65280

The same error occurs when I test with the sample data, so I would like to know how to solve this problem. Thank you.
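
The last frames of the traceback are in Bio/bgzf.py, which suggests the reference genome FASTA is being read as a BGZF (bgzip) file and that one block did not decompress to its recorded size. One possible first check, offered as an assumption rather than a confirmed diagnosis, is to read the compressed file end to end with Python's standard gzip module (BGZF files are also valid multi-member gzip files) and see whether that fails too. The function name gzip_stream_is_intact and the path in the usage comment are hypothetical; the actual genome file location depends on the genomes_dir setting.

```python
import gzip
import zlib


def gzip_stream_is_intact(path, chunk_size=1 << 20):
    """Read a gzip/BGZF file end to end.

    Returns (ok, decompressed_bytes_read, error_message).
    ok is False if the stream is truncated or fails a CRC/format check.
    """
    total = 0
    try:
        with gzip.open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
    except (OSError, EOFError, zlib.error) as exc:
        # gzip.BadGzipFile is a subclass of OSError; EOFError signals truncation.
        return False, total, str(exc)
    return True, total, ""


# Usage (hypothetical path, replace with the real compressed genome file):
# print(gzip_stream_is_intact("/path/to/genomes_dir/mm10/mm10.fa.gz"))
```

If this check reports a failure, the compressed genome file itself may be damaged and re-downloading it could be worth trying. If it passes, the cause is likely elsewhere.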