I am using Python 3.10.14 and celloracle 0.18.0. I ran into a problem when running the following code to process my own base GRN data:
tfi.scan(fpr=0.02, motifs=motifs, verbose=True)
After filtering I have 15,874 peaks, but the scan runs for days at a time and finally fails with the following error:
2024-09-26 19:56:06,277 - DEBUG - using background: genome mm10 with size 200
Peaks before filtering: 15876
Peaks with invalid chr_name: 0
Peaks with invalid length: 2
Peaks after filtering: 15874
Checking your motifs... Motifs format looks good.
Initiating scanner...
Calculating FPR-based threshold. This step may take substantial time when you load a new ref-genome. It will be done quicker on the second time.
Traceback (most recent call last):
File "/media/hdd/zjp_workplace/my-celloracle/mytest/test.py", line 34, in <module>
tfi.scan(fpr=0.02,
File "/media/ssd/anaconda3/envs/LINGER_1/lib/python3.10/site-packages/celloracle/motif_analysis/tfinfo_core.py", line 411, in scan
target_sequences = peak2fasta(peak_ids=self.all_peaks, ref_genome=self.ref_genome, genomes_dir=self.genomes_dir)
File "/media/ssd/anaconda3/envs/LINGER_1/lib/python3.10/site-packages/celloracle/motif_analysis/process_bed_file.py", line 207, in peak2fasta
name, seq = peak2seq(peak_id)
File "/media/ssd/anaconda3/envs/LINGER_1/lib/python3.10/site-packages/celloracle/motif_analysis/process_bed_file.py", line 196, in peak2seq
tmp = genome_data[chromosome_name][locus[0]:locus[1]]
File "/media/ssd/anaconda3/envs/LINGER_1/lib/python3.10/site-packages/pyfaidx/__init__.py", line 914, in __getitem__
return self._fa.get_seq(self.name, start + 1, stop)[::step]
File "/media/ssd/anaconda3/envs/LINGER_1/lib/python3.10/site-packages/pyfaidx/__init__.py", line 1143, in get_seq
seq = self.faidx.fetch(name, start, end)
File "/media/ssd/anaconda3/envs/LINGER_1/lib/python3.10/site-packages/pyfaidx/__init__.py", line 721, in fetch
seq = self.from_file(name, start, end)
File "/media/ssd/anaconda3/envs/LINGER_1/lib/python3.10/site-packages/pyfaidx/__init__.py", line 765, in from_file
chunk_seq = self.file.read(chunk).decode()
File "/media/ssd/anaconda3/envs/LINGER_1/lib/python3.10/site-packages/Bio/bgzf.py", line 721, in read
self._load_block() # will reset offsets
File "/media/ssd/anaconda3/envs/LINGER_1/lib/python3.10/site-packages/Bio/bgzf.py", line 644, in _load_block
block_size, self._buffer = _load_bgzf_block(handle, self._text)
File "/media/ssd/anaconda3/envs/LINGER_1/lib/python3.10/site-packages/Bio/bgzf.py", line 477, in _load_bgzf_block
raise RuntimeError("Decompressed to %i, not %i" % (len(data), expected_size))
RuntimeError: Decompressed to 65283, not 65280
The same error occurs when I test with the sample data. How can I solve this problem? Thank you.
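Since the final RuntimeError is raised in Bio/bgzf.py while decompressing a block, one thing that could be checked is whether the compressed reference genome file itself is intact. Below is a minimal sketch of such a check, not a confirmed diagnosis. It assumes the genome FASTA is stored BGZF/gzip-compressed; the path and the helper name gzip_is_intact are hypothetical, and it uses only Python's standard gzip module rather than any celloracle or pyfaidx API.

```python
import gzip
import os
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Decompress the whole file once and report whether it read cleanly.

    BGZF files are valid multi-member gzip files, so the standard gzip
    module can read them; it verifies the CRC and length of each member.
    Returns a tuple (ok, message).
    """
    total = 0
    try:
        with gzip.open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
    except (OSError, EOFError, zlib.error) as exc:
        # gzip.BadGzipFile is a subclass of OSError (CRC/length mismatch);
        # EOFError signals a truncated file; zlib.error a corrupt stream.
        return False, f"{type(exc).__name__}: {exc}"
    return True, f"decompressed {total} bytes without errors"


# Hypothetical location of the compressed genome; adjust to the real file.
genome_path = "path/to/mm10.fa.gz"
if os.path.exists(genome_path):
    print(gzip_is_intact(genome_path))
```

If this reports a failure, the genome file is likely corrupted or incompletely downloaded, and downloading it again would be the next thing to try. If it passes, the problem may lie elsewhere, for example in an index file that no longer matches the genome file.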