morris-lab / CellOracle

This is the alpha version of the CellOracle package

Problem when using a custom genomes_dir for TF scan #193

Open VictorCurean opened 3 months ago

VictorCurean commented 3 months ago

Hi,

I had a very weird bug when scanning TF binding motifs.

I followed the instructions from the notebook but with the added change of specifying a custom genomes directory:

`ref_genome = "hg38"`

`genome_installation = ma.is_genome_installed(ref_genome=ref_genome, genomes_dir="./genomes")`

`print(ref_genome, "installation: ", genome_installation)`

This worked just fine as the output was:

hg38 installation: True

Next, I loaded my peaks file and ran the TF scan command:

tfi = ma.TFinfo(peak_data_frame=peaks, ref_genome=ref_genome, genomes_dir="./genomes")

tfi.scan(fpr=0.02, motifs=None, verbose=True)

tfi.to_hdf5(file_path="T98G.celloracle.tfinfo")

The first time I ran this in my notebook I got a FileNotFoundError for my genome, since it was looking in my home folder under usr/.local/share/genomes/hg38 instead of the custom directory.

Weirdly enough, when I ran these commands again I did not get the same error; instead, the execution just got stuck at: DEBUG - using background: genome hg38 with size 200.

Working around this issue is pretty straightforward: move your genome into your home folder, and everything works fine after that. But it took a lot of time to figure out, since you only get the error the first time you run it, which I assume has to do with the way set_background is implemented in gimmemotifs.
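For anyone debugging the same hang, a small pre-flight check can turn the silent stall into an explicit error before `tfi.scan` is called. This is only a sketch and not part of CellOracle or gimmemotifs: `genome_fasta_path` and `check_genome_present` are hypothetical helper names, and the layout `<genomes_dir>/<name>/<name>.fa` is an assumption about how the genome is stored on disk.

```python
import os


def genome_fasta_path(ref_genome, genomes_dir):
    """Build the expected FASTA path for a genome.

    Assumed layout (not confirmed by this thread):
    <genomes_dir>/<ref_genome>/<ref_genome>.fa
    """
    base = os.path.expanduser(genomes_dir)
    return os.path.join(base, ref_genome, ref_genome + ".fa")


def check_genome_present(ref_genome, genomes_dir):
    """Return the FASTA path, or raise FileNotFoundError if it is missing."""
    path = genome_fasta_path(ref_genome, genomes_dir)
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Expected genome FASTA at {path}")
    return path
```

Pass whichever directory you expect the scan to actually read from. If the check fails for the default location but passes for "./genomes", you are probably hitting the behaviour described above.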

Posting this for future reference, for anyone who got stuck at this step.

siduanmiao commented 3 months ago

Thank you for your kindness! It's really helpful!

huzhimaye commented 2 months ago

Changing line 400 of celloracle/motif_analysis/tfinfo_core.py

from

s.set_background(genome=self.ref_genome, size=background_length) # For gimmemotifs ver 14.4

to

s.set_background(genome=os.path.join(self.genomes_dir, self.ref_genome), size=background_length)

should fix it.
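One caveat with the one-line change: `os.path.join` raises a TypeError if its first argument is None, and it needs `os` to be imported in that module. I have not checked what `genomes_dir` defaults to in CellOracle, so the sketch below guards against that case. `resolve_genome` is a hypothetical helper name, not something that exists in CellOracle or gimmemotifs.

```python
import os


def resolve_genome(ref_genome, genomes_dir=None):
    """Return the value to pass as the `genome` argument.

    If no custom directory is given, fall back to the bare genome name
    (the original behaviour). Otherwise join the directory and the name,
    as in the fix proposed above.
    """
    if genomes_dir is None:
        return ref_genome
    return os.path.join(genomes_dir, ref_genome)
```

With a helper like this, the patched line would read `s.set_background(genome=resolve_genome(self.ref_genome, self.genomes_dir), size=background_length)`, which keeps the old behaviour for users who never set a custom directory.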