loosolab / TF-COMB

Transcription Factor Co-Occurrence using Market Basket analysis
https://tf-comb.readthedocs.io
MIT License

Unable to plot distances for stranded TFBS #54

Closed by msbentsen 1 year ago

msbentsen commented 1 year ago

I received this error from a user and was able to reproduce it. After running count_within with stranded=True and the subsequent distance analysis, it is not possible to plot the distribution of TF pair distances.

Steps to reproduce:

import tfcomb.objects

C = tfcomb.objects.CombObj()
C.TFBS_from_motifs(regions="../data/GM12878_hg38_chr4_ATAC_peaks.bed", 
                   motifs="../data/HOCOMOCOv11_HUMAN_motifs.txt",
                   genome="../data/hg38_chr4.fa.gz", 
                   threads=4)
C.count_within(max_overlap=0.0, threads=4, stranded=True)
C.market_basket()

selection = C.select_significant_rules(x_threshold=0.5)

selection.analyze_distances(threads=6)  

TF1, TF2 = selection.distObj.peaks.iloc[0,:2]

selection.distObj.plot((TF1, TF2), style="hist")

This occurs due to a discrepancy between selection.count_names and selection.distObj.TF_names: the two attributes do not contain the same set of TF names.
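For anyone who wants to check for this kind of mismatch, here is a minimal sketch of comparing two name collections. The helper `name_discrepancy` is hypothetical and not part of TF-COMB, and the example names (including the "(+)"/"(-)" suffixes) are made-up assumptions for illustration, not output from the run above.

```python
def name_discrepancy(count_names, tf_names):
    """Return the names that appear in only one of the two collections."""
    count_set, tf_set = set(count_names), set(tf_names)
    return {
        "only_in_count_names": sorted(count_set - tf_set),
        "only_in_TF_names": sorted(tf_set - count_set),
    }


# Hypothetical values: the strand suffixes are an assumption about how
# stranded names might differ from unstranded ones.
count_names = ["CTCF(+)", "CTCF(-)", "SP1(+)"]
tf_names = ["CTCF", "SP1"]

diff = name_discrepancy(count_names, tf_names)
print(diff)
```

On a real object, the same helper could presumably be called with selection.count_names and selection.distObj.TF_names (the two attributes named above); if either list in the result is non-empty, the names are out of sync.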

I will create a PR with a fix, but I might ask for some help with review.

vheger commented 1 year ago

Fixed with #56.