vanheeringen-lab / gimmemotifs

Suite of motif tools, including a motif prediction pipeline for ChIP-seq experiments. See full GimmeMotifs documentation for detailed installation instructions and usage examples.
https://gimmemotifs.readthedocs.io/en/master
MIT License

Provide background peaks file as .bed instead of FASTA? #277

Open apposada opened 2 years ago

apposada commented 2 years ago

Hello,

I am opening this as an issue of type 'other' because I am not sure where else to ask. I am trying to run gimme motifs with a custom, non-genomepy genome and I get the following error:

(gimme_venv) alberto@nostromo:/mnt/sda/alberto/projects/ananse_smed/outputs/gimme$ gimme motifs -b 202110XX_peaks_goodformat.bed -g /mnt/sda/alberto/DATA/dynamic/genomes/Smed/Guo/Smed.fasta /mnt/sda/alberto/projects/ananse_smed/outputs/wgcna/20220812_wgcna_smed_black_assocpeaks.bed  outdir
2022-08-22 13:34:23,208 - INFO - starting full motif analysis
2022-08-22 13:34:23,223 - INFO - using size of 200, set size to 0 to use original region size
2022-08-22 13:34:23,224 - INFO - preparing input from BED
2022-08-22 13:34:23,338 - INFO - Copying custom background file 202110XX_peaks_goodformat.bed to outdir/intermediate/prediction.bg.fa.
Traceback (most recent call last):
  File "/mnt/sda/alberto/programs/miniconda3/envs/gimme_venv/bin/gimme", line 11, in <module>
    cli(sys.argv[1:])
  File "/mnt/sda/alberto/programs/miniconda3/envs/gimme_venv/lib/python3.9/site-packages/gimmemotifs/cli.py", line 748, in cli
    args.func(args)
  File "/mnt/sda/alberto/programs/miniconda3/envs/gimme_venv/lib/python3.9/site-packages/gimmemotifs/commands/motifs.py", line 111, in motifs
    denovo = gimme_motifs(
  File "/mnt/sda/alberto/programs/miniconda3/envs/gimme_venv/lib/python3.9/site-packages/gimmemotifs/denovo.py", line 625, in gimme_motifs
    background = create_backgrounds(
  File "/mnt/sda/alberto/programs/miniconda3/envs/gimme_venv/lib/python3.9/site-packages/gimmemotifs/denovo.py", line 330, in create_backgrounds
    create_background(
  File "/mnt/sda/alberto/programs/miniconda3/envs/gimme_venv/lib/python3.9/site-packages/gimmemotifs/denovo.py", line 279, in create_background
    f = Fasta(bg_file)
  File "/mnt/sda/alberto/programs/miniconda3/envs/gimme_venv/lib/python3.9/site-packages/gimmemotifs/fasta.py", line 25, in __init__
    raise IOError("Not a valid FASTA file")
OSError: Not a valid FASTA file

I am unable to detect any problem with the FASTA file of the genome:

(gimme_venv) alberto@nostromo:/mnt/sda/alberto/projects/ananse_smed/outputs/gimme$ head /mnt/sda/alberto/DATA/dynamic/genomes/Smed/Guo/Smed.fasta
>smed_chr1
TCACTTTTCCAATTCAAAAATTGTAGTAATCGCGTTGGTTTGTTCCCCAAAGAGGTGATC
AATCTCTCGCCAGGAACTCCCCAACATTTGCGACGGATCCTACTACCTTCCTCTCCTCGG
GTTGATCAAGGTTCCCCATCGGAATAAGCGTACTATCGTTATGATAGCTCGGTTTTAAAT
CTTACTTTCCAGTATTATTTCCATATAATATAAATAACATTACAAAATATACTAACAAAA
TGATGCATCGAAACTTTTACCGTACCTTATGTGTGATTTTTCAGCTTTCACTGTTTAAGT
TGTGTTTTACTTGTCCACTCAGAATTTATTTAACGGCTTTCATGAAGTTTTCGGCTTTTT
TCCGTATGTTTTACACCTTTCGGTTGCTATTGTTGTCTCATTTATTATTATTATTAATCT
CAACTCATCGTACTCGTTTTTTTTAATCATCATTTTCCTCTGGTTGCTAACTATTATTCT
CGTAATTCATTATTATCTCTAATCCTTTATTTTCATTAATAACAGGCATCATCATTTTAT
(gimme_venv) alberto@nostromo:/mnt/sda/alberto/projects/ananse_smed/outputs/gimme$ grep ">" /mnt/sda/alberto/DATA/dynamic/genomes/Smed/Guo/Smed.fasta
>smed_chr1
>smed_chr2
>smed_chr3
>smed_chr4
>SmedMT

Correct me if I am wrong, but I understand the error refers to trying to pass the background peaks as a .bed file. Is it possible to do this, or must the background peak file be in .fasta format? I could not find any related info in the documentation.

Thanks in advance,

Alberto

simonvh commented 2 years ago

Hi @apposada, this should work. However, I think this is a bug that was introduced when I simplified the gimme motifs command-line interface. At first glance I could not see how to fix it, so it will need more detailed bug hunting. I will get back to you!
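
Until the bug is tracked down, one possible workaround is to convert the background BED file to FASTA yourself and pass that file to -b. The log above shows the BED file being copied unchanged to prediction.bg.fa and then opened as FASTA, which is where the error is raised. The sketch below uses only the Python standard library. The function names read_fasta and bed_to_fasta and the output file name are made up for illustration and are not part of GimmeMotifs. It assumes BED coordinates are 0-based and half-open, that chromosome names in the BED file match the first word of the FASTA headers, and that the genome fits in memory. It has not been tested against gimme motifs itself.

```python
def read_fasta(path):
    """Load a FASTA file into a dict of {sequence name: sequence}.

    The whole genome is held in memory, which is fine for a sketch
    but may be heavy for large genomes.
    """
    seqs, name, chunks = {}, None, []
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    seqs[name] = "".join(chunks)
                # Keep only the first word of the header as the name
                header = line[1:].split()
                name, chunks = (header[0] if header else ""), []
            elif line:
                chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def bed_to_fasta(bed_path, genome_path, out_path):
    """Write one FASTA record per BED interval; return the record count."""
    genome = read_fasta(genome_path)
    n = 0
    with open(bed_path) as bed, open(out_path, "w") as out:
        for line in bed:
            fields = line.split()
            # Skip blank lines, comments and track/browser header lines
            if (
                len(fields) < 3
                or line.startswith("#")
                or fields[0] in ("track", "browser")
            ):
                continue
            chrom, start, end = fields[0], int(fields[1]), int(fields[2])
            # BED intervals are 0-based, half-open: a plain Python slice
            seq = genome[chrom][start:end]
            out.write(f">{chrom}:{start}-{end}\n{seq}\n")
            n += 1
    return n
```

For example, bed_to_fasta("202110XX_peaks_goodformat.bed", "Smed.fasta", "background.fa") would write the background sequences to background.fa, which could then be given to gimme motifs with -b background.fa in place of the BED file. Note that this does not resize the regions, so the background sequences keep the original peak widths.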