eead-csic-compbio / coexpression_motif_discovery

Discovering cis-regulatory motifs in proximal promoters of plants in RSAT::Plants
GNU General Public License v3.0

Makefile not able to run in one go #1

Open Citugulia40 opened 3 years ago

Citugulia40 commented 3 years ago

Hi, I am using the Docker image for the analysis, but the Makefile does not run in one go: it stops after the first step.

rsat_user@cde778d90cbd:~/rsat_results$ make -f peak-motifs.mk
Creating the result directories
Retrieving 4 upstream sequences boundaries from the genes of module M11
~/rsat_results/upstream1/regulonM11_up1.rm.fna
~/rsat_results/upstream2/regulonM11_up2.rm.fna
~/rsat_results/upstream3/regulonM11_up3.rm.fna
~/rsat_results/upstream4/regulonM11_up4.rm.fna
Retrieving 4 upstream sequences boundaries from all Prunus persica genome

najlaksouri commented 3 years ago

Dear Citu, to run the makefile in one go you need to specify the target all. That is, use the following command: $ make -f peak-motifs.mk all
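The behaviour described above follows from how make picks its goal: when no target is named, it builds only the first target in the file. A minimal sketch with a toy makefile illustrates this. The file toy.mk and its step1/step2 targets are hypothetical and are not taken from peak-motifs.mk.

```shell
# Toy makefile (hypothetical, NOT the project's peak-motifs.mk):
# two steps plus an "all" target that depends on both.
workdir=$(mktemp -d)
printf 'step1:\n\t@echo "step 1"\n\nstep2: step1\n\t@echo "step 2"\n\nall: step1 step2\n\n.PHONY: all step1 step2\n' > "$workdir/toy.mk"

# No target given: make builds only the first target in the file (step1).
default_out=$(make -s -f "$workdir/toy.mk")

# Explicit target: make builds "all" and everything it depends on.
all_out=$(make -s -f "$workdir/toy.mk" all)

printf 'without a target:\n%s\n' "$default_out"
printf 'with "all":\n%s\n' "$all_out"
rm -rf "$workdir"
```

If the first rule in peak-motifs.mk is the directory and sequence retrieval step rather than all, that would explain why the run stopped there.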

Citugulia40 commented 3 years ago

Dear Najla, thanks for your kind response. I can now run the makefile in one go, but I am getting an error from the peak-motifs command. Please let me know if I am doing anything wrong, or how I can resolve this.

rsat_user@cde778d90cbd:~/rsat_results$ make -f peak-motifs.mk all
Creating the result directories
Retrieving 4 upstream sequences boundaries from the genes of module M11
~/rsat_results/upstream1/regulonM11_up1.rm.fna
~/rsat_results/upstream2/regulonM11_up2.rm.fna
~/rsat_results/upstream3/regulonM11_up3.rm.fna
~/rsat_results/upstream4/regulonM11_up4.rm.fna
Retrieving 4 upstream sequences boundaries from all Prunus persica genome
Creating 2 random clusters
Replicate 1
Replicate 2
retrieve the different upstream sequence lengths from the random clusters
Replicate 1
Replicate 2
Running peak-motifs for M11 within upstream1
sequence length: 392375, number of masked symbols: 50808 (12.95 percent of the sequences)
sh: 1: Syntax error: word unexpected (expecting ")")
Error OpenInputFile: File /home/rsat_user/rsat_results/upstream1/regulonM11.rm.fna.peaks-rm/results/composition/peaks_test_freq-1str-ovlp_1nt.tab does not exist.
Error OpenInputFile: File /home/rsat_user/rsat_results/upstream1/regulonM11.rm.fna.peaks-rm/results/composition/peaks_test_freq-1str-ovlp_1nt.tab does not exist.
sh: 1: Syntax error: word unexpected (expecting ")")
Error

Thanks in advance

najlaksouri commented 3 years ago

Hi Citu, are you running the analysis on a Mac? On Linux I am not getting this error. Could you please confirm that, and we will try to figure out the problem.

Thanks

Citugulia40 commented 3 years ago

Hi Najla, I am running this on CentOS 7 Linux.

Thanks

brunocontrerasmoreira commented 3 years ago

Thanks @Citugulia40, I have tested the container with Docker version 20.10.6, build 370c289, on two hosts:
i) Ubuntu 18.04 -> works fine
ii) macOS Big Sur 11.31.1 -> I get the same errors you report

This is apparently related to the way arguments to scripts are handled by the shell within the container. We will investigate and produce a new container, but this will take us some time. Our only suggestion for the moment is to run it on Ubuntu. Hope this helps, Bruno

brunocontrerasmoreira commented 3 years ago

In your running container, if you type ps, which shell is running?
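One way to answer this from inside the container is sketched below. It reports the shell of the current session and, separately, what /bin/sh resolves to. The second check is an assumption about where to look: the "sh: 1: Syntax error" prefix in the log suggests the failing command was run through sh rather than through the interactive shell, and sh is not necessarily bash.

```shell
# Shell of the current session; falls back to "unknown" if ps is not
# installed or does not support these options.
current_shell=$(ps -p $$ -o comm= 2>/dev/null || echo unknown)

# What /bin/sh actually points to (e.g. bash or dash).
# readlink -f is the GNU coreutils form available in Linux containers.
sh_target=$(readlink -f /bin/sh)

echo "current shell: $current_shell"
echo "/bin/sh resolves to: $sh_target"
```

If the two differ, commands that a script hands to sh may be parsed differently from the same commands typed at the prompt.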

Citugulia40 commented 3 years ago

Thank you so much. I will try it on Ubuntu. My shell is bash.

Citugulia40 commented 3 years ago

I have another question regarding de novo motif discovery using RSAT peak-motifs. I have a set of co-expressed genes and I have used your pipeline to discover de novo motifs in my genes of interest, but I am not getting any significant motif in my gene list. Can you please recommend any parameters that I could change to obtain statistically significant motifs in my set of genes?

najlaksouri commented 3 years ago

Hi Citu, if I'm not mistaken, you are comparing the significance of the motifs identified in your genes of interest with that of the negative control clusters.

Citugulia40 commented 3 years ago

Thanks.

  1. When I run RSAT peak-motifs, the highest significance I get for my genes of interest is "k-mer sig= 2.19; evalue=0.0065". When I compare it with the negative clusters, this significance falls within the range of the negative clusters (not very high).
  2. I have 519 genes in my list.
  3. I don't have any experimentally verified motif for my set of genes.
  4. I have tried the same settings that you mentioned in your paper. Yes, you are right, I will try the promoter region from -250bp to +100bp.
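
The comparison in point 1 can be made explicit by counting how many negative-control significance values reach the observed one. The sketch below assumes the control values have been collected by hand into a plain text file with one value per line. That file and the helper name rank_against_controls are hypothetical and are not produced by the pipeline.

```shell
# Hypothetical helper: count how many control significance values are
# greater than or equal to the observed one.
# Usage: rank_against_controls OBSERVED_SIG CONTROL_FILE
#   CONTROL_FILE holds one significance value per line; blank lines and
#   lines starting with '#' are skipped.
rank_against_controls() {
    observed=$1
    control_file=$2
    awk -v obs="$observed" '
        NF && $1 !~ /^#/ { n++; if ($1 + 0 >= obs + 0) ge++ }
        END { printf "%d of %d control values >= %s\n", ge, n, obs }
    ' "$control_file"
}

# Example call, with the observed value from point 1 and a control file
# you have prepared yourself:
#   rank_against_controls 2.19 control_sig.txt
```

A motif stands out only when few or none of the control values reach the observed significance.
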
Citugulia40 commented 3 years ago

Hi, I have also tried the -250bp to +100bp region, but I am still not getting any significant motif in my set of genes. Can you suggest any parameter changes that would help me obtain significant motifs?

Thanks in advance

brunocontrerasmoreira commented 3 years ago

Good morning @Citugulia40, at this point I have 2 suggestions:
  1. Find yourself a good positive control, which can be a regulon/group of promoters known in the literature to be bound by the same transcription factor. This would be handy to validate the protocol and optimize it in your setting.
  2. Refine your cluster by using additional expression/GO data to split it into smaller clusters.

Please let us know how that goes, Bruno