Closed martynakgajos closed 4 weeks ago
Thank you for reporting this issue. We are going to update the installation documentation, as it seems to contain an error that prevents the matrix-clustering tool from working.
On 5 Feb 2020, at 15:21, Martyna Gajos notifications@github.com wrote:
Hi,
I am trying to run matrix-clustering, but I cannot figure out what is wrong with my input.
I also had some problems during installation, but the manual says that this can happen and that the software should still work (btw, I have also tried the conda package and it hasn't worked for me yet).
I am using the following command:
matrix-clustering -matrix PP inputfile.meme meme -o outputdir
I get the following errors:
rsync: change_dir "/home/gajos/Programs/rsat/public_html/images/program_icons" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1052) [sender=3.0.9]
; WARNING Matrix file /project/motif.meme does not contain any matrix in meme format. Please check format.
{... more errors as the result of this one}
Error in library(pkg, warn.conflicts = FALSE, character.only = TRUE, lib.loc = c(dir.rsat.rlib, :
  there is no package called ‘amap’
Calls: suppressPackageStartupMessages -> withCallingHandlers -> library
Execution halted
Error OpenInputFile: File _tables/clusters.tab does not exist.
I can attach the input file I am using. I think it is properly formatted, but maybe I am wrong.
Could you help me solve it or point me to a tutorial that I should read?
Dear Martyna
Thank you for the feedback, and sorry for the problems you faced with the installation. I think the problem came from a failure to install R 3.6, which resulted in the use of an older version (3.4) in which some libraries were missing.
I fixed the problem in rsat-code but not yet in the conda package.
Could you please check whether the installation works properly on your side?
In principle, you should be able to do it in the following way:
git clone https://github.com/rsa-tools/rsat-code.git
rsync -ruptvl rsat-code/installer $RSAT/
cd $RSAT
sudo bash
source RSAT_config.bashrc
export MY_OS=ubuntu
source RSAT_config.bashrc && \
    bash installer/07_R-and-packages.bash
Best regards,
Jacques
Aix-Marseille Université (AMU). Lab. Theory and Approaches of Genomic Complexity (TAGC) INSERM Unit UMR_S 1090, 163, Avenue de Luminy, 13288 MARSEILLE cedex 09. France Office: INSERM building, block 6 Fax: +33 4 91 82 87 01
Jacques.van-Helden@univ-amu.fr https://orcid.org/0000-0002-8799-8584
Thank you for a fast answer. Unfortunately, the error still occurs.
1) I believe there might be something wrong with the installation of the Perl packages, since the error already appears while the matrix file I provide is being read, which is done by .../rsat/perl-scripts/convert-matrix. The problem might also occur because my .meme file is malformatted. Here is an example of what my MEME files look like:
MEME version 4
ALPHABET= ACGT
Background letter frequencies
A 0.263173 C 0.23813 G 0.224177 T 0.27452
MOTIF CACGCCAA
letter-probability matrix: alength= 4 w= 8 nsites= 909 bg_prob= 0.00001528 opt_bg_order= 2 log(Pval)= -7.49645185
0.00000018 0.99999982 0.00000001 0.00000001
0.99999988 0.00000013 0.00000001 0.00000001
0.01502429 0.98493105 0.00000001 0.00004464
0.00000002 0.00002313 0.99997681 0.00000008
0.00000001 0.99999809 0.00000194 0.00000001
0.00000024 0.90720397 0.00000001 0.09279580
0.99988377 0.00000149 0.00002128 0.00009340
0.99999881 0.00000065 0.00000056 0.00000001
MOTIF AAGGCGGC
letter-probability matrix: alength= 4 w= 8 nsites= 789 bg_prob= 0.00001728 opt_bg_order= 2 log(Pval)= -6.42772436
0.95745504 0.00000011 0.00957859 0.03296627
0.99978441 0.00021301 0.00000255 0.00000001
0.00000009 0.01020189 0.98568732 0.00411074
0.00000015 0.00000112 0.99990678 0.00009197
0.00070567 0.99928975 0.00000441 0.00000017
0.00794884 0.00096019 0.99108815 0.00000275
0.00000001 0.00078741 0.94543123 0.05378127
0.00000002 0.99579847 0.00420156 0.00000003
2) I tried to work around the problem by converting the matrices to the desired format myself and running .../rsat/R-scripts/matrix-clustering.R directly on them. However, I have trouble understanding the log file that I am getting (when trying to run the convert-matrix step). After the matrices are split into separate files with
.../rsat/perl-scripts/convert-matrix -i input_path.tf -split -from tf -to tf -o output_directory
the next step is just running .../rsat/R-scripts/matrix-clustering.R. That script requires pairwise_compa.tab, which has never been created. Should I prepare the comparison table using compare-matrices first?
Hi
I found the problem.
Here is an example of MEME-ChIP output motifs:
http://meme-suite.org/doc/examples/memechip_example_output_files/combined.meme
In your motifs, a parameter (E) that convert-matrix requires in order to convert the formats is missing.
Remove this from your header:
bg_prob= 0.00001528 opt_bg_order= 2
And replace
log(Pval)= -7.49645185
by
E= -7.49645185
I used the following input in convert-matrix and it works:
MEME version 4
ALPHABET= ACGT
Background letter frequencies
A 0.263173 C 0.23813 G 0.224177 T 0.27452
MOTIF CACGCCAA
letter-probability matrix: alength= 4 w= 8 nsites= 909 E= -7.49645185
0.00000018 0.99999982 0.00000001 0.00000001
0.99999988 0.00000013 0.00000001 0.00000001
0.01502429 0.98493105 0.00000001 0.00004464
0.00000002 0.00002313 0.99997681 0.00000008
0.00000001 0.99999809 0.00000194 0.00000001
0.00000024 0.90720397 0.00000001 0.09279580
0.99988377 0.00000149 0.00002128 0.00009340
0.99999881 0.00000065 0.00000056 0.00000001
MOTIF AAGGCGGC
letter-probability matrix: alength= 4 w= 8 nsites= 789 E= -6.42772436
0.95745504 0.00000011 0.00957859 0.03296627
0.99978441 0.00021301 0.00000255 0.00000001
0.00000009 0.01020189 0.98568732 0.00411074
0.00000015 0.00000112 0.99990678 0.00009197
0.00070567 0.99928975 0.00000441 0.00000017
0.00794884 0.00096019 0.99108815 0.00000275
0.00000001 0.00078741 0.94543123 0.05378127
0.00000002 0.99579847 0.00420156 0.00000003
I will check whether the MEME motifs in the current version of MEME (5.1) have a new header, and update convert-matrix if necessary.
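The header edit described above (drop bg_prob and opt_bg_order, rename log(Pval)= to E=) can be applied to a whole file with a few lines of Python. This is only a sketch under assumptions: the function names fix_meme_header and fix_meme_text are hypothetical, the script is not part of RSAT, and it only touches lines that start with "letter-probability matrix:". The value after log(Pval)= is copied unchanged, exactly as in the manual fix.

```python
import re


def fix_meme_header(line: str) -> str:
    """Rewrite one 'letter-probability matrix:' line as suggested in the thread.

    Hypothetical helper, not part of RSAT or the MEME Suite.
    """
    if not line.lstrip().startswith("letter-probability matrix:"):
        return line  # matrix rows and other header lines are left untouched
    # Drop the two fields that the reply says to remove.
    line = re.sub(r"\s*bg_prob=\s*\S+", "", line)
    line = re.sub(r"\s*opt_bg_order=\s*\S+", "", line)
    # Rename log(Pval)= to E=, keeping the numeric value as it is.
    return line.replace("log(Pval)=", "E=")


def fix_meme_text(text: str) -> str:
    """Apply fix_meme_header to every line of a MEME file's content."""
    return "\n".join(fix_meme_header(line) for line in text.splitlines()) + "\n"


example = (
    "letter-probability matrix: alength= 4 w= 8 nsites= 909 "
    "bg_prob= 0.00001528 opt_bg_order= 2 log(Pval)= -7.49645185"
)
print(fix_meme_header(example))
```

To convert a file, read it, pass its content through fix_meme_text, and write the result to a new file, so that the original stays available for comparison.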
Great, now I am clearly on the right path, as the error I get now comes from an R package:
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
Error in library(pkg, warn.conflicts = FALSE, character.only = TRUE, lib.loc = c(dir.rsat.rlib, :
there is no package called ‘TFBMclust’
Calls: suppressPackageStartupMessages -> withCallingHandlers -> library
Execution halted
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
I have replaced the lines:
for (package in required.packages.rsat) {
  message("Installing RSAT package ", package, " in folder ", install.dir)
  install.packages(pkgs=file.path(dir.rsat.rscripts, package), repos=NULL, lib=install.dir, type="source")
}
with the following (the same loop without the lib=install.dir argument):
for (package in required.packages.rsat) {
  message("Installing RSAT package ", package, " in folder ", install.dir)
  install.packages(pkgs=file.path(dir.rsat.rscripts, package), repos=NULL, type="source")
}
and it works now. However, out of 111 motifs, I am getting one cluster. Is it possible to add some parameters to increase the number of clusters?
Yes.
These are the default parameters, which maximize the grouping of motifs from the same TF family:
-hclust_method average -calc sum -metric_build_tree Ncor -lth w 5 -lth cor 0.6 -lth Ncor 0.4
If you want something more stringent, try the following:
-lth cor 0.75 -lth Ncor 0.55
or
-lth cor 0.8 -lth Ncor 0.6
You can find more info in the paper: https://doi.org/10.1093/nar/gkx314
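To compare the threshold settings above, it can help to generate the full command line rather than retype it. The sketch below only assembles and prints a command; it does not run RSAT. Every flag comes from this thread, but combining them into one call, the function name build_command, and the placeholder file names are assumptions, so check the result against the matrix-clustering help before using it.

```python
import shlex


def build_command(matrix_file, out_dir, cor=0.6, ncor=0.4,
                  title="PP", matrix_format="meme"):
    """Assemble a matrix-clustering call from the options quoted in this thread.

    Hypothetical helper: the defaults mirror the default parameters listed
    above; cor and ncor are the lower thresholds passed with -lth.
    """
    return [
        "matrix-clustering",
        "-matrix", title, matrix_file, matrix_format,
        "-o", out_dir,
        "-hclust_method", "average",
        "-calc", "sum",
        "-metric_build_tree", "Ncor",
        "-lth", "w", "5",
        "-lth", "cor", str(cor),
        "-lth", "Ncor", str(ncor),
    ]


# First of the two more stringent settings suggested above.
cmd = build_command("inputfile.meme", "outputdir", cor=0.75, ncor=0.55)
print(shlex.join(cmd))
```

Building the command as a list keeps each argument separate, so the same list could be passed to subprocess.run without shell quoting problems.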
Hi,
how is the conda release going? ;) I am trying to cluster motifs from the RNA-binding motif database ATtRACT (https://attract.cnic.es/). I have downloaded the database and converted the PFMs to MEME format (as I have already managed to run the clustering for MEME input). Here is what the beginning of the file looks like:
MEME version 4
ALPHABET= ACGU
Background letter frequencies
A 0.25 C 0.25 G 0.25 U 0.25
MOTIF 904 904
letter-probability matrix: alength= 4 w= 5 nsites= 100000 E= -5.0
0.00961538461538 0.00961538461538 0.00961538461538 0.971153846154
0.00961538461538 0.00961538461538 0.971153846154 0.00961538461538
0.00961538461538 0.00961538461538 0.971153846154 0.00961538461538
0.00961538461538 0.00961538461538 0.971153846154 0.00961538461538
0.971153846154 0.00961538461538 0.00961538461538 0.00961538461538
MOTIF s36 s36
letter-probability matrix: alength= 4 w= 7 nsites= 100000 E= -5.0
0.844325153374 0.000766871165644 0.154141104294 0.000766871165644
0.0774539877301 0.690950920245 0.0774539877301 0.154141104294
0.000766871165644 0.0774539877301 0.154141104294 0.76763803681
0.76763803681 0.0774539877301 0.000766871165644 0.154141104294
0.0774539877301 0.0774539877301 0.76763803681 0.0774539877301
0.230828220859 0.614263803681 0.0774539877301 0.0774539877301
0.460889570552 0.230828220859 0.154141104294 0.154141104294
(I have added dummies for the number of sites and E-values, to check if it works, as I haven't found the values in the database.) I get the following message:
/home/gajos/Programs/rsat/perl-scripts/convert-matrix -i output/meme.meme -from meme -to tf -o output/_data/RNA_input_motifs_processed_1.tf
/home/gajos/Programs/rsat/bin/rsat:65: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  ref = yaml.load(open(path + '/rsat.yaml').read())#, Loader=yaml.FullLoader)
sh: ./R: Is a directory
Error OpenInputFile: File output/_tables/clusters.tab does not exist.
What got created are:
tf files for every motif (in output/_data/), pairwise_compa.tab and pairwise_compa_matrix_descriptions.tab (in output/_table/).
The file output/_tables/clusters.tab indeed does not exist. What might be the problem in this case?
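For readers who want to reproduce the input described in this post, here is a rough sketch of how a probability matrix can be written in the same minimal MEME layout, with the same dummy values for nsites and E. The function name to_meme and its parameters are hypothetical, and whether convert-matrix accepts the ACGU alphabet is not confirmed anywhere in this thread.

```python
def to_meme(motifs, alphabet="ACGU", nsites=100000, evalue=-5.0):
    """Format motifs as minimal MEME text with dummy nsites and E values.

    motifs maps a motif name to a list of rows, one row per position,
    each row holding one probability per letter of the alphabet.
    Hypothetical helper; the dummy values are the ones used in the post.
    """
    background = 1.0 / len(alphabet)  # uniform background, as in the post
    lines = [
        "MEME version 4",
        "ALPHABET= " + alphabet,
        "Background letter frequencies",
        " ".join(f"{letter} {background:g}" for letter in alphabet),
    ]
    for name, rows in motifs.items():
        for row in rows:
            if len(row) != len(alphabet):
                raise ValueError(f"motif {name}: expected {len(alphabet)} columns")
            if abs(sum(row) - 1.0) > 1e-6:
                raise ValueError(f"motif {name}: row does not sum to 1")
        lines.append(f"MOTIF {name} {name}")
        lines.append(
            f"letter-probability matrix: alength= {len(alphabet)} "
            f"w= {len(rows)} nsites= {nsites} E= {evalue}"
        )
        lines.extend(" ".join(str(p) for p in row) for row in rows)
    return "\n".join(lines) + "\n"


demo = to_meme({"904": [[0.1, 0.1, 0.1, 0.7], [0.7, 0.1, 0.1, 0.1]]})
print(demo)
```

The row checks catch the most common conversion mistakes (a wrong column count, or counts written where probabilities are expected) before the file reaches convert-matrix.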
Dear Martyna,
I will let Jacques answer regarding the conda issue.
Meanwhile, the ATtRACT database has already been included in RSAT and is ready to use: https://rsat01.biologie.ens.fr/rsat/motif_databases/ATtRACT/ATtRACT_2017_12.tf
Kind regards,
Morgane
Hi Morgane,
thank you for the answer. My problem still exists, even if I use the TRANSFAC file (https://rsat01.biologie.ens.fr/rsat/motif_databases/ATtRACT/ATtRACT_2017_12.tf).
For @martynakgajos and others facing similar problems: the conda version of RSAT is outdated. The Docker container, by contrast, is updated regularly and can be run as explained at https://rsa-tools.github.io/installing-RSAT/RSAT-Docker/RSAT-Docker-tuto.html