eggzilla / RNAlien

RNAlien - unsupervised RNA family model construction
http://rna.tbi.univie.ac.at/rnalien/
GNU General Public License v3.0

ClustalW problem? ERROR: You need at least two sequences in the alignment. #10

Closed by riasc 7 years ago

riasc commented 7 years ago

Hi Florian,

Unfortunately, I have to bug you again. I just executed the following command:

>RNAlien -i input.fasta -b /data/db/blast/nt -c 20 -t 562 -o . -d construction1

with the following input file:

>AARQ02000011.1/391-585
AAUUGAAUAGAAGCGCCAGAACUGAUUGGGACGAAAAUGCUUGAAGGUGAAAUCCCUGAA
AAGUAUCGAUCAGUUGACGAGGAGGAGAUUAAUCGAAGUUUCGGCGGGAGUCUCCCGGCU
GUGCAUGCAGUCGUUAAGUCUUACUUACAAAUCAUUUGGGUGACCAAGUGGACAGAGUAG
UAAUGAAACAUGCUU

However, I'm getting the following error:

ERROR: You need at least two sequences in the alignment.

Skipping alignment. There must be at least three sequences in the alignment.

I guess this is a problem with the alignment by Clustal Omega? I'm using Clustal Omega 1.2.4.

eggzilla commented 7 years ago

Hey Richard, thanks for reporting and apologies for the bug :-) I will look into it and get back to you.

riasc commented 7 years ago

Have you found the time to look into that problem? I tried some other sequences, and some of them produced the same error. However, when I ran the same command without specifying the BLAST database, I got this:

RNAlien -i input.fasta  -c 20 -t 562 -o . -d construction3
2 sequences; length of alignment 196.
2 sequences; length of alignment 198.
2 sequences; length of alignment 197.
2 sequences; length of alignment 197.
2 sequences; length of alignment 198.
(-50.4,-50.4,0.989795918367347)
(-52.5,-52.45,0.9137055837563451)
(-47.95,-48.9,0.9289340101522842)
(-48.05,-49.599999999999994,0.9390862944162437)
(-52.15,-51.25,0.8939393939393939)
RNAlien: Either.Unwrap.fromRight: Argument takes form 'Left _'
CallStack (from HasCallStack):
  error, called at ./Data/Either/Unwrap.hs:52:23 in either-unwrap-1.1-2Lqm42fOpVz7XVLWzETZKJ:Data.Either.Unwrap

Doesn't RNAlien retrieve the BLAST hits by itself if no database is specified in the command? Thanks

eggzilla commented 7 years ago

Hey, sorry for the delay. The problem was a change in the mlocarna output: the resulting clustal alignment with consensus secondary structure is now written in a multi-line format. I modified my parsing library and updated RNAlien accordingly in the new version 1.3.1.

RNAlien uses the NCBI REST interface to perform its BLAST searches, which means you can specify the requested BLAST database via the -b command-line switch (e.g. NT, refseq_genomic, ...), but not a file path to a local BLAST database. If you do not specify a database, it defaults to the nt database. You can run RNAlien --help to see the default values.

However, while testing the new patch I observed some erratic behavior in the BLAST requests, and I need more time to investigate that.
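To make the -b point concrete, here is a minimal sketch of a pre-flight check. The helper name check_blast_db is hypothetical and not part of RNAlien. It simply treats any value containing a slash as a local path, which is an assumption about how such paths usually look.

```shell
#!/bin/sh
# Hypothetical helper (not part of RNAlien): classify the value intended for -b.
# A value containing "/" is treated as a local file path, which RNAlien does not
# accept because it queries NCBI's remote BLAST service by database name.
check_blast_db() {
    case "$1" in
        */*) echo "path"; return 1 ;;
        *)   echo "name"; return 0 ;;
    esac
}

# Database name, as -b expects:
#   RNAlien -i input.fasta -b nt -c 20 -t 562 -o . -d construction1
# Local path, which is not supported:
#   RNAlien -i input.fasta -b /data/db/blast/nt -c 20 -t 562 -o . -d construction1
```

With this in place, check_blast_db nt succeeds, while check_blast_db /data/db/blast/nt returns a non-zero status.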

riasc commented 7 years ago

Hi Florian, thanks for the reply. For now it is probably best to run RNAlien with the tool versions from the benchmarks (Clustalo version: 1.2.4, mlocarna version: LocARNA 1.9.0, RNAfold version: RNAfold 2.3.1, Infernal version: # INFERNAL 1.1.2 (July 2016)). What is the latest RNAlien version that works with these? 1.2.9? I would then create a Docker container so that I can at least try it once.

eggzilla commented 7 years ago

I am just updating the bioconda recipe for RNAlien; then you can use the corresponding biocontainer. It should be done today and will contain Locarna 1.9.1, ViennaRNA 2.3.3, INFERNAL 1.1.2 and RNAlien 1.3.2. I will write again when it's done.

Concerning the dependencies:

Locarna 1.9.1, ViennaRNA 2.3.3 and INFERNAL 1.1.2 should work together with RNAlien >=1.3.1
Locarna <=1.8.11, ViennaRNA >= 2.2.9 and INFERNAL 1.1.2 should work with RNAlien >=1.3.0
Locarna <=1.8.11, ViennaRNA <= 2.2.9 and INFERNAL 1.1.2 should work with all RNAlien versions
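To compare an installation against these combinations, one could print the versions of the tools on the PATH. This is only a sketch: report_version is a hypothetical helper, and the flags shown in the comments (--version for clustalo, mlocarna and RNAfold, -h for cmbuild, whose banner includes the INFERNAL version) are assumptions about the installed tools.

```shell
#!/bin/sh
# Hypothetical helper: print the first two lines of a tool's version output,
# or a "not found" note if the tool is not on the PATH.
report_version() {
    tool="$1"
    shift
    if command -v "$tool" >/dev/null 2>&1; then
        "$tool" "$@" 2>&1 | head -n 2
    else
        echo "$tool: not found"
    fi
}

# Example calls (flags are assumptions, see above):
#   report_version clustalo --version
#   report_version mlocarna --version
#   report_version RNAfold --version
#   report_version cmbuild -h
```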

eggzilla commented 7 years ago

RNAlien is now available as a Docker container with all dependencies via biocontainers :-)

Link to repo: https://quay.io/repository/biocontainers/rnalien

Docker commands:

docker pull quay.io/biocontainers/rnalien:1.3.4--pl5.22.0_0
docker run -i -t quay.io/biocontainers/rnalien:1.3.4--pl5.22.0_0 bash

If it works for you now I will close the issue.

riasc commented 7 years ago

Hi, I tried to run the container but keep getting an HttpExceptionRequest. Maybe it has to do with my setup? Exposing the ports does not help. Do you have any idea? Thanks

RNAlien -c 20 -o /home/slott/RNAlien_docker_iterator/ -b nt -i /home/slott/RNAlien_docker_iterator/tmp_in.fasta -d RF02438 -t 1035187

RNAlien: HttpExceptionRequest Request {
  host                 = "www.ncbi.nlm.nih.gov"
  port                 = 443
  secure               = True
  requestHeaders       = []
  path                 = "/"
  queryString          = ""
  method               = "GET"
  proxy                = Nothing
  rawBody              = False
  redirectCount        = 10
  responseTimeout      = ResponseTimeoutDefault
  requestVersion       = HTTP/1.1
}
 (InternalException (HandshakeFailed (Error_Protocol ("certificate has unknown CA",True,UnknownCa))))

eggzilla commented 7 years ago

Hi, sorry for the inconvenience. NCBI services can now only be reached via https, so the certificate is checked on connection. This error message means that the trusted certificates of your operating-system installation do not include the certificate authority that signed NCBI's certificate. It should probably be enough to update the certificates via the package manager, e.g. on Ubuntu/Debian:

sudo apt-get --only-upgrade install ca-certificates
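Before upgrading, it can help to check whether a trust store is present at all. The following is a rough sketch with a hypothetical helper, first_ca_store. The candidate locations in the example call are commonly used ones, but they are assumptions and vary between distributions.

```shell
#!/bin/sh
# Hypothetical diagnostic: print the first candidate trust-store location that
# exists and is non-empty (directory with entries, or non-empty file).
# Returns non-zero if none of the candidates qualifies.
first_ca_store() {
    for candidate in "$@"; do
        if [ -d "$candidate" ] && [ -n "$(ls -A "$candidate" 2>/dev/null)" ]; then
            echo "$candidate"
            return 0
        elif [ -f "$candidate" ] && [ -s "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    echo "no CA certificates found" >&2
    return 1
}

# Example call with assumed, distribution-dependent locations:
#   first_ca_store /etc/ssl/certs /etc/ssl/cert.pem
```

If the function reports that nothing was found, the certificates are missing rather than merely outdated, and an upgrade alone would not be expected to help.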

eggzilla commented 7 years ago

And what I forgot to mention: you could try the biocontainer for RNAlien :-) https://quay.io/repository/biocontainers/rnalien

riasc commented 7 years ago

Thanks for the reply. Actually, the problem occurs when I'm inside the RNAlien biocontainer. I can connect to NCBI from my host, just not from inside the container. I'm not familiar with buildroot, but installing packages (e.g., openssl) seems to take a bit more effort. So the problem is presumably the missing certificate in the container?

eggzilla commented 7 years ago

Mhm, ok, if you already use the RNAlien container and you get this message, then something with the container is broken :-( I will check where the problem is.

riasc commented 7 years ago

Thanks. Have you figured out the problem? I wanted to call RNAlien directly via docker exec. In the meantime, I created a container where RNAlien runs quite well, although there is some overhead (https://hub.docker.com/r/riasc/rnalien-workbench/). Thanks for your effort.

eggzilla commented 7 years ago

Hi, yeah, I did not explicitly request the installation of the certificates in the bioconda recipe, so they were missing from the busybox image that biocontainers uses. The problem now is that in the solutions I tested, I have to provide a wrapper script that replaces the RNAlien executable and sets an environment variable pointing to the non-standard certificate folder. I am still looking for a more elegant solution. Thanks for building a container, and sorry that my solution was not working.
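For illustration only, such a wrapper might look roughly like the sketch below. Every specific in it is an assumption rather than something taken from the actual recipe: the renamed binary RNAlien.real, the certificate folder /usr/local/ssl/certs, the override variables RNALIEN_REAL_BIN and RNALIEN_CERT_DIR, and the variable name SYSTEM_CERTIFICATE_PATH that the TLS stack is assumed to read.

```shell
#!/bin/sh
# Sketch of a wrapper that would be installed under the name of the RNAlien
# executable. All names and paths below are assumptions for illustration.
REAL_BIN="${RNALIEN_REAL_BIN:-RNAlien.real}"          # assumed name of the real binary
CERT_DIR="${RNALIEN_CERT_DIR:-/usr/local/ssl/certs}"  # assumed certificate folder

# Run the real binary with the certificate location set only for that process,
# passing all command-line arguments through unchanged.
run_with_certs() {
    SYSTEM_CERTIFICATE_PATH="$CERT_DIR" "$REAL_BIN" "$@"
}

# As a standalone wrapper script, the last line would be:
#   run_with_certs "$@"
```

Setting the variable only for the child process keeps the rest of the container environment untouched, which is one reason a wrapper is a workable stopgap even if it is not elegant.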