soedinglab / hh-suite

Remote protein homology detection suite.
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-3019-7
GNU General Public License v3.0

test result variance on a big endian (s390x) system #222

Open mr-c opened 4 years ago

mr-c commented 4 years ago

Expected Behavior

Tests pass.

Current Behavior

- 14:56:11.112 INFO: Search results will be written to query.hhr

- 14:56:11.167 INFO: query.a3m is in A2M, A3M or FASTA format

- 14:56:11.588 INFO: Alternative alignment: 0

- 14:56:11.643 INFO: 1 alignments done

- 14:56:11.655 INFO: Realigning 1 HMM-HMM alignments using Maximum Accuracy algorithm

Query         sp|Q5VUD6|FA69B_HUMAN Protein FAM69B OS=Homo sapiens GN=FAM69B PE=2 SV=3
Match_columns 431
No_of_seqs    49 out of 118
Neff          4.29223
Searched_HMMs 1
Date          Wed Sep  2 14:56:11 2020
Command       hhalign -i query.a3m -t query.a3m 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 sp|Q5VUD6|FA69B_HUMAN Protein  100.0  9E-168  9E-168 1246.9   0.0  431    1-431     1-431 (431)

query.a3m   0   32127   464 0
query.a3m   0   32666   2   0
query.a3m   0   32666   86  0
Reading context library for pseudocounts from internal ...
Reading abstract state alphabet from internal ...
Processing entry: query.a3m
Adding cs-pseudocounts (admix=0.30) ...
- 14:56:13.088 INFO: Search results will be written to query.hhr

- 14:56:13.089 INFO: Searching 1 column state sequences.

- 14:56:13.142 INFO: query.a3m is in A2M, A3M or FASTA format

- 14:56:13.166 INFO: Iteration 1

- 14:56:13.522 INFO: Prefiltering database

- 14:56:13.890 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment)  : 1

- 14:56:13.894 INFO: HMMs passed 2nd prefilter (gapped profile-profile alignment)   : 1

- 14:56:13.894 INFO: HMMs passed 2nd prefilter and not found in previous iterations : 1

- 14:56:13.894 INFO: Scoring 1 HMMs using HMM-HMM Viterbi alignment

- 14:56:13.924 INFO: Alternative alignment: 0

- 14:56:13.979 INFO: 1 alignments done

- 14:56:13.979 INFO: Alternative alignment: 1

- 14:56:14.036 INFO: 1 alignments done

- 14:56:14.036 INFO: Alternative alignment: 2

- 14:56:14.090 INFO: 1 alignments done

- 14:56:14.090 INFO: Alternative alignment: 3

- 14:56:14.146 INFO: 1 alignments done

- 14:56:15.359 INFO: Premerge done

- 14:56:15.359 INFO: Realigning 4 HMM-HMM alignments using Maximum Accuracy algorithm

- 14:56:15.412 INFO: 7 sequences belonging to 7 database HMMs found with an E-value < 0.001

Query         sp|Q5VUD6|FA69B_HUMAN Protein FAM69B OS=Homo sapiens GN=FAM69B PE=2 SV=3
Match_columns 431
No_of_seqs    59 out of 236
Neff          4.27431
Searched_HMMs 1
Date          Wed Sep  2 14:56:15 2020
Command       hhblits -i query.a3m -d single -blasttab blits_app_res -n 1 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 sp|Q5VUD6|FA69B_HUMAN Protein  100.0  1E-167  5E-169 1249.5   0.0   12  330-341   336-347 (431)
  2 sp|Q5VUD6|FA69B_HUMAN Protein  100.0  1E-167  5E-169 1249.5   0.0  431    1-431     1-431 (431)
  3 sp|Q5VUD6|FA69B_HUMAN Protein  100.0  1E-167  5E-169 1249.5   0.0    6  293-298   311-316 (431)
  4 sp|Q5VUD6|FA69B_HUMAN Protein  100.0  1E-167  5E-169 1249.5   0.0   13  336-348   330-342 (431)

- 14:56:15.432 INFO: Searching 1 column state sequences.

- 14:56:15.478 INFO: Thread 0   query.a3m

- 14:56:15.485 INFO: query.a3m is in A2M, A3M or FASTA format

- 14:56:15.510 INFO: Iteration 1

- 14:56:16.235 INFO: Prefiltering database

- 14:56:16.979 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment)  : 1

- 14:56:16.984 INFO: HMMs passed 2nd prefilter (gapped profile-profile alignment)   : 1

- 14:56:16.984 INFO: HMMs passed 2nd prefilter and not found in previous iterations : 1

- 14:56:16.984 INFO: Scoring 1 HMMs using HMM-HMM Viterbi alignment

- 14:56:16.997 INFO: Alternative alignment: 0

- 14:56:17.050 INFO: 1 alignments done

- 14:56:17.050 INFO: Alternative alignment: 1

- 14:56:17.107 INFO: 1 alignments done

- 14:56:17.107 INFO: Alternative alignment: 2

- 14:56:17.164 INFO: 1 alignments done

- 14:56:17.164 INFO: Alternative alignment: 3

- 14:56:17.221 INFO: 1 alignments done

- 14:56:19.608 INFO: Premerge done

- 14:56:19.608 INFO: Realigning 4 HMM-HMM alignments using Maximum Accuracy algorithm

- 14:56:19.663 INFO: 7 sequences belonging to 7 database HMMs found with an E-value < 0.001

- 14:56:19.684 INFO: Search results will be written to query.hhr

- 14:56:19.742 INFO: query.a3m is in A2M, A3M or FASTA format

- 14:56:19.769 INFO: Searching 1 database HHMs without prefiltering

- 14:56:19.769 INFO: Iteration 1

- 14:56:20.167 INFO: Scoring 1 HMMs using HMM-HMM Viterbi alignment

- 14:56:20.202 INFO: Alternative alignment: 0

- 14:56:20.265 INFO: 1 alignments done

- 14:56:20.265 INFO: Alternative alignment: 1

- 14:56:20.327 INFO: 1 alignments done

- 14:56:20.327 INFO: Alternative alignment: 2

- 14:56:20.389 INFO: 1 alignments done

- 14:56:20.390 INFO: Alternative alignment: 3

- 14:56:20.451 INFO: 1 alignments done

- 14:56:21.832 INFO: Premerge done

- 14:56:21.832 INFO: Realigning 4 HMM-HMM alignments using Maximum Accuracy algorithm

- 14:56:21.891 INFO: 7 sequences belonging to 7 database HMMs found with an E-value < 0.001

Query         sp|Q5VUD6|FA69B_HUMAN Protein FAM69B OS=Homo sapiens GN=FAM69B PE=2 SV=3
Match_columns 431
No_of_seqs    49 out of 236
Neff          4.29341
Searched_HMMs 1
Date          Wed Sep  2 14:56:21 2020
Command       hhsearch -i query.a3m -d single -blasttab search_app_res 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 sp|Q5VUD6|FA69B_HUMAN Protein  100.0  5E-169  5E-169 1249.5   0.0   11  331-341   337-347 (431)
  2 sp|Q5VUD6|FA69B_HUMAN Protein  100.0  5E-169  5E-169 1249.5   0.0  431    1-431     1-431 (431)
  3 sp|Q5VUD6|FA69B_HUMAN Protein  100.0  5E-169  5E-169 1249.5   0.0    8   46-53     49-56  (431)
  4 sp|Q5VUD6|FA69B_HUMAN Protein  100.0  5E-169  5E-169 1249.5   0.0   13  336-348   330-342 (431)

- 14:56:21.961 INFO: Thread 0   query.a3m

- 14:56:21.969 INFO: query.a3m is in A2M, A3M or FASTA format

- 14:56:21.993 INFO: Searching 1 database HHMs without prefiltering

- 14:56:21.993 INFO: Iteration 1

- 14:56:22.720 INFO: Scoring 1 HMMs using HMM-HMM Viterbi alignment

- 14:56:22.735 INFO: Alternative alignment: 0

- 14:56:22.783 INFO: 1 alignments done

- 14:56:22.783 INFO: Alternative alignment: 1

- 14:56:22.836 INFO: 1 alignments done

- 14:56:22.836 INFO: Alternative alignment: 2

- 14:56:22.889 INFO: 1 alignments done

- 14:56:22.889 INFO: Alternative alignment: 3

- 14:56:22.943 INFO: 1 alignments done

- 14:56:25.181 INFO: Premerge done

- 14:56:25.181 INFO: Realigning 4 HMM-HMM alignments using Maximum Accuracy algorithm

- 14:56:25.233 INFO: 7 sequences belonging to 7 database HMMs found with an E-value < 0.001

1c1
< sp|Q5VUD6|FA69B_HUMAN query   0.007   431 9   0   330 341 336 347 864.0
---
> sp|Q5VUD6|FA69B_HUMAN query   0.007   431 8   0   331 341 337 347 864.0
3c3
< sp|Q5VUD6|FA69B_HUMAN query   0.005   431 4   0   293 298 311 316 864.0
---
> sp|Q5VUD6|FA69B_HUMAN query   0.007   431 5   0   46  53  49  56  864.0
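
To make the diff above easier to read, here is a minimal sketch that compares the two line pairs column by column. The `<` lines appear to correspond to the hhblits hit table (330-341 / 336-347 and 293-298 / 311-316) and the `>` lines to the hhsearch hit table. Columns are identified by position only, since the meaning of each blasttab column is not stated here; the names `LEFT_LINES`, `RIGHT_LINES` and `differing_columns` are made up for this illustration.

```python
# Field-by-field comparison of the blasttab line pairs quoted in the diff.
# Columns are referred to by index only; no column meanings are assumed.

LEFT_LINES = [  # the "<" side of the diff
    "sp|Q5VUD6|FA69B_HUMAN query   0.007   431 9   0   330 341 336 347 864.0",
    "sp|Q5VUD6|FA69B_HUMAN query   0.005   431 4   0   293 298 311 316 864.0",
]
RIGHT_LINES = [  # the ">" side of the diff
    "sp|Q5VUD6|FA69B_HUMAN query   0.007   431 8   0   331 341 337 347 864.0",
    "sp|Q5VUD6|FA69B_HUMAN query   0.007   431 5   0   46  53  49  56  864.0",
]


def differing_columns(left, right):
    """Return (index, left_value, right_value) for every column that differs."""
    left_fields = left.split()
    right_fields = right.split()
    if len(left_fields) != len(right_fields):
        raise ValueError("lines have different column counts")
    return [
        (index, a, b)
        for index, (a, b) in enumerate(zip(left_fields, right_fields))
        if a != b
    ]


if __name__ == "__main__":
    for left, right in zip(LEFT_LINES, RIGHT_LINES):
        for index, a, b in differing_columns(left, right):
            print(f"column {index}: {a} != {b}")
        print("---")
```

For the first pair only the alignment boundaries and one count column differ by one; for the second pair the two tools report entirely different short alternative alignments.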

Steps to Reproduce (for bugs)

Build on an s390x system. If need be, I can construct a Dockerfile that cross-builds and uses QEMU for execution.
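
When reproducing under emulation, it may help to first confirm which byte order the environment actually reports. The following is only a sketch using the Python standard library; it does not diagnose the HH-suite discrepancy, it just shows how the same four bytes decode to different integers depending on byte order. The function names are made up for this illustration.

```python
import struct
import sys


def host_byte_order():
    """Return 'little' or 'big' as reported by the running interpreter."""
    return sys.byteorder


def decode_both_ways(raw):
    """Decode the same 4 bytes as an unsigned 32-bit integer in both byte orders."""
    little = struct.unpack("<I", raw)[0]
    big = struct.unpack(">I", raw)[0]
    return little, big


if __name__ == "__main__":
    print("host byte order:", host_byte_order())
    little, big = decode_both_ways(b"\x01\x00\x00\x00")
    print("as little endian:", little)
    print("as big endian:", big)
```

On an s390x machine the first line should report `big`; on x86-64 or most arm64 systems it reports `little`.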

HH-suite Output (for bugs)

https://buildd.debian.org/status/fetch.php?pkg=hhsuite&arch=s390x&ver=3.3.0%2Bds-4&stamp=1599058588&raw=0

milot-mirdita commented 4 years ago

Sorry, I think this issue will stay open for a while. I don't think we can really afford the time to investigate an endianness issue in a project we don't have any funding for. Thanks for finding and documenting the problem, though!