rcsb / symmetry

Detect, analyze, and visualize protein symmetry
GNU Lesser General Public License v2.1

QuatSymm FASTA output fails #116

Open AntoniyaAleksandrova opened 2 years ago

AntoniyaAleksandrova commented 2 years ago

If I run: $ quatsymm-2.2.2/runQuatSymm.sh -J 2nwx --stats --fasta=2nwx.fasta

the program successfully prints a summary of the result and writes the corresponding alignment to the designated FASTA file. However, if I download the structure in PDB format from the PDB and attempt the same, i.e.:

$ quatsymm-2.2.2/runQuatSymm.sh -J 2nwx_pdb.pdb --stats --fasta=2nwx.fasta

I get the exact same summary of the results but no fasta file and the following error:

Name    Size    Subunits    Stoichiometry   Pseudostoichiometry Symmetry    Local   Method  SymmRMSD    SymmTMscore
2nwx_pdb.pdb    3   [A, C, B]   A3  false   C3  false   ROTATION    0.24    1.00
1033 [pool-2-thread-1] ERROR workers.QuatSymmWorker - Could not save results for 2nwx_pdb.pdb
java.lang.IllegalArgumentException: Illegal ResidueRange format:pdb_A
    at org.biojava.nbio.structure.ResidueRange.parse(ResidueRange.java:120) ~[quatsymm-2.2.2.jar:2.2.2]
    at org.biojava.nbio.structure.ResidueRange.parseMultiple(ResidueRange.java:142) ~[quatsymm-2.2.2.jar:2.2.2]
    at org.biojava.nbio.structure.SubstructureIdentifier.<init>(SubstructureIdentifier.java:110) ~[quatsymm-2.2.2.jar:2.2.2]
    at org.biojava.nbio.structure.align.client.StructureName.initFromPDB(StructureName.java:292) ~[quatsymm-2.2.2.jar:2.2.2]
    at org.biojava.nbio.structure.align.client.StructureName.init(StructureName.java:248) ~[quatsymm-2.2.2.jar:2.2.2]
    at org.biojava.nbio.structure.align.client.StructureName.<init>(StructureName.java:141) ~[quatsymm-2.2.2.jar:2.2.2]
    at writers.QuatSymmFastaWriter.lambda$writeResult$0(QuatSymmFastaWriter.java:34) ~[quatsymm-2.2.2.jar:2.2.2]
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[?:1.8.0_322]
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384) ~[?:1.8.0_322]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) ~[?:1.8.0_322]
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) ~[?:1.8.0_322]
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) ~[?:1.8.0_322]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:1.8.0_322]
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566) ~[?:1.8.0_322]
    at writers.QuatSymmFastaWriter.writeResult(QuatSymmFastaWriter.java:35) ~[quatsymm-2.2.2.jar:2.2.2]
    at workers.QuatSymmWorker.run(QuatSymmWorker.java:94) [quatsymm-2.2.2.jar:2.2.2]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_322]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_322]
    at java.lang.Thread.run(Thread.java:750) [?:1.8.0_322]

I've tried this on both Mac (Monterey with Java version "18.0.1.1" 2022-04-22) and Linux with the following Java version:

$ java -version
openjdk version "1.8.0_322"
OpenJDK Runtime Environment (build 1.8.0_322-b06)
OpenJDK 64-Bit Server VM (build 25.322-b06, mixed mode)

I've tried this for a few other structures and have failed with them too. Do I need to re-format PDB structures in a different way?

josemduarte commented 2 years ago

Is it an absolute requirement for you to read PDB format? If not, I'd recommend switching to mmCIF format files and checking whether the problem exists there too. My guess is that this problem is related to the PDB format.

AntoniyaAleksandrova commented 2 years ago

Actually, I had tried mmCIF and it also failed. Here:

$ quatsymm-2.2.2/runQuatSymm.sh -J 2nwx.cif  --stats --fasta=2nwx.fasta
Name    Size    Subunits    Stoichiometry   Pseudostoichiometry Symmetry    Local   Method  SymmRMSD    SymmTMscore
2nwx.cif    3   [A, C, B]   A3  false   C3  false   ROTATION    0.24    1.00
1267 [pool-2-thread-1] ERROR workers.QuatSymmWorker - Could not save results for 2nwx.cif
java.lang.IllegalArgumentException: Illegal ResidueRange format:cif_A
    at org.biojava.nbio.structure.ResidueRange.parse(ResidueRange.java:120) ~[quatsymm-2.2.2.jar:2.2.2]
    at org.biojava.nbio.structure.ResidueRange.parseMultiple(ResidueRange.java:142) ~[quatsymm-2.2.2.jar:2.2.2]
    at org.biojava.nbio.structure.SubstructureIdentifier.<init>(SubstructureIdentifier.java:110) ~[quatsymm-2.2.2.jar:2.2.2]
    at org.biojava.nbio.structure.align.client.StructureName.initFromPDB(StructureName.java:292) ~[quatsymm-2.2.2.jar:2.2.2]
    at org.biojava.nbio.structure.align.client.StructureName.init(StructureName.java:248) ~[quatsymm-2.2.2.jar:2.2.2]
    at org.biojava.nbio.structure.align.client.StructureName.<init>(StructureName.java:141) ~[quatsymm-2.2.2.jar:2.2.2]
    at writers.QuatSymmFastaWriter.lambda$writeResult$0(QuatSymmFastaWriter.java:34) ~[quatsymm-2.2.2.jar:2.2.2]
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[?:1.8.0_322]
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384) ~[?:1.8.0_322]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) ~[?:1.8.0_322]
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) ~[?:1.8.0_322]
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) ~[?:1.8.0_322]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:1.8.0_322]
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566) ~[?:1.8.0_322]
    at writers.QuatSymmFastaWriter.writeResult(QuatSymmFastaWriter.java:35) ~[quatsymm-2.2.2.jar:2.2.2]
    at workers.QuatSymmWorker.run(QuatSymmWorker.java:94) [quatsymm-2.2.2.jar:2.2.2]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_322]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_322]
    at java.lang.Thread.run(Thread.java:750) [?:1.8.0_322]

josemduarte commented 2 years ago

OK, thanks for checking. Then it is really odd that it doesn't happen with the -J 2nwx parameter. In theory that just downloads the mmCIF file and parses it. One other idea is that you have cached an older version of the file (it was re-released in October 2021, see https://www.rcsb.org/versions/2NWX). You could try purging your local cache.

AntoniyaAleksandrova commented 2 years ago

If that were true, then if I download a structure I have never tried analyzing (in any way) and submit it, it should work, but it does not. I tried with 7bcw (that's also a fairly new structure so not that many versions) and the outcome was the same. I've also tried it on 3 different machines.

josemduarte commented 2 years ago

Thanks for trying that. Then my hypothesis was wrong too. This will require some debugging. I'll see if I can do it at some point.
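A note on the two traces above, offered as an unconfirmed reading rather than the diagnosis reached in this thread: the rejected strings (`pdb_A` for `2nwx_pdb.pdb`, `cif_A` for `2nwx.cif`) are what would be left over if an identifier of the form `<file name>_A` were split at the dot and the second half treated as a residue range. The sketch below only illustrates that string split in plain shell. `range_part` is a hypothetical helper, the `<file name>_A` identifier is an assumption, and none of this is BioJava code.

```shell
#!/bin/sh
# Hypothetical helper (not part of QuatSymm or BioJava): print the text after
# the first "." of an identifier, i.e. the part that a "NAME.RANGE"-style
# parser would go on to interpret as a residue range.
range_part() {
    case "$1" in
        *.*) printf '%s\n' "${1#*.}" ;;   # strip everything up to the first dot
        *)   printf '\n' ;;               # no dot: no range part at all
    esac
}

# Assumed identifiers: input file name plus "_A" for chain A.
range_part "2nwx_pdb.pdb_A"   # the file extension ends up in the "range"
range_part "2nwx.cif_A"
range_part "2nwx"             # a bare PDB ID, as with -J 2nwx, has no dot
```

If this reading is right, it would explain why `-J 2nwx` works while any file name with an extension fails, regardless of file format or cache state.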

AntoniyaAleksandrova commented 2 years ago

Hi José, have you had any luck with it yet?

josemduarte commented 2 years ago

OK, I found the issue and submitted a pull request to fix it: #117

AntoniyaAleksandrova commented 2 years ago

That's great, thank you so much! Can't wait for the release to start using it.

josemduarte commented 2 years ago

The fix is now in master. I could not make a new release yet because something failed.

In the meantime, you can use the fixed version by cloning the repository and building it (Maven and a JDK are required):

git clone https://github.com/rcsb/symmetry.git     # or git pull if you have it cloned already
cd symmetry
mvn package

Then the tar.gz file will be in symmetry-tools/target/quatsymm-*.tar.gz. You can unpack it and then run runQuatSymm.sh
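The unpack step can be scripted. The following is a minimal sketch, assuming the build has already produced a tarball at the path given above. The function names and the destination directory are hypothetical, and the layout inside the tarball is not stated in this thread, so the script searches for `runQuatSymm.sh` instead of assuming where it sits.

```shell
#!/bin/sh
# Hypothetical helper: print the first QuatSymm tarball found under
# <repo>/symmetry-tools/target, or return 1 if the build has not produced one.
find_quatsymm_tarball() {
    for f in "$1"/symmetry-tools/target/quatsymm-*.tar.gz; do
        [ -e "$f" ] && { printf '%s\n' "$f"; return 0; }
    done
    return 1
}

# Hypothetical helper: unpack the tarball into <dest> and print the path of
# the runQuatSymm.sh script it contains.
unpack_quatsymm() {
    repo=$1
    dest=$2
    tarball=$(find_quatsymm_tarball "$repo") || {
        echo "no quatsymm-*.tar.gz under $repo/symmetry-tools/target; run 'mvn package' first" >&2
        return 1
    }
    mkdir -p "$dest"
    tar -xzf "$tarball" -C "$dest"
    find "$dest" -name runQuatSymm.sh -type f | head -n 1
}

# Example use from inside the cloned repository, after 'mvn package':
#   run=$(unpack_quatsymm . "$HOME/quatsymm")
#   sh "$run" -J 2nwx.cif --stats --fasta=2nwx.fasta
```

Searching for the script keeps the sketch independent of the version number in the unpacked directory name.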

AntoniyaAleksandrova commented 2 years ago

Hi Jose,

Thanks for the update! I tried using maven in the past and realized it will take a fair amount of effort for me to do so successfully (I don’t have it set up anymore either). If you have it running and have the tar.gz version, can you please upload it for me here so I can simply download and use it? https://drive.google.com/drive/folders/1AofIijFwthbY_VnN4krcbPVJBDeVwydg

Once again thanks, Toni

josemduarte commented 2 years ago

Ok uploaded.

For the record, installing Maven on Linux (for a Debian flavour like Ubuntu) is:

sudo apt install maven

AntoniyaAleksandrova commented 2 years ago

Thanks! It wasn’t the installation of maven but successfully building ce-symm with it that tripped me up last time, if I recall correctly. It was not exactly trivial if you don’t have experience with it.

Toni
