rlabduke / MolProbity

Protein and nucleic acid validation service

Clashes overwrite lowercase uppercase chain names #26

Closed awaterho closed 1 year ago

awaterho commented 1 year ago

Hello, we are using molprobity_4.4 built from source. Looking at the clash results in output_probe.txt, we believe only the uppercase chain names are being used.

The proteins are large heteromers with (in this case) 54 chains, so the chain names run through the uppercase letters, then the digits, then the lowercase letters. I can see that lines such as

P 89 TRP O : P 93 SER

should in fact be written with the correct chain for those residues:

p 89 TRP O : p 93 SER

Is this a known issue?

awaterho commented 1 year ago

input.pdb.gz

chrissciwilliams commented 1 year ago

Thanks for using MolProbity.

MolProbity 4.4 did indeed have a bug where PDB lines would be forced to upper case during clash analysis. This has since been fixed.
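To see whether a given model is affected, it can help to list the chain IDs that become indistinguishable once case is ignored. The following is a minimal sketch, not part of MolProbity or Probe: the function names are hypothetical, and it assumes standard fixed-column PDB records with the single-character chain ID in column 22.

```python
from collections import defaultdict


def chain_ids(pdb_lines):
    """Collect chain IDs from ATOM/HETATM records.

    Assumes the standard PDB layout, where the chain ID is column 22
    (string index 21).
    """
    ids = set()
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            ids.add(line[21])
    return ids


def case_collisions(ids):
    """Group chain IDs that collide when forced to upper case."""
    groups = defaultdict(set)
    for cid in ids:
        groups[cid.upper()].add(cid)
    # Keep only the groups where two or more distinct IDs collapse together.
    return {key: sorted(members) for key, members in groups.items() if len(members) > 1}
```

Any non-empty result from `case_collisions(chain_ids(open("input.pdb")))` names chains whose clashes could be attributed to the wrong chain by a version that upper-cases its input.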

Your best solution would be to get a fresh install of MolProbity (which will also include some new validations and more up-to-date commandline tools). Instructions are in the readme, but the short version is:

wget https://github.com/rlabduke/MolProbity/blob/master/install_via_bootstrap.sh
./install_via_bootstrap.sh 4
cd molprobity
./setup.sh

If you really need to patch only the fix into your existing version, the fix is in Probe. Get and properly build a new version of Probe into MolProbity, and you should find case sensitivity being respected. I don't recommend this, as cross-version compatibility is always tricky and risky.

Alternatively, if you really can't change software, MolProbity does accept 2-character chain IDs. You could replace all the chain " a" names with "A2", for example, and the old system would parse those correctly.
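The renaming workaround could be scripted along these lines. This is only a sketch under stated assumptions: the function name and the "2" suffix are illustrative, the two-character chain ID is assumed to occupy columns 21-22 of the record (so chain " a" is a blank followed by "a"), and only ATOM, HETATM, ANISOU and TER records are rewritten. Records such as HELIX, SHEET, SSBOND or LINK store chain IDs in other columns and are left untouched here.

```python
# Record types whose chain ID sits in column 22 (assumed layout).
RECORDS_WITH_CHAIN = ("ATOM", "HETATM", "ANISOU", "TER")


def rename_lowercase_chains(pdb_lines, suffix="2"):
    """Rewrite lowercase chain IDs such as ' a' as two-character IDs such as 'A2'.

    Raises ValueError if a new ID would duplicate one already in the file.
    """
    lines = list(pdb_lines)
    # Two-character chain fields already present (string indices 20-21).
    existing = {
        line[20:22]
        for line in lines
        if line.startswith(RECORDS_WITH_CHAIN) and len(line) > 21
    }
    out = []
    for line in lines:
        if line.startswith(RECORDS_WITH_CHAIN) and len(line) > 21:
            field = line[20:22]
            if field[0] == " " and field[1].islower():
                new_id = field[1].upper() + suffix
                if new_id in existing:
                    raise ValueError("chain ID %r is already in use" % new_id)
                # Same width in, same width out, so other columns do not shift.
                line = line[:20] + new_id + line[22:]
        out.append(line)
    return out
```

Because the clash output will then report the new names, it is worth keeping a note of the mapping (for example "p" to "P2") so results can be traced back to the original chains.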

Good luck,
Christopher Williams
Richardson Lab, Duke University

