Add script to renumber a specific chain in a PDB

haddocking / pdb-tools

A dependency-free cross-platform swiss army knife for PDB files.

https://haddocking.github.io/pdb-tools/

Apache License 2.0


Add script to renumber a specific chain in a PDB #4

Closed (mtrellet closed this 5 years ago)

mtrellet commented 7 years ago

This script renumbers a specific chain in a PDB without editing the other chains. It avoids the multistep approach of first extracting the chain to renumber and then concatenating it back into the rest of the PDB.
Takes file or stream as input
Merging of pdb_reres and pdb_selchain scripts
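For illustration, the behaviour described above can be sketched as follows. This is a minimal sketch, not the script submitted in this PR: the function name `renumber_chain` and its `start` parameter are hypothetical, and it assumes fixed-column PDB records (chain ID in column 22, residue number in columns 23-26, insertion code in column 27). It works on any iterable of lines, so a file handle or `sys.stdin` can both be passed in.

```python
def renumber_chain(lines, chain, start=1):
    """Yield PDB lines with one chain renumbered from `start`.

    Hypothetical helper for illustration only. Lines belonging to
    other chains, and all non-coordinate records, pass through unchanged.
    """
    records = ("ATOM", "HETATM", "TER", "ANISOU")
    prev_key = None      # original resSeq + insertion code of the last residue seen
    resid = start - 1    # incremented when the first residue is met

    for line in lines:
        if line.startswith(records) and len(line) > 26 and line[21] == chain:
            key = line[22:27]
            if key != prev_key:
                prev_key = key
                resid += 1
            # Write the new number right-aligned in columns 23-26 and
            # blank the insertion code (column 27).
            line = line[:22] + f"{resid:>4d} " + line[27:]
        yield line
```

With this sketch, `sys.stdout.writelines(renumber_chain(sys.stdin, "B"))` would renumber chain B of a piped structure and leave every other chain as it was.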

JoaoRodrigues commented 7 years ago

This is a different repo, I know, but I would really be against this sort of change (merging utilities). The point is to have things clearly separated and non-redundant. On the other hand, there is a pending pull request on my end to add more options to pdb_reres.py to make it restart numbering at each chain.

Why not a pdb_delchain.py script and edit pdb_join.py to be able to merge chains into a single PDB (instead of multi-model as it is now)? I know, more work, but it would keep things cleaner in the long run.

JoaoRodrigues commented 7 years ago

Also, I would keep HADDOCK specific things separated from this code (check against known resnames). These are generic tools.

amjjbonvin commented 7 years ago

I do like the idea of renumbering a specific chain… But then you should also add another script to shift the numbering of a specific chain (which keeps the gaps and in my opinion is more useful).

mtrellet commented 7 years ago

@JoaoRodrigues: I don't mind adding this option to pdb_reres, I was just mad at doing multiple manipulations of PDBs each time I wanted to renumber a single chain in a multimer..! For the second point you raised, there is no "resname" check in this script, and the 2nd script (pdb_sctrict_format.py) that appears in the commits has been deleted by my last commit since this script is now part of haddock-tools ;)

@amjjbonvin I can indeed add a "--gap" or "--shift" option to keep the gaps! Good point.

mtrellet commented 7 years ago

@JoaoRodrigues Btw, pdb_delchain.py and pdb_join.py with a chain option would be a nice addition as well. I went for the quick and lazy option here even if I really like the fact that it requires only one script right now...

JoaoRodrigues commented 7 years ago

There is a pdb_shiftres.py already. We can deprecate it and add options to pdb_reres.py like this:

-chain A,B,C: renumbers (a) specific chain(s)
-shift: adds instead of restarting
-gap: preserves gaps

Also, like @joaomcteixeira raised in the other repo, we should have a check for resids higher than 9999.
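A minimal sketch of what the shift behaviour and the check for large residue numbers could look like. The function name `shift_chain`, its arguments, and the choice to raise `ValueError` are assumptions made for illustration, not the interface of pdb_reres.py or pdb_shiftres.py. The bounds come from the four-column residue-number field (columns 23-26), which cannot hold a value above 9999 or below -999.

```python
def shift_chain(lines, chains, offset):
    """Yield PDB lines with `offset` added to residue numbers of `chains`.

    Hypothetical helper for illustration only. `chains` is a set of
    one-character chain IDs. Adding a constant keeps existing gaps.
    """
    records = ("ATOM", "HETATM", "TER", "ANISOU")

    for lineno, line in enumerate(lines, start=1):
        field = line[22:26]
        if (line.startswith(records) and len(line) >= 26
                and line[21] in chains and field.strip()):
            resid = int(field) + offset
            # The residue number must fit in four columns.
            if not -999 <= resid <= 9999:
                raise ValueError(
                    f"line {lineno}: residue number {resid} does not fit "
                    "in columns 23-26"
                )
            line = line[:22] + f"{resid:>4d}" + line[26:]
        yield line
```

A comma-separated option value such as `A,B,C` could be turned into the `chains` argument with `set("A,B,C".split(","))`.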

amjjbonvin commented 7 years ago

Shift and gap seem redundant…

JoaoRodrigues commented 7 years ago

> Shift and gap seems redundant…

True.

mtrellet commented 7 years ago

Before I make the modifications to pdb_reres.py to take the new options into account, @Adrimel suggested not to be case-sensitive for the chain identifiers in the arguments (he was using pdb_selchain.py), sounds ok to you?

JoaoRodrigues commented 7 years ago

I'm also on a new breed of pdb_reres.py. Let's see what we both create :)

As for chain identifiers, they should remain case sensitive. Larger PDBs have uppercase and lowercase chains (you get 2x26 max chains like that).

amjjbonvin commented 7 years ago

On our side, CNS/XPLOR are not case sensitive, and probably PyMOL and Chimera aren't either

JoaoRodrigues commented 7 years ago

PyMOL can be case sensitive. What would be the argument against case sensitivity?

amjjbonvin commented 7 years ago

I really don't think the PDB official format is...

Adrimel commented 7 years ago

The valid chain identifiers for the PDB format are [A-Za-z0-9], which allows up to 62 identifiers. PyMOL also has an ignore_case option (ON by default) that can be switched OFF to make it case sensitive (https://pymolwiki.org/index.php/Ignore_case). Therefore, thumbs up @JoaoRodrigues
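To make the case-sensitivity point concrete, here is a small sketch that parses a comma-separated chain option while keeping `A` and `a` distinct. It takes the [A-Za-z0-9] set stated above as given. The function name `parse_chain_option` is hypothetical and not part of pdb-tools.

```python
import string

# The 62 identifiers mentioned above: A-Z, a-z and 0-9.
VALID_CHAIN_IDS = frozenset(string.ascii_letters + string.digits)


def parse_chain_option(value):
    """Split 'A,b,1' into a set of chain IDs, preserving case.

    Hypothetical helper for illustration only. Raises ValueError for
    anything that is not a single character from VALID_CHAIN_IDS.
    """
    chains = value.split(",")
    for chain in chains:
        if chain not in VALID_CHAIN_IDS:
            raise ValueError(f"invalid chain identifier: {chain!r}")
    return set(chains)
```

Because nothing is upper- or lower-cased, `parse_chain_option("A,a")` returns two distinct identifiers, which is what larger structures with both uppercase and lowercase chains need.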

mtrellet commented 6 years ago

I've seen that @JoaoRodrigues has made some nice changes to his repo to take into account what we suggested here. Shouldn't we update the haddocking master and decline/close this PR?

amjjbonvin commented 6 years ago

Yes - usually João takes care of that

JoaoRodrigues commented 6 years ago

I didn't because this will break compatibility with previous versions of the scripts, and because I want to standardize the inputs of all scripts (to make them consistent).

I'd check where pdb_reres is being used.

mtrellet commented 5 years ago

So we leave the new version of pdb_reres.py from @JoaoRodrigues aside and keep the current one, right? We still don't have a script to renumber a particular chain (or should I say, I do have one that I use on a daily basis but none in the official repo..!). Since this PR is really old and my new commits end up here, I'll just close the PR today. Except if someone vetoes it of course ;)

mtrellet commented 5 years ago

Ok, we agreed with @JoaoRodrigues, let's close this one, we'll work on a better way to integrate this feature ;)