haddocking / haddock3

Official repo of the modular BioExcel version of HADDOCK
https://www.bonvinlab.org/haddock3
Apache License 2.0

Flexible segments definition #919

Open AnnaKravchenko opened 3 days ago

AnnaKravchenko commented 3 days ago

The definition of flexible and semi-flexible segments in the refinement modules is somewhat enigmatic. One has to use the parameter nfleX, where X is the sequential number of the molecule that should be flexible. For example, if molecules = ["molecule1", "molecule2"] and one wants one segment of molecule2 to be flexible, one needs to set nfle2 = 1 and not, for example, nfle1 or nfle3. This way of defining a segment feels much less intuitive than the definition of symmetry segments. It is also not explained at all at www.bonvinlab.org/haddock3/
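
To make the numbering rule concrete, here is a minimal Python sketch. The helper name nfle_param_name is hypothetical and not part of haddock3; it only assumes, as described above, that X is the 1-based position of the molecule in the molecules list.

```python
def nfle_param_name(molecules, target):
    """Return the nfleX parameter name for `target` (hypothetical helper).

    Assumption: X is the 1-based position of the molecule in `molecules`.
    """
    return f"nfle{molecules.index(target) + 1}"


molecules = ["molecule1", "molecule2"]
print(nfle_param_name(molecules, "molecule2"))  # nfle2
```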

Would it be possible to define flexible and semi-flexible segments similarly to the symmetry segments, i.e. using a chain/segment ID? Say mol2.pdb contains chain B; the current definition of two flexible segments then looks like:

molecules = ["mol1.pdb", "mol2.pdb"]
# molecule 2 has 2 flexible segments
nfle2 = 2
# 1st flexible segment of molecule 2 starts with residue 1
fle_sta_2_1 = 1
# 1st flexible segment of molecule 2 ends with residue 4
fle_end_2_1 = 4
# 2nd flexible segment of molecule 2 starts with residue 6
fle_sta_2_2 = 6
# 2nd flexible segment of molecule 2 ends with residue 18
fle_end_2_2 = 18

A possible simplified definition of the flexible segments could look like:

# 1st flexible segment belongs to chain B
flex_seg_1 = "B"
# 1st flexible segment starts with residue 1 (within chain B)
flex_1_sta = 1
# 1st flexible segment ends with residue 4 (within chain B)
flex_1_end = 4
# 2nd flexible segment belongs to chain B
flex_seg_2 = "B"
# 2nd flexible segment starts with residue 6 (within chain B)
flex_2_sta = 6
# 2nd flexible segment ends with residue 18 (within chain B)
flex_2_end = 18
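
As a rough illustration of how the two schemes relate, the sketch below translates chain-based segments into the current nfleX / fle_sta_X_Y / fle_end_X_Y names. The function chain_segments_to_fle_params is hypothetical and not part of haddock3. It assumes one unique chain ID per input molecule, and the example assumes mol1.pdb contains chain A, which is not stated above.

```python
from collections import defaultdict


def chain_segments_to_fle_params(chain_of_molecule, segments):
    """Translate chain-based segments into the current parameter names.

    chain_of_molecule: chain ID of each molecule, in `molecules` order
        (assumption: exactly one, unique chain ID per molecule).
    segments: list of (chain_id, start_residue, end_residue) tuples.
    """
    # Map each chain ID to the 1-based index of its molecule.
    mol_index = {c: i for i, c in enumerate(chain_of_molecule, start=1)}

    per_molecule = defaultdict(list)
    for chain, start, end in segments:
        if chain not in mol_index:
            raise ValueError(f"unknown chain {chain!r}")
        if start > end:
            raise ValueError(f"segment start {start} is after end {end}")
        per_molecule[mol_index[chain]].append((start, end))

    params = {}
    for mol, segs in sorted(per_molecule.items()):
        params[f"nfle{mol}"] = len(segs)
        for n, (start, end) in enumerate(segs, start=1):
            params[f"fle_sta_{mol}_{n}"] = start
            params[f"fle_end_{mol}_{n}"] = end
    return params


# Assumed chains: mol1.pdb -> "A" (assumption), mol2.pdb -> "B"
params = chain_segments_to_fle_params(
    ["A", "B"], [("B", 1, 4), ("B", 6, 18)]
)
for name, value in params.items():
    print(f"{name} = {value}")
```

A translation layer of this kind would live on the Python side only, so the parameter names passed on to CNS would stay unchanged.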

Alternatively, a better description of the current flexible/semi-flexible segment definition should be provided.

Current definition:

nfle1
default: 0
type: integer
title: Number of fully flexible segments
min: 0
max: 1000
short description: This defines the number of fully flexible segments.
long description: This parameter defines the number of fully flexible 
segments for the specified molecule. If >=1 then those must be defined 
manually with starting and end residue numbers in the fle_sta_* and 
fle_end_* variables.
group: flexibility
explevel: expert

Possible enhanced definition:

nfle1
default: 0
type: integer
title: Number of fully flexible segments in the 1st molecule
min: 0
max: 1000
short description: This defines the number of fully flexible segments 
of the 1st molecule.
long description: This parameter defines the number of fully flexible 
segments for the molecule that is the 1st entry in the 'molecules'
parameter. If >=1 then those must be defined manually with starting 
and end residue numbers in the fle_sta_* and fle_end_* variables.
group: flexibility
explevel: expert

amjjbonvin commented 3 days ago

It would be possible but requires a lot of refactoring of the CNS code…

Also, this syntax is used by a rather undocumented option that limits the random AIRs definition to selected segments per molecule.

So I would go for a better description of the parameters.