ExcitedStates / qfit-3.0

qFit: Automated and unbiased multi-conformer models from X-ray and EM maps.
MIT License

New qfit ligand #426

Closed jessicaflowers closed 1 month ago

jessicaflowers commented 2 months ago


Description of the Change

The updated ligand sampling in qFit replaces the previous local_search() and internal_dof() methods with multiple conformer generation functions using RDKit. Now, users must provide a SMILES string (--smiles) along with the model, map, labels, and selection to run qFit. There is an option to specify the number of conformers to generate (--numConf), defaulting to 10,000. Additionally, there's an optional flag to enable MIQP solvers with BIC (--ligand_bic), which is set to false by default.
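As a rough illustration of what generating conformers from a SMILES string with RDKit looks like, here is a minimal sketch. It is not the code in this PR: the function name `generate_conformers` and the `seed` parameter are hypothetical, and it covers only plain, unconstrained embedding, not the constrained searches described in this PR.

```python
def generate_conformers(smiles: str, num_conf: int = 10_000, seed: int = 42):
    """Embed up to `num_conf` 3D conformers for the molecule given as SMILES.

    Hypothetical helper for illustration only; not qFit's implementation.
    """
    # Imported here so that the module still loads where RDKit is not installed.
    from rdkit import Chem
    from rdkit.Chem import AllChem

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        # MolFromSmiles returns None when the SMILES string cannot be parsed.
        raise ValueError(f"Could not parse SMILES: {smiles!r}")

    # Explicit hydrogens generally give more reasonable 3D geometries.
    mol = Chem.AddHs(mol)

    # Conformers are stored on the molecule; fewer than num_conf may succeed.
    AllChem.EmbedMultipleConfs(mol, numConfs=num_conf, randomSeed=seed)
    return mol
```

Calling `generate_conformers("CCO", num_conf=5)` returns an RDKit molecule whose `GetNumConformers()` reports how many embeddings succeeded.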

Our algorithm processes the input ligand through 3 to 6 conformer generation functions, depending on the ligand's structure. These include unconstrained, terminal atom constrained, and spherically constrained conformer generation. If the ligand features a side chain, a branching search is also conducted. For side chains longer than 30 atoms, a long chain search is performed. Conformers from these processes are pooled together, QP scored, and then subjected to further sampling through rotation and translation. The total number of conformers generated is distributed equally among the utilized functions. For example, if no side chains are present and only three functions are used prior to rotation and translation, each function would generate approximately 3,333 conformers out of a total of 10,000.
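The equal split of the conformer budget can be sketched as follows. The function names and search labels are hypothetical, only the five searches named above are modeled, and rounding down with integer division is an assumption about how the remainder is handled.

```python
from __future__ import annotations


def select_searches(has_side_chain: bool, side_chain_atoms: int = 0) -> list[str]:
    """Pick the conformer searches to run, following the rules described above."""
    searches = [
        "unconstrained",
        "terminal_atom_constrained",
        "spherically_constrained",
    ]
    if has_side_chain:
        searches.append("branching")
        # The long chain search only applies to side chains longer than 30 atoms.
        if side_chain_atoms > 30:
            searches.append("long_chain")
    return searches


def conformers_per_search(num_conf: int, searches: list[str]) -> dict[str, int]:
    """Split the total budget equally; any remainder is dropped (assumption)."""
    share = num_conf // len(searches)
    return {name: share for name in searches}


budget = conformers_per_search(10_000, select_searches(has_side_chain=False))
print(budget)
# {'unconstrained': 3333, 'terminal_atom_constrained': 3333, 'spherically_constrained': 3333}
```

With a side chain of more than 30 atoms, the same call would spread the 10,000 conformers over five searches, giving 2,000 each.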

Release Notes

Replace the existing ligand sampling strategy with several conformer generation functions using RDKit.


stephaniewankowicz commented 2 months ago

Things to add:
1) README: provide instructions on how to run. Also suggest how the user should generate SMILES.
2) Examples folder: provide instructions on how to run.
3) Please provide a default for --numConf.
4) Put in a fail message if the qFit ligand input arguments do not contain a SMILES string.
5) Delete old qFit ligand code (a lot of this is only commented out).
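Points 3 and 4 could look something like the sketch below, using `argparse`. Only the three flags named in the PR description are shown; the parser name, help texts, error wording, and the `parse_args` helper are hypothetical, and the model, map, labels, and selection arguments are left out.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical, stripped-down parser; not qFit's actual command line.
    parser = argparse.ArgumentParser(prog="qfit_ligand_sketch")
    parser.add_argument("--smiles", default=None,
                        help="SMILES string of the ligand")
    parser.add_argument("--numConf", type=int, default=10000,
                        help="total number of conformers to generate")
    parser.add_argument("--ligand_bic", action="store_true",
                        help="enable MIQP solvers with BIC (off by default)")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    if not args.smiles:
        # parser.error prints usage plus this message, then exits with status 2.
        parser.error("qFit ligand requires a SMILES string; pass it with --smiles.")
    return args
```

Checking after parsing, instead of marking the option as required, leaves room for a message that tells the user what to supply.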

stephaniewankowicz commented 2 months ago

refinement script:

1) Why do we need a ligand-only file?
2) Let's output: The output can be found at ${pdb_name}_qFit_ligand.(pdb|mtz|log).
3) Remove:
redistribute_cull_low_occupancies -occ 0.09 "${pdb_name}_002.pdb"
mv -v "${pdb_name}_002_norm.pdb" "${pdb_name}_002.pdb"
4) Rename final refinement -> Refinement (as this is the only refinement we are doing).
5) Make sure all of this goes to the appropriate params (you are refining with final_refine.params):
"refinement.input.xray_data.r_free_flags.label=${field}" >> ${pdb_name}_refine.params
echo "refinement.input.xray_data.r_free_flags.generate=${gen_Rfree}" >> ${pdb_name}_refine.params
6) Remove, or put in the params file (but this will duplicate lines if 5 is worked out):
"refinement.input.xray_data.r_free_flags.generate=True" \
"refinement.input.xray_data.labels=${xray_data_labels}" \

stephaniewankowicz commented 2 months ago

In qFit.py:

This looks like it is hardcoded. If people need to have ligand.pdb, specify that, but I think it should be able to read in any PDB as long as it is labeled correctly with args.

# Read in ligand pdb file
self.ligand_pdb_file = "ligand.pdb"

Put this print into logger.info, with details on which search it is:

if mol.GetNumConformers() == 0:
    print("NO CONF GENERATED")
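One way to follow this suggestion is sketched below. The logger name, the `has_conformers` helper, and the `search_name` argument are hypothetical; the caller would pass `mol.GetNumConformers()` and the name of the search that just ran.

```python
import logging

logger = logging.getLogger("qfit_ligand_sketch")  # hypothetical logger name


def has_conformers(num_conformers: int, search_name: str) -> bool:
    """Log which search produced no conformers instead of printing a bare message."""
    if num_conformers == 0:
        logger.info("No conformers generated by the %s search", search_name)
        return False
    return True
```

For example, `has_conformers(mol.GetNumConformers(), "branching")` would record which search came back empty and let the caller decide whether to skip it.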