Repeat soaks: Data analysis

nelse003 commented 6 years ago

[x] 1. Work out location of all repeat soak data
- NUDT22
- NUDT7
-- Covalent?
- DCP2B

-- Use spreadsheet (zambezi/dropbox?) summarising the data?

[ ] 2. Write a script which runs exhaustive for all this data. Should be a matter of arranging the correct parameters.
Include the date/version of the code folder, so that results can be compared against the code as it was when run
[ ] 3. Port any required plotting code to the exhaustive search codebase
[ ] 4. Run refinement #2 after exhaustive search
[ ] 5. Generate spider plots of before and after exhaustive search?
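
For the date/version note in item 2, a minimal sketch of recording when a run happened and which version of the code folder it used. This assumes the code folder is a git checkout; `code_version`, `write_run_record` and the `run_record.json` file name are hypothetical and not part of the exhaustive search codebase.

```python
import datetime
import json
import os
import subprocess


def code_version(code_dir):
    """Return the git commit hash of code_dir, or "unknown" if it can't be read."""
    try:
        out = subprocess.check_output(
            ["git", "-C", code_dir, "rev-parse", "HEAD"],
            stderr=subprocess.STDOUT,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, or code_dir is not a git checkout
        return "unknown"
    return out.decode("ascii").strip()


def write_run_record(out_dir, code_dir, datasets):
    """Write a small JSON record of the run date, code version and datasets."""
    record = {
        "run_date": datetime.datetime.now().isoformat(),
        "code_version": code_version(code_dir),
        "datasets": list(datasets),
    }
    path = os.path.join(out_dir, "run_record.json")
    with open(path, "w") as handle:
        json.dump(record, handle, indent=2)
    return path
```

Writing one record per output folder would let a later run be compared against the exact commit that produced the earlier results.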

nelse003 commented 6 years ago

Taking code from repeating exhaustive search:

The current exhaustive code requires the data to have been processed by pandda; specifically, it needs files with occupancy groups separating the ground and bound states. It errors otherwise.

Note: the repeating_exhaustive_search code is out of date, and will be combined into repeat_soaks.py

NUDT22a

NUDT22A 133725a: x0421, x1040 - x1059
NUDT22A 13663a: x0391, x1009 to x1039
NUDT22A 13369a: x0243, x977 to x1008
NUDT22A FMOPL000622a dsi poised: x0938 - x0977
NUDT22A FMOPL000622a DSPL: x0182, x0909-x0937
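
The dataset brackets above can be expanded into full crystal names before being passed to a run script. A minimal sketch, assuming the naming pattern `NUDT22A-x0909` (prefix, `-x`, zero-padded four-digit number) used for these crystals; `expand_range` is a hypothetical helper.

```python
def expand_range(prefix, first, last, width=4):
    """Return crystal names such as NUDT22A-x0909 for first..last inclusive."""
    return [
        "{p}-x{n:0{w}d}".format(p=prefix, n=number, w=width)
        for number in range(first, last + 1)
    ]


# DSPL bracket: x0909 to x0937
dspl_names = expand_range("NUDT22A", 909, 937)
print(dspl_names[0], dspl_names[-1], len(dspl_names))
```

Single datasets outside a bracket (for example x0182) would need to be appended separately.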

Location (pandda results): /dls/labxchem/data/2018/lb18145-55/processing/analysis/initial_model

Location (copied atoms):

[ ] Run exhaustive on NUDT22 examples, straight from refine.mtz: this will only be on files that pandda utilizes
Note: Failures due to missing mtz etc are reported below.
[ ] Run refinement on non-copied atoms results from exhaustive search
[ ] Generate spider plots on non-copied atoms from exhaustive search
[ ] Check the copy atoms script for NUDT7 (OX210) and make more applicable across multiple structures
[ ] Run Copy atoms script on each NUDT22 repeated ligand.
[ ] Run exhaustive search on copied atoms
[ ] Run refinement on copied atoms results from exhaustive search
[ ] Generate spider plots on copied atoms from exhaustive search

NUDT7 (OX210: NUDT7 Copied atoms)

Location (copied atoms): /dls/science/groups/i04-1/elliot-dev/Work/exhaustive_search_data/NUDT7_Copied_atoms

[ ] Run exhaustive on OX210 examples, from copied atoms
[ ] Run exhaustive search on copied atoms
[ ] Run refinement on non-copied atoms results from exhaustive search
[ ] Generate spider plots on non-copied atoms from exhaustive search
[ ] Consider whether it is worthwhile running the script on non-copied atoms

NUDT7 (NUOOOA0000181a)

[ ] Run exhaustive, straight from refine.mtz: this will only be on files that pandda utilizes
[ ] Run refinement on non-copied atoms results from exhaustive search
[ ] Generate spider plots on non-copied atoms from exhaustive search
[x] Run Copy atoms script for other NUDT7 repeated ligand.
[ ] Run exhaustive search on copied atoms
[ ] Run refinement on copied atoms results from exhaustive search
[ ] Generate spider plots on copied atoms from exhaustive search

DCP2B FMOPL00435a

[ ] Run exhaustive, straight from refine.mtz: this will only be on files that pandda utilizes
[ ] Run refinement on non-copied atoms results from exhaustive search
[ ] Generate spider plots on non-copied atoms from exhaustive search
[ ] Run Copy atoms script for DCP2B repeated ligand.
[ ] Run exhaustive search on copied atoms
[ ] Run refinement on copied atoms results from exhaustive search
[ ] Generate spider plots on copied atoms from exhaustive search

When Done:

[ ] Compare plots to those previously generated, which can be found in dropbox/repeat_saoks

nelse003 commented 6 years ago

NUDT22A: Run on non-copied atoms for datasets x0977 to x1059.

Rejects according to:

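    # Body of the loop over crystals, hence the `continue` statements.
    if refinement_xtals[0][0] is not None:
        # Use the refinement mtz found for this crystal
        params.input.mtz = refinement_xtals[0][0].encode('ascii')

        check_input_files(params)
        try:
            exhaustive(params)
        except UnboundLocalError:
            # Ground/bound states could not be generated from occupancy groups
            rejects.append(xtal_name)
            continue

    else:
        # No refinement mtz for this crystal
        print("Refinement Mtz does not exist? for xtal: {}".format(xtal_name))
        rejects.append(xtal_name)
        continue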

Either the mtz file doesn't exist, or the ground/bound states can't be generated using occupancy groups (the only known error caught by UnboundLocalError):

['NUDT22A-x0977', 'NUDT22A-x0979', 'NUDT22A-x0982', 'NUDT22A-x0983', 'NUDT22A-x0985', 'NUDT22A-x0990', 'NUDT22A-x0991', 'NUDT22A-x0992', 'NUDT22A-x0993', 'NUDT22A-x0995', 'NUDT22A-x0997', 'NUDT22A-x0999', 'NUDT22A-x1002', 'NUDT22A-x1004', 'NUDT22A-x1005', 'NUDT22A-x1008', 'NUDT22A-x1009', 'NUDT22A-x1010', 'NUDT22A-x1011', 'NUDT22A-x1012', 'NUDT22A-x1013', 'NUDT22A-x1014', 'NUDT22A-x1015', 'NUDT22A-x1016', 'NUDT22A-x1017', 'NUDT22A-x1018', 'NUDT22A-x1019', 'NUDT22A-x1020', 'NUDT22A-x1021', 'NUDT22A-x1022', 'NUDT22A-x1023', 'NUDT22A-x1024', 'NUDT22A-x1025', 'NUDT22A-x1026', 'NUDT22A-x1027', 'NUDT22A-x1028', 'NUDT22A-x1029', 'NUDT22A-x1030', 'NUDT22A-x1031', 'NUDT22A-x1032', 'NUDT22A-x1033', 'NUDT22A-x1034', 'NUDT22A-x1035', 'NUDT22A-x1036', 'NUDT22A-x1037', 'NUDT22A-x1038', 'NUDT22A-x1039', 'NUDT22A-x1040', 'NUDT22A-x1041', 'NUDT22A-x1042', 'NUDT22A-x1043', 'NUDT22A-x1044', 'NUDT22A-x1045', 'NUDT22A-x1046', 'NUDT22A-x1047', 'NUDT22A-x1048', 'NUDT22A-x1049', 'NUDT22A-x1050', 'NUDT22A-x1051', 'NUDT22A-x1052', 'NUDT22A-x1053', 'NUDT22A-x1054', 'NUDT22A-x1055', 'NUDT22A-x1056', 'NUDT22A-x1057', 'NUDT22A-x1058']

Thus only files x0978, x0980, x0981, x0984, x0986, x0987, x0988, x0989, x0994, x0996, x0998, x1000, x1001, x1003, x1006, x1007 weren't rejected:

These all fall into the 13369a bracket and were re-exported in pandda. The failures do seem to correspond to where the export hasn't worked, according to the dropbox sheet.
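
The non-rejected list can be derived rather than read off by eye. A minimal sketch for the 13369a bracket (x0977 to x1008), using the rejects reported above for that bracket; `accepted_datasets` is a hypothetical helper.

```python
def accepted_datasets(all_names, rejects):
    """Return the names in all_names that are not in rejects, keeping order."""
    rejected = set(rejects)
    return [name for name in all_names if name not in rejected]


# All crystals in the 13369a bracket, x0977 to x1008
all_names = ["NUDT22A-x{:04d}".format(n) for n in range(977, 1009)]

# Rejects reported for this bracket
reject_numbers = (977, 979, 982, 983, 985, 990, 991, 992, 993, 995,
                  997, 999, 1002, 1004, 1005, 1008)
rejects = ["NUDT22A-x{:04d}".format(n) for n in reject_numbers]

accepted = accepted_datasets(all_names, rejects)
print(len(accepted), accepted)
```

This gives the same 16 datasets as listed above, from x0978 through x1007.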

Move on to looking at copying atoms, and make a plan from that?

nelse003 commented 6 years ago

All of the lower-numbered datasets (x0909 to x0976) have been rejected?

['NUDT22A-x0909', 'NUDT22A-x0910', 'NUDT22A-x0911', 'NUDT22A-x0912', 'NUDT22A-x0913', 'NUDT22A-x0914', 'NUDT22A-x0915', 'NUDT22A-x0916', 'NUDT22A-x0917', 'NUDT22A-x0918', 'NUDT22A-x0919', 'NUDT22A-x0920', 'NUDT22A-x0921', 'NUDT22A-x0922', 'NUDT22A-x0923', 'NUDT22A-x0924', 'NUDT22A-x0925', 'NUDT22A-x0926', 'NUDT22A-x0927', 'NUDT22A-x0928', 'NUDT22A-x0929', 'NUDT22A-x0930', 'NUDT22A-x0931', 'NUDT22A-x0932', 'NUDT22A-x0933', 'NUDT22A-x0934', 'NUDT22A-x0935', 'NUDT22A-x0936', 'NUDT22A-x0937', 'NUDT22A-x0938', 'NUDT22A-x0939', 'NUDT22A-x0940', 'NUDT22A-x0941', 'NUDT22A-x0942', 'NUDT22A-x0943', 'NUDT22A-x0944', 'NUDT22A-x0945', 'NUDT22A-x0946', 'NUDT22A-x0947', 'NUDT22A-x0948', 'NUDT22A-x0949', 'NUDT22A-x0950', 'NUDT22A-x0951', 'NUDT22A-x0952', 'NUDT22A-x0953', 'NUDT22A-x0954', 'NUDT22A-x0955', 'NUDT22A-x0956', 'NUDT22A-x0957', 'NUDT22A-x0958', 'NUDT22A-x0959', 'NUDT22A-x0960', 'NUDT22A-x0961', 'NUDT22A-x0962', 'NUDT22A-x0963', 'NUDT22A-x0964', 'NUDT22A-x0965', 'NUDT22A-x0966', 'NUDT22A-x0967', 'NUDT22A-x0968', 'NUDT22A-x0969', 'NUDT22A-x0970', 'NUDT22A-x0971', 'NUDT22A-x0972', 'NUDT22A-x0973', 'NUDT22A-x0974', 'NUDT22A-x0975', 'NUDT22A-x0976']

nelse003 commented 6 years ago

NUDT7 (NUOOOA0000181a):

Running exhaustive search on NUDT7 (NUOOOA0000181a) with copied atoms fails due to input files not being parsed by pandda:

Input: /dls/science/groups/i04-1/elliot-dev/Work/exhaustive_search_data/NUDT7_covalent/NUDT7A-[x1903/refine_0001/output.pdb]

Failure:

No handlers could be found for logger "exhaustive.utils.select"
Traceback (most recent call last):
  File "repeat_soaks.py", line 206, in <module>
    exhaustive(params)
  File "/dls/science/groups/i04-1/elliot-dev/Work/exhaustive_search/exhaustive/exhaustive.py", line 455, in run
    logger= logger)
  File "/dls/science/groups/i04-1/elliot-dev/Work/exhaustive_search/exhaustive/exhaustive.py", line 184, in calculate_mean_fofc
    bound_states, ground_states = process_refined_pdb_bound_ground_states(pdb, params)
  File "/dls/science/groups/i04-1/elliot-dev/Work/exhaustive_search/exhaustive/utils/select.py", line 492, in process_refined_pdb_bound_ground_states
    residue_altloc_dict = get_residue_altloc_dict(occupancy_groups)
  File "/dls/science/groups/i04-1/elliot-dev/Work/exhaustive_search/exhaustive/utils/select.py", line 82, in get_residue_altloc_dict
    residues = get_parameter_from_occupancy_groups(occupancy_groups, "resseq")
  File "/dls/science/groups/i04-1/elliot-dev/Work/exhaustive_search/exhaustive/utils/select.py", line 119, in get_parameter_from_occupancy_groups
    raise Warning("Parameter may not be recognised,as output list is empty")
Warning: Parameter may not be recognised,as output list is empty

Is this fixable using quick refine?
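
Note that the traceback ends in a `Warning` raised from `get_parameter_from_occupancy_groups`, which an `except UnboundLocalError` clause alone does not catch, so the script stops instead of rejecting the crystal. A minimal sketch of a loop that catches both and keeps the reason for each reject; `run_with_rejects` and `run_one` are hypothetical names, with `run_one` standing in for setting up params and calling exhaustive.

```python
def run_with_rejects(xtal_names, run_one):
    """Call run_one(xtal_name) for each crystal.

    Returns (done, rejects), where rejects maps crystal name to a reason string.
    """
    done = []
    rejects = {}
    for xtal_name in xtal_names:
        try:
            run_one(xtal_name)
        except (UnboundLocalError, Warning) as error:
            # Keep the exception type and message so failures can be grouped later
            rejects[xtal_name] = "{}: {}".format(type(error).__name__, error)
            continue
        done.append(xtal_name)
    return done, rejects
```

Keeping the reason alongside the crystal name makes it easier to separate missing-mtz cases from occupancy group failures when comparing against the dropbox sheet.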

nelse003 commented 6 years ago

The NUDT7A covalent files present in /dls/labxchem/data/2017/lb18145-49/processing/analysis/initial_model/[NUDT7A-x1812/...] are problematic: merging pandda-input.pdb and pandda-model.pdb leads to a model with all altconfs, as the files appear to be differently refined (atoms in different places).

Will need to try deleting atoms from the refinement with ligand instead, and then merging these?
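
A quick way to check whether a merged model has ended up with every atom in an alternate conformation is to tally the altloc column. A minimal sketch, assuming standard fixed-column PDB records (altloc in column 17 of ATOM/HETATM lines); `altloc_counts` is a hypothetical helper and not part of the exhaustive search codebase.

```python
from collections import Counter


def altloc_counts(pdb_lines):
    """Count ATOM/HETATM records per altloc identifier ("none" if blank)."""
    counts = Counter()
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) > 16:
            # Column 17 (index 16) holds the alternate location indicator
            altloc = line[16].strip() or "none"
            counts[altloc] += 1
    return counts
```

Usage would be along the lines of `altloc_counts(open(path))` on the merged file: if "none" is absent or very small compared with the lettered altlocs, the merge has likely duplicated the whole model.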

nelse003 / exhaustive_search

Repeat soaks: Data analysis #79