m2ms / fragalysis-frontend

The React, Redux frontend built by webpack
Other
1 stars 1 forks source link

Intermittent "Download Failed!" (staging/production) #1099

Open phraenquex opened 1 year ago

phraenquex commented 1 year ago

Alan is completely stumped - fails every time. (Staging fails less regularly.)

Needs promoting new logging stuff in backend code from staging to production - then troubleshooting.

alanbchristie commented 1 year ago

Downloads are failing intermittently on both "staging" and "production" stacks. The current production stack persistently fails downloading NCS1 structures.

Production suffers from crucial log not being emitted from the download task (a Celery task) so understanding why the download is not working is impossible.

Initial thoughts relating to memory have (so far) not worked. The staging stack is on an 8-core node and production is on a 124 core node atm. Why is this important? Because celery knows this and may allocate more internal memory in production than staging. This can be controlled (say set max number of celery tasks to 4) but it isn't AFAIK.

alanbchristie commented 1 year ago

Just caught the staging stack failing to download NCS1. It failed twice, but the 3rd time was successful. It may not be the same problem that we have in production but it is clear what's going wrong...

Failure no.1

2023-07-13T13:09:00+0000 create():2997 INFO # + DownloadStructures.post
2023-07-13T13:09:00+0000 create():2998 INFO # request.data={"target_name": "NCS1", "proteins": "", "event_info": false, "sigmaa_info": false, "diff_info": false, "trans_matrix_info": false, "NAN": false, "mtz_info": false, "cif_info": false, "NAN2": false, "map_info": false, "single_sdf_file": true, "sdf_info": false, "pdb_info": false, "bound_info": true, "metadata_info": true, "smiles_info": true, "static_link": false, "file_url": ""}
2023-07-13T13:09:00+0000 create():3051 INFO # Given target_name "NCS1"
2023-07-13T13:09:00+0000 create():3072 INFO # No proteins supplied
2023-07-13T13:09:00+0000 create():3087 INFO # Collected 0 proteins <QuerySet []>

Failure no.2

2023-07-13T13:09:11+0000 create():2997 INFO # + DownloadStructures.post
2023-07-13T13:09:11+0000 create():2998 INFO # request.data={"target_name": "NCS1", "proteins": "", "event_info": false, "sigmaa_info": false, "diff_info": false, "trans_matrix_info": false, "NAN": false, "mtz_info": false, "cif_info": false, "NAN2": false, "map_info": false, "single_sdf_file": true, "sdf_info": false, "pdb_info": false, "bound_info": true, "metadata_info": true, "smiles_info": true, "static_link": false, "file_url": ""}
2023-07-13T13:09:11+0000 create():3051 INFO # Given target_name "NCS1"
2023-07-13T13:09:11+0000 create():3072 INFO # No proteins supplied
2023-07-13T13:09:11+0000 create():3087 INFO # Collected 0 proteins <QuerySet []>

Success

2023-07-13T13:10:09+0000 create():2997 INFO # + DownloadStructures.post
2023-07-13T13:10:09+0000 create():2998 INFO # request.data={"target_name": "NCS1", "proteins": "", "event_info": false, "sigmaa_info": false, "diff_info": false, "trans_matrix_info": false, "NAN": false, "mtz_info": false, "cif_info": false, "NAN2": false, "map_info": false, "single_sdf_file": true, "sdf_info": false, "pdb_info": false, "bound_info": true, "metadata_info": true, "smiles_info": true, "static_link": false, "file_url": ""}
2023-07-13T13:10:09+0000 create():3051 INFO # Given target_name "NCS1"
2023-07-13T13:10:09+0000 create():3072 INFO # No proteins supplied
2023-07-13T13:10:09+0000 create():3087 INFO # Collected 129 proteins <QuerySet [{'id': 93264, 'code': 'NCS1-x0110_0A:Z1255459547', 'target_id_id': 87, 'apo_holo': True, 'prot_type': 'AP', 'pdb_info': 'pdbs/NCS1-x0110_0A_apo.pdb', 'bound_info': 'bound/NCS1-x0110_0A_bound.pdb', 'cif_info': '', 'mtz_info': '', 'map_info': '', 'sigmaa_info': '', 'diff_info': '', 'event_info': '', 'trans_matrix_info': 'trans/NCS1-x0110_0A_transform.json', 'pdb_header_info': '', 'apo_desolve_info': 'pdbs/NCS1-x0110_0A_apo-desolv.pdb', 'aligned': None, 'aligned_to_id': None, 'has_eds': None}, {'id': 93277, 'code': 'NCS1-x0128_0A:Z1272480091', 'target_id_id': 87, 'apo_holo': True, 'prot_type': 'AP', 'pdb_info': 'pdbs/NCS1-x0128_0A_apo.pdb', 'bound_info': 'bound/NCS1-x0128_0A_bound.pdb', 'cif_info': '', 'mtz_info': '', 'map_info': '', 'sigmaa_info': '', 'diff_info': '', 'event_info': '', 'trans_matrix_info': 'trans/NCS1-x0128_0A_transform.json', 'pdb_header_info': '', 'apo_desolve_info': 'pdbs/NCS1-x0128_0A_apo-desolv.pdb', 'aligned': None, 'aligned_to_id': None, 'has_eds': None}, {'id': 93265, 'code': 'NCS1-x0110_0B:Z1255459547', 'target_id_id': 87, 'apo_holo': True, 'prot_type': 'AP', 'pdb_info': 'pdbs/NCS1-x0110_0B_apo.pdb', 'bound_info': 'bound/NCS1-x0110_0B_bound.pdb', 'cif_info': '', 'mtz_info': '', 'map_info': '', 'sigmaa_info': '', 'diff_info': '', 'event_info': '', 'trans_matrix_info': 'trans/NCS1-x0110_0B_transform.json', 'pdb_header_info': '', 'apo_desolve_info': 'pdbs/NCS1-x0110_0B_apo-desolv.pdb', 'aligned': None, 'aligned_to_id': None, 'has_eds': None}, {'id': 93266, 'code': 'NCS1-x0110_1B:Z1255459547', 'target_id_id': 87, 'apo_holo': True, 'prot_type': 'AP', 'pdb_info': 'pdbs/NCS1-x0110_1B_apo.pdb', 'bound_info': 'bound/NCS1-x0110_1B_bound.pdb', 'cif_info': '', 'mtz_info': '', 'map_info': '', 'sigmaa_info': '', 'diff_info': '', 'event_info': '', 'trans_matrix_info': 'trans/NCS1-x0110_1B_transform.json', 'pdb_header_info': '', 'apo_desolve_info': 'pdbs/NCS1-x0110_1B_apo-desolv.pdb', 'aligned': None, 'aligned_to_id': None, 'has_eds': None}, {'id': 93267, 'code': 'NCS1-x0110_2B:Z1255459547', 'target_id_id': 87, 'apo_holo': True, 'prot_type': 'AP', 'pdb_info': 'pdbs/NCS1-x0110_2B_apo.pdb', 'bound_info': 'bound/NCS1-x0110_2B_bound.pdb', 'cif_info': '', 'mtz_info': '', 'map_info': '', 'sigmaa_info': '', 'diff_info': '', 'event_info': '', 'trans_matrix_info': 'trans/NCS1-x0110_2B_transform.json', 'pdb_header_info': '', 'apo_desolve_info': 'pdbs/NCS1-x0110_2B_apo-desolv.pdb', 'aligned': None, 'aligned_to_id': None, 'has_eds': None}, {'id': 93278, 'code': 'NCS1-x0128_0B:Z1272480091', 'target_id_id': 87, 'apo_holo': True, 'prot_type': 'AP', 'pdb_info': 'pdbs/NCS1-x0128_0B_apo.pdb', 'bound_info': 'bound/NCS1-x0128_0B_bound.pdb', 'cif_info': '', 'mtz_info': '', 'map_info': '', 'sigmaa_info': '', 'diff_info': '', 'event_info': '', 'trans_matrix_info': 'trans/NCS1-x0128_0B_transform.json', 'pdb_header_info': '', 'apo_desolve_info': 'pdbs/NCS1-x0128_0B_apo-desolv.pdb', 'aligned': None, 'aligned_to_id': None, 'has_eds': None}, {'id': 93268, 'code': 'NCS1-x0110_3B:Z1255459547', 'target_id_id': 87, 'apo_holo': True, 'prot_type': 'AP', 'pdb_info': 'pdbs/NCS1-x0110_3B_apo.pdb', 'bound_info': 'bound/NCS1-x0110_3B_bound.pdb', 'cif_info': '', 'mtz_info': '', 'map_info': '', 'sigmaa_info': '', 'diff_info': '', 'event_info': '', 'trans_matrix_info': 'trans/NCS1-x0110_3B_transform.json', 'pdb_header_info': '', 'apo_desolve_info': 'pdbs/NCS1-x0110_3B_apo-desolv.pdb', 'aligned': None, 'aligned_to_id': None, 'has_eds': None}, {'id': 93269, 'code': 'NCS1-x0115_0A:Z1262549981', 'target_id_id': 87, 'apo_holo': True, 'prot_type': 'AP', 'pdb_info': 'pdbs/NCS1-x0115_0A_apo.pdb', 'bound_info': 'bound/NCS1-x0115_0A_bound.pdb', 'cif_info': '', 'mtz_info': '', 'map_info': '', 'sigmaa_info': '', 'diff_info': '', 'event_info': '', 'trans_matrix_info': 'trans/NCS1-x0115_0A_transform.json', 'pdb_header_info': '', 'apo_desolve_info': 'pdbs/NCS1-x0115_0A_apo-desolv.pdb', 'aligned': None, 'aligned_to_id': None, 'has_eds': None}, {'id': 93276, 'code': 'NCS1-x0125_0B:Z1269220427', 'target_id_id': 87, 'apo_holo': True, 'prot_type': 'AP', 'pdb_info': 'pdbs/NCS1-x0125_0B_apo.pdb', 'bound_info': 'bound/NCS1-x0125_0B_bound.pdb', 'cif_info': '', 'mtz_info': '', 'map_info': '', 'sigmaa_info': '', 'diff_info': '', 'event_info': '', 'trans_matrix_info': 'trans/NCS1-x0125_0B_transform.json', 'pdb_header_info': '', 'apo_desolve_info': 'pdbs/NCS1-x0125_0B_apo-desolv.pdb', 'aligned': None, 'aligned_to_id': None, 'has_eds': None}, {'id': 93270, 'code': 'NCS1-x0115_0B:Z1262549981', 'target_id_id': 87, 'apo_holo': True, 'prot_type': 'AP', 'pdb_info': 'pdbs/NCS1-x0115_0B_apo.pdb', 'bound_info': 'bound/NCS1-x0115_0B_bound.pdb', 'cif_info': '', 'mtz_info': '', 'map_info': '', 'sigmaa_info': '', 'diff_info': '', 'event_info': '', 'trans_matrix_info': 'trans/NCS1-x0115_0B_transform.json', 'pdb_header_info': '', 'apo_desolve_info': 'pdbs/NCS1-x0115_0B_apo-desolv.pdb', 'aligned': None, 'aligned_to_id': None, 'has_eds': None}, {'id': 93279, 'code': 'NCS1-x0132_0A:Z1273312142', 'target_id_id': 87, 'apo_holo': True, 'prot_type': 'AP', 'pdb_info': 'pdbs/NCS1-x0132_0A_apo.pdb', 'bound_info': 'bound/NCS1-x0132_0A_bound.pdb', 'cif_info': '', 'mtz_info': '', 'map_info': '', 'sigmaa_info': '', 'diff_info': '', 'event_info': '', 'trans_matrix_info': 'trans/NCS1-x0132_0A_transform.json', 'pdb_header_info': '', 'apo_desolve_info': 'pdbs/NCS1-x0132_0A_apo-desolv.pdb', 'aligned': None, 'aligned_to_id': None, 'has_eds': None}, {'id': 93271, 'code': 'NCS1-x0117_0B:Z1262628644', 'target_id_id': 87, 'apo_holo': True, 'prot_type': 'AP', 'pdb_info': 'pdbs/NCS1-x0117_0B_apo.pdb', 'bound_info': 'bound/NCS1-x0117_0B_bound.pdb', 'cif_info': '', 'mtz_info': '', 'map_info': '', 'sigmaa_info': '', 'diff_info': '', 'event_info': '', 'trans_matrix_info': 'trans/NCS1-x0117_0B_transform.json', 'pdb_header_info': '', 'apo_desolve_info': 'pdbs/NCS1-x0117_0B_apo-desolv.pdb', 'aligned': None, 'aligned_to_id': None, 'has_eds': None}, {'id': 93288, 'code': 'NCS1-x0149_0A:Z133716556', 'target_id_id': 87, 'apo_holo': True, 'prot_type': 'AP', 'pdb_info': 'pdbs/NCS1-x0149_0A_apo.pdb', 'bound_info': 'bound/NCS1-x0149_0A_bound.pdb', 'cif_info': '', 'mtz_info': '', 'map_info': '', 'sigmaa_info': '', 'diff_info': '', 'event_info': '', 'trans_matrix_info': 'trans/NCS1-x0149_0A_transform.json', 'pdb_header_info': '', 'apo_desolve_info': 'pdbs/NCS1-x0149_0A_apo-desolv.pdb', 'aligned': None, 'aligned_to_id': None, 'has_eds': None}, {'id': 93275, 'code': 'NCS1-x0125_0A:Z1269220427', 'target_id_id': 87, 'apo_holo': True, 'prot_type': 'AP', 'pdb_info': 'pdbs/NCS1-x0125_0A_apo.pdb', 'bound_info': 'bound/NCS1-x0125_0A_bound.pdb', 'cif_info': '', 'mtz_info': '', 'map_info': '', 'sigmaa_info': '', 'diff_info': '', 'event_info': '', 'trans_matrix_info': 'trans/NCS1-x0125_0A_transform.json', 'pdb_header_info': '', 'apo_desolve_info': 'pdbs/NCS1-x0125_0A_apo-desolv.pdb', 'aligned': None, 'aligned_to_id': None, 'has_eds': None}, {'id': 93262, 'code': 'NCS1-x0086_0A:Z1203730757', 'target_id_id': 87, 'apo_holo': True, 'prot_type': 'AP', 'pdb_info': 'pdbs/NCS1-x0086_0A_apo.pdb', 'bound_info': 'bound/NCS1-x0086_0A_bound.pdb', 'cif_info': '', 'mtz_info': '', 'map_info': '', 'sigmaa_info': '', 'diff_info': '', 'event_info': '', 'trans_matrix_info': 'trans/NCS1-x0086_0A_transform.json', 'pdb_header_info': '', 'apo_desolve_info': 'pdbs/NCS1-x0086_0A_apo-desolv.pdb', 'aligned': None, 'aligned_to_id': None, 'has_eds': None}, {'id': 93272, 'code': 'NCS1-x0119_0B:Z1265813904', 'target_id_id': 87, 'apo_holo': True, 'prot_type': 'AP', 'pdb_info': 'pdbs/NCS1-x0119_0B_apo.pdb', 'bound_info': 'bound/NCS1-x0119_0B_bound.pdb', 'cif_info': '', 'mtz_info': '', 'map_info': '', 'sigmaa_info': '', 'diff_info': '', 'event_info': '', 'trans_matrix_info': 'trans/NCS1-x0119_0B_transform.json', 'pdb_header_info': '', 'apo_desolve_info': 'pdbs/NCS1-x0119_0B_apo-desolv.pdb', 'aligned': None, 'aligned_to_id': None, 'has_eds': None}, {'id': 93291, 'code': 'NCS1-x0149_1B:Z133716556', 'target_id_id': 87, 'apo_holo': True, 'prot_type': 'AP', 'pdb_info': 'pdbs/NCS1-x0149_1B_apo.pdb', 'bound_info': 'bound/NCS1-x0149_1B_bound.pdb', 'cif_info': '', 'mtz_info': '', 'map_info': '', 'sigmaa_info': '', 'diff_info': '', 'event_info': '', 'trans_matrix_info': 'trans/NCS1-x0149_1B_transform.json', 'pdb_header_info': '', 'apo_desolve_info': 'pdbs/NCS1-x0149_1B_apo-desolv.pdb', 'aligned': None, 'aligned_to_id': None, 'has_eds': None}, {'id': 93285, 'code': 'NCS1-x0145_1A:Z1331830630', 'target_id_id': 87, 'apo_holo': True, 'prot_type': 'AP', 'pdb_info': 'pdbs/NCS1-x0145_1A_apo.pdb', 'bound_info': 'bound/NCS1-x0145_1A_bound.pdb', 'cif_info': '', 'mtz_info': '', 'map_info': '', 'sigmaa_info': '', 'diff_info': '', 'event_info': '', 'trans_matrix_info': 'trans/NCS1-x0145_1A_transform.json', 'pdb_header_info': '', 'apo_desolve_info': 'pdbs/NCS1-x0145_1A_apo-desolv.pdb', 'aligned': None, 'aligned_to_id': None, 'has_eds': None}, {'id': 93280, 'code': 'NCS1-x0132_0B:Z1273312142', 'target_id_id': 87, 'apo_holo': True, 'prot_type': 'AP', 'pdb_info': 'pdbs/NCS1-x0132_0B_apo.pdb', 'bound_info': 'bound/NCS1-x0132_0B_bound.pdb', 'cif_info': '', 'mtz_info': '', 'map_info': '', 'sigmaa_info': '', 'diff_info': '', 'event_info': '', 'trans_matrix_info': 'trans/NCS1-x0132_0B_transform.json', 'pdb_header_info': '', 'apo_desolve_info': 'pdbs/NCS1-x0132_0B_apo-desolv.pdb', 'aligned': None, 'aligned_to_id': None, 'has_eds': None}, {'id': 93273, 'code': 'NCS1-x0119_1B:Z1265813904', 'target_id_id': 87, 'apo_holo': True, 'prot_type': 'AP', 'pdb_info': 'pdbs/NCS1-x0119_1B_apo.pdb', 'bound_info': 'bound/NCS1-x0119_1B_bound.pdb', 'cif_info': '', 'mtz_info': '', 'map_info': '', 'sigmaa_info': '', 'diff_info': '', 'event_info': '', 'trans_matrix_info': 'trans/NCS1-x0119_1B_transform.json', 'pdb_header_info': '', 'apo_desolve_info': 'pdbs/NCS1-x0119_1B_apo-desolv.pdb', 'aligned': None, 'aligned_to_id': None, 'has_eds': None}, '...(remaining elements truncated)...']>
2023-07-13T13:10:09+0000 check_download_links():711 INFO # + check_download_links(NCS1)
2023-07-13T13:10:09+0000 _create_structures_dict():559 INFO # + _create_structures_dict
2023-07-13T13:10:09+0000 _create_structures_dict():566 INFO # Expecting 129 molecules from 129 proteins
2023-07-13T13:10:09+0000 _create_structures_dict():618 INFO # 129 SD files collected
2023-07-13T13:10:09+0000 _create_structures_zip():455 INFO # + _create_structures_zip(NCS1)
2023-07-13T13:10:09+0000 _create_structures_zip():456 INFO # file_url="/code/media/downloads/2bcdc4c1-42b5-4c02-9524-7abc511f0023/NCS1.zip"
2023-07-13T13:10:09+0000 _create_structures_zip():457 INFO # single_sdf_file="True"
2023-07-13T13:10:09+0000 _create_structures_zip():458 INFO # sdf_files={'sdfs/NCS1-x0071_0B.sdf': 'NCS1-x0071_0B:Z1198177230', 'sdfs/NCS1-x0086_0A.sdf': 'NCS1-x0086_0A:Z1203730757', 'sdfs/NCS1-x0086_0B.sdf': 'NCS1-x0086_0B:Z1203730757', 'sdfs/NCS1-x0110_0A.sdf': 'NCS1-x0110_0A:Z1255459547', 'sdfs/NCS1-x0110_0B.sdf': 'NCS1-x0110_0B:Z1255459547', 'sdfs/NCS1-x0110_1B.sdf': 'NCS1-x0110_1B:Z1255459547', 'sdfs/NCS1-x0110_2B.sdf': 'NCS1-x0110_2B:Z1255459547', 'sdfs/NCS1-x0110_3B.sdf': 'NCS1-x0110_3B:Z1255459547', 'sdfs/NCS1-x0115_0A.sdf': 'NCS1-x0115_0A:Z1262549981', 'sdfs/NCS1-x0115_0B.sdf': 'NCS1-x0115_0B:Z1262549981', 'sdfs/NCS1-x0117_0B.sdf': 'NCS1-x0117_0B:Z1262628644', 'sdfs/NCS1-x0119_0B.sdf': 'NCS1-x0119_0B:Z1265813904', 'sdfs/NCS1-x0119_1B.sdf': 'NCS1-x0119_1B:Z1265813904', 'sdfs/NCS1-x0119_2B.sdf': 'NCS1-x0119_2B:Z1265813904', 'sdfs/NCS1-x0125_0A.sdf': 'NCS1-x0125_0A:Z1269220427', 'sdfs/NCS1-x0125_0B.sdf': 'NCS1-x0125_0B:Z1269220427', 'sdfs/NCS1-x0128_0A.sdf': 'NCS1-x0128_0A:Z1272480091', 'sdfs/NCS1-x0128_0B.sdf': 'NCS1-x0128_0B:Z1272480091', 'sdfs/NCS1-x0132_0A.sdf': 'NCS1-x0132_0A:Z1273312142', 'sdfs/NCS1-x0132_0B.sdf': 'NCS1-x0132_0B:Z1273312142', 'sdfs/NCS1-x0132_1A.sdf': 'NCS1-x0132_1A:Z1273312142', 'sdfs/NCS1-x0132_2A.sdf': 'NCS1-x0132_2A:Z1273312142', 'sdfs/NCS1-x0145_0A.sdf': 'NCS1-x0145_0A:Z1331830630', 'sdfs/NCS1-x0145_0B.sdf': 'NCS1-x0145_0B:Z1331830630', 'sdfs/NCS1-x0145_1A.sdf': 'NCS1-x0145_1A:Z1331830630', 'sdfs/NCS1-x0145_1B.sdf': 'NCS1-x0145_1B:Z1331830630', 'sdfs/NCS1-x0145_2B.sdf': 'NCS1-x0145_2B:Z1331830630', 'sdfs/NCS1-x0149_0A.sdf': 'NCS1-x0149_0A:Z133716556', 'sdfs/NCS1-x0149_0B.sdf': 'NCS1-x0149_0B:Z133716556', 'sdfs/NCS1-x0149_1A.sdf': 'NCS1-x0149_1A:Z133716556', 'sdfs/NCS1-x0149_1B.sdf': 'NCS1-x0149_1B:Z133716556', 'sdfs/NCS1-x0149_2B.sdf': 'NCS1-x0149_2B:Z133716556', 'sdfs/NCS1-x0178_0B.sdf': 'NCS1-x0178_0B:Z1497321453', 'sdfs/NCS1-x0181_0A.sdf': 'NCS1-x0181_0A:Z1509257513', 'sdfs/NCS1-x0181_0B.sdf': 'NCS1-x0181_0B:Z1509257513', 'sdfs/NCS1-x0181_1A.sdf': 'NCS1-x0181_1A:Z1509257513', 'sdfs/NCS1-x0181_1B.sdf': 'NCS1-x0181_1B:Z1509257513', 'sdfs/NCS1-x0196_0B.sdf': 'NCS1-x0196_0B:Z1601639621', 'sdfs/NCS1-x0208_0B.sdf': 'NCS1-x0208_0B:Z1689442171', 'sdfs/NCS1-x0222_0A.sdf': 'NCS1-x0222_0A:Z1796014543', 'sdfs/NCS1-x0222_0B.sdf': 'NCS1-x0222_0B:Z1796014543', 'sdfs/NCS1-x0238_0A.sdf': 'NCS1-x0238_0A:Z1929757385', 'sdfs/NCS1-x0238_0B.sdf': 'NCS1-x0238_0B:Z1929757385', 'sdfs/NCS1-x0238_1A.sdf': 'NCS1-x0238_1A:Z1929757385', 'sdfs/NCS1-x0252_0B.sdf': 'NCS1-x0252_0B:Z198195770', 'sdfs/NCS1-x0252_1B.sdf': 'NCS1-x0252_1B:Z198195770', 'sdfs/NCS1-x0252_2B.sdf': 'NCS1-x0252_2B:Z198195770', 'sdfs/NCS1-x0267_0A.sdf': 'NCS1-x0267_0A:Z212122838', 'sdfs/NCS1-x0267_0B.sdf': 'NCS1-x0267_0B:Z212122838', 'sdfs/NCS1-x0275_0A.sdf': 'NCS1-x0275_0A:Z220996120', 'sdfs/NCS1-x0275_0B.sdf': 'NCS1-x0275_0B:Z220996120', 'sdfs/NCS1-x0288_0A.sdf': 'NCS1-x0288_0A:Z235361235', 'sdfs/NCS1-x0288_0B.sdf': 'NCS1-x0288_0B:Z235361235', 'sdfs/NCS1-x0288_1B.sdf': 'NCS1-x0288_1B:Z235361235', 'sdfs/NCS1-x0292_0A.sdf': 'NCS1-x0292_0A:Z239127534', 'sdfs/NCS1-x0292_0B.sdf': 'NCS1-x0292_0B:Z239127534', 'sdfs/NCS1-x0292_1A.sdf': 'NCS1-x0292_1A:Z239127534', 'sdfs/NCS1-x0292_1B.sdf': 'NCS1-x0292_1B:Z239127534', 'sdfs/NCS1-x0292_2B.sdf': 'NCS1-x0292_2B:Z239127534', 'sdfs/NCS1-x0311_0A.sdf': 'NCS1-x0311_0A:Z26333448', 'sdfs/NCS1-x0311_0B.sdf': 'NCS1-x0311_0B:Z26333448', 'sdfs/NCS1-x0311_1A.sdf': 'NCS1-x0311_1A:Z26333448', 'sdfs/NCS1-x0342_0B.sdf': 'NCS1-x0342_0B:Z285141302', 'sdfs/NCS1-x0344_0B.sdf': 'NCS1-x0344_0B:Z285642082', 'sdfs/NCS1-x0355_0A.sdf': 'NCS1-x0355_0A:Z29692148', 'sdfs/NCS1-x0356_0A.sdf': 'NCS1-x0356_0A:Z300081412', 'sdfs/NCS1-x0356_0B.sdf': 'NCS1-x0356_0B:Z300081412', 'sdfs/NCS1-x0385_0A.sdf': 'NCS1-x0385_0A:Z31723394', 'sdfs/NCS1-x0385_0B.sdf': 'NCS1-x0385_0B:Z31723394', 'sdfs/NCS1-x0385_1A.sdf': 'NCS1-x0385_1A:Z31723394', 'sdfs/NCS1-x0385_1B.sdf': 'NCS1-x0385_1B:Z31723394', 'sdfs/NCS1-x0385_2B.sdf': 'NCS1-x0385_2B:Z31723394', 'sdfs/NCS1-x0385_3B.sdf': 'NCS1-x0385_3B:Z31723394', 'sdfs/NCS1-x0390_0B.sdf': 'NCS1-x0390_0B:Z32014663', 'sdfs/NCS1-x0393_0A.sdf': 'NCS1-x0393_0A:Z32327641', 'sdfs/NCS1-x0393_0B.sdf': 'NCS1-x0393_0B:Z32327641', 'sdfs/NCS1-x0393_1B.sdf': 'NCS1-x0393_1B:Z32327641', 'sdfs/NCS1-x0394_0B.sdf': 'NCS1-x0394_0B:Z32367954', 'sdfs/NCS1-x0409_0A.sdf': 'NCS1-x0409_0A:Z352296220', 'sdfs/NCS1-x0409_0B.sdf': 'NCS1-x0409_0B:Z352296220', 'sdfs/NCS1-x0412_0A.sdf': 'NCS1-x0412_0A:Z355202286', 'sdfs/NCS1-x0412_0B.sdf': 'NCS1-x0412_0B:Z355202286', 'sdfs/NCS1-x0412_1A.sdf': 'NCS1-x0412_1A:Z355202286', 'sdfs/NCS1-x0412_2A.sdf': 'NCS1-x0412_2A:Z355202286', 'sdfs/NCS1-x0469_0B.sdf': 'NCS1-x0469_0B:Z45527714', 'sdfs/NCS1-x0485_0A.sdf': 'NCS1-x0485_0A:Z53834613', 'sdfs/NCS1-x0488_0B.sdf': 'NCS1-x0488_0B:Z54226006', 'sdfs/NCS1-x0500_0A.sdf': 'NCS1-x0500_0A:Z55290386', 'sdfs/NCS1-x0500_0B.sdf': 'NCS1-x0500_0B:Z55290386', 'sdfs/NCS1-x0500_1A.sdf': 'NCS1-x0500_1A:Z55290386', 'sdfs/NCS1-x0503_0B.sdf': 'NCS1-x0503_0B:Z558511008', 'sdfs/NCS1-x0503_1B.sdf': 'NCS1-x0503_1B:Z558511008', 'sdfs/NCS1-x0503_2B.sdf': 'NCS1-x0503_2B:Z558511008', 'sdfs/NCS1-x0503_3B.sdf': 'NCS1-x0503_3B:Z558511008', 'sdfs/NCS1-x0562_0B.sdf': 'NCS1-x0562_0B:Z729352906', 'sdfs/NCS1-x0562_1B.sdf': 'NCS1-x0562_1B:Z729352906', 'sdfs/NCS1-x0562_2B.sdf': 'NCS1-x0562_2B:Z729352906', 'sdfs/NCS1-x0566_0B.sdf': 'NCS1-x0566_0B:Z740611958', 'sdfs/NCS1-x0566_1B.sdf': 'NCS1-x0566_1B:Z740611958', 'sdfs/NCS1-x0572_0B.sdf': 'NCS1-x0572_0B:Z758198920', 'sdfs/NCS1-x0573_0B.sdf': 'NCS1-x0573_0B:Z760031264', 'sdfs/NCS1-x0573_1B.sdf': 'NCS1-x0573_1B:Z760031264', 'sdfs/NCS1-x0580_0A.sdf': 'NCS1-x0580_0A:Z802824098', 'sdfs/NCS1-x0580_1A.sdf': 'NCS1-x0580_1A:Z802824098', 'sdfs/NCS1-x0581_0A.sdf': 'NCS1-x0581_0A:Z815437698', 'sdfs/NCS1-x0581_0B.sdf': 'NCS1-x0581_0B:Z815437698', 'sdfs/NCS1-x0592_0B.sdf': 'NCS1-x0592_0B:Z86416893', 'sdfs/NCS1-x0603_0A.sdf': 'NCS1-x0603_0A:Z975817026', 'sdfs/NCS1-x0603_0B.sdf': 'NCS1-x0603_0B:Z975817026', 'sdfs/NCS1-x0653_0A.sdf': 'NCS1-x0653_0A:POB0008', 'sdfs/NCS1-x0653_0B.sdf': 'NCS1-x0653_0B:POB0008', 'sdfs/NCS1-x0653_1A.sdf': 'NCS1-x0653_1A:POB0008', 'sdfs/NCS1-x0653_2A.sdf': 'NCS1-x0653_2A:POB0008', 'sdfs/NCS1-x0667_0A.sdf': 'NCS1-x0667_0A:POB0123', 'sdfs/NCS1-x0715_0A.sdf': 'NCS1-x0715_0A:Z166605460', 'sdfs/NCS1-x0715_0B.sdf': 'NCS1-x0715_0B:Z166605460', 'sdfs/NCS1-x0715_1B.sdf': 'NCS1-x0715_1B:Z166605460', 'sdfs/NCS1-x0729_0B.sdf': 'NCS1-x0729_0B:Z3241250482', 'sdfs/NCS1-x0759_0B.sdf': 'NCS1-x0759_0B:Z1333717631', 'sdfs/NCS1-x0780_0A.sdf': 'NCS1-x0780_0A:FDA2', 'sdfs/NCS1-x0780_0B.sdf': 'NCS1-x0780_0B:FDA2', 'sdfs/NCS1-x0827_0A.sdf': 'NCS1-x0827_0A:POB0029', 'sdfs/NCS1-x0827_0B.sdf': 'NCS1-x0827_0B:POB0029', 'sdfs/NCS1-x0914_0B.sdf': 'NCS1-x0914_0B:POB0063', 'sdfs/NCS1-x0916_0A.sdf': 'NCS1-x0916_0A:POB0101', 'sdfs/NCS1-x0916_0B.sdf': 'NCS1-x0916_0B:POB0101', 'sdfs/NCS1-x0929_0A.sdf': 'NCS1-x0929_0A:POB0164', 'sdfs/NCS1-x0929_0B.sdf': 'NCS1-x0929_0B:POB0164', 'sdfs/NCS1-x0929_1A.sdf': 'NCS1-x0929_1A:POB0164'}
2023-07-13T13:10:09+0000 _create_structures_zip():463 INFO # Creating download path (/code/media/downloads/2bcdc4c1-42b5-4c02-9524-7abc511f0023)
2023-07-13T13:10:10+0000 _create_structures_zip():478 INFO # combined_sdf_file=/code/media/downloads/2bcdc4c1-42b5-4c02-9524-7abc511f0023/NCS1_combined.sdf
2023-07-13T13:10:10+0000 _molecule_files_zip():294 INFO # len(molecules.sd_files)=129
2023-07-13T13:10:12+0000 _create_structures_zip():499 INFO # Adding combined_sdf_file "/code/media/downloads/2bcdc4c1-42b5-4c02-9524-7abc511f0023/NCS1_combined.sdf"...
2023-07-13T13:10:12+0000 _smiles_files_zip():325 INFO # Creating SMILES file "/code/media/downloads/2bcdc4c1-42b5-4c02-9524-7abc511f0023/smiles.smi"...
2023-07-13T13:10:12+0000 _smiles_files_zip():334 INFO # Added 56 SMILES
2023-07-13T13:10:12+0000 _extra_files_zip():353 INFO # Processing extra files (/code/media/targets/NCS1/extra_files)...
2023-07-13T13:10:12+0000 _extra_files_zip():365 INFO # Directory does not exist (/code/media/targets/NCS1/extra_files)...
2023-07-13T13:10:12+0000 _extra_files_zip():369 INFO # No extra files found
2023-07-13T13:10:12+0000 _document_file_zip():396 INFO # Creating documentation...
2023-07-13T13:10:13+0000 check_download_links():779 INFO # Link saved (/code/media/downloads/2bcdc4c1-42b5-4c02-9524-7abc511f0023/NCS1.zip)
2023-07-13T13:10:13+0000 list():2965 INFO # zip_file: True

Summary

The failures are caused by the DB not returning proteins for the query: -

        if len(proteins_list) > 0:
            proteins = []
            # Filter by protein codes
            for code_first_part in proteins_list:
                prot = Protein.objects.filter(code__contains=code_first_part).values()
                if prot.exists():
                    proteins.append(prot.first())
                else:
                    logger.warning('Could not find protein code "%s"', code_first_part)
        else:
            # If no protein codes supplied then return the complete list
            proteins = Protein.objects.filter(target_id=target.id).values()
        logger.info('Collected %s proteins %s', len(proteins), proteins)

Essentially, in my failures, the following code...

proteins = Protein.objects.filter(target_id=target.id).values()

...must be returning an empty list!

alanbchristie commented 1 year ago

A rework of the protein collection (fewer conditional tests in the logic) and enhanced logging have been deployed to staging in an attempt to identify the cause of the empty Protein list.

Sadly, since redeploying the new code I have not been able to reproduce the problem.

It might be best to simply merge these investigative changes to staging (they do offer minor logic improvements and better logging in the event of a failure) and then immediately promote the backend to production to either trap the fault or eliminate it.

alanbchristie commented 1 year ago

The latest backend code, with improved logging in the download logic, has now been deployed to Production (2023.07.1) but, as observed in Staging, the Download Failed! error downloading NCS1 is no longer happening.

So we have improved logging (hopefully), and now have to wait until the fail re-emerges.


Current status: Cannot reproduce the fault

phraenquex commented 1 year ago

So the existince of logging makes it work? Wouldn't be the first time that happens - in python scripts, I always assumed it's because the logging calls slow things down just enough to stop some kind of pile-up / race-condition thing.

Anyway, I'll take that as a win - @alanbchristie we can move this to the production lane, right? Which I hereby do.

alanbchristie commented 1 year ago

It might be the simple fact of restarting the stack? (but i did try that).

As well as introducting more logging I did reduce the number of conditional checks in the code. This should not have had any effect but there was some ambiguity about the state of one particular variable in the original code.

By reducing the number of if statements I've simplified the code but it shouldn't have have any effect on the execution (but it's not the first time I've found a bug in a compiler or interpreter, it's just that they are very rare). But with interactions with databases, who knows what effect a simple change can have?

@Waztom has suggested there are other failures - if someone knows of a failure in production or staging then I can inspect it using the (hopefully) improved log to understand why.

Nothing's fixed - but suggest we raise a new ticket and start a fresh investigation if it fails gain.

alanbchristie commented 1 year ago

Reported errors with XX01ZVNS2B also do not occur now.

I’m worried it’s a weird celery, memory, or networking/DB anomaly that, if true, may show up again as the stack continues to be used. The fact that it “eventually works” also points to something like this.

I wonder whether switching to a later Django will help? We’re on 3.1.14 and the latest 3.x release is 3.2.20 (a long-term supported release).

We're also using Python 3.7 (now unsupported), whereas Python 3.11 is the latest major release. Historically, the Python we use is tied to the base RDKit container the stack is built with.