szu-ljh2020 / MARS

Code is not working #1

Closed thegodone closed 2 months ago

thegodone commented 4 months ago

Can you provide a full environment setup, please?

I see several RDKit errors that break the code. I fixed a few issues so that it runs as far as the pickle-loading step on Linux (see https://github.com/thegodone/MARS). The following is from the prepare_data step on Ubuntu Linux:

  0%|          | 0/4983 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "prepare_mol_graph.py", line 748, in <module>
    dataset_test.encode_transformation(dataset_train.motif_vocab)
  File "prepare_mol_graph.py", line 668, in encode_transformation
    gnn_data, gnn_data_synthon = self.get(idx)
  File "prepare_mol_graph.py", line 393, in get
    precessed_rxn = pickle.load(f)
_pickle.UnpicklingError: state is not a dictionary
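
One way to narrow this down is to check which processed files fail to load in the current environment, since the traceback only shows the first failure. The following is a minimal sketch, not code from the repository: the function name `find_unreadable_pickles` and the `*.pkl` pattern are assumptions, and the directory to scan has to be whatever folder `prepare_mol_graph.py` writes its processed reactions to. Only run it on files you created yourself, because `pickle.load` can execute arbitrary code.

```python
import pickle
from pathlib import Path


def find_unreadable_pickles(directory, pattern="*.pkl"):
    """Return (path, error message) pairs for files that fail to unpickle.

    Hypothetical helper: `directory` and `pattern` are assumptions about
    where and how the processed reaction files are stored.
    """
    failures = []
    for path in sorted(Path(directory).glob(pattern)):
        try:
            with open(path, "rb") as f:
                pickle.load(f)
        except Exception as exc:  # report every failure instead of stopping
            failures.append((path, f"{type(exc).__name__}: {exc}"))
    return failures
```

If every file fails with the same message, regenerating the processed files in the same environment that later reads them is a reasonable first thing to try.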

Another frequent error comes from kekulization. Why do we need to kekulize?

38931it [10:45, 63.12it/s]
[19:11:20] Can't kekulize mol.  Unkekulized atoms: 8 9
can kekule after align_kekule_pairs, skip

On an arm64 M3 Mac, I get a different error:

39835 19 4876 121
max_attachments in encode_transformation: 4
 41%|████      | 2043/4983 [00:01<00:01, 1698.95it/s]
motif_vocab does not have motif [Cl:1][C:2]1=[C:3][C:4]=[C:1005][C:6]=[C:7]1
 44%|████      | 2215/4983 [00:01<00:01, 1703.10it/s]
motif_vocab does not have motif [N:2]1([C:1001])[C:3][C:4][C:5][C:6][C:7]1
 48%|████      | 2386/4983 [00:01<00:01, 1698.70it/s]
motif_vocab does not have motif [Br:2][S:1001]
100%|██████████| 4983/4983 [00:02<00:00, 1697.12it/s]
max_attachments in encode_transformation: 4
100%|██████████| 39854/39854 [00:23<00:00, 1692.99it/s]
Traceback (most recent call last):
  File "prepare_mol_graph.py", line 748, in <module>
    dataset_test.encode_transformation(dataset_train.motif_vocab)
  File "prepare_mol_graph.py", line 667, in encode_transformation
    for idx, pfn in enumerate(tqdm(self.processed_file_names)):
  File "prepare_mol_graph.py", line 409, in processed_file_names
    return self.process_data_files
AttributeError: 'MoleculeDataset' object has no attribute 'process_data_files'
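
This AttributeError says that the `processed_file_names` property reads `self.process_data_files`, which was never assigned on this `MoleculeDataset` instance. The sketch below shows that failure mode and one defensive pattern. It uses a hypothetical stand-in class, not the repository's code, and the attribute and method names in it are assumptions made for illustration.

```python
class MoleculeDatasetSketch:
    """Hypothetical stand-in; not the repository's MoleculeDataset."""

    def __init__(self):
        # Define the attribute up front so the property can never hit
        # a bare AttributeError, whichever code path runs first.
        self._processed_files = None

    def process(self, names):
        # Stand-in for the step that writes the processed files.
        self._processed_files = list(names)

    @property
    def processed_file_names(self):
        if self._processed_files is None:
            raise RuntimeError(
                "processed files are not available; run process() first"
            )
        return self._processed_files
```

Initializing the attribute in `__init__` turns a confusing AttributeError into an explicit message about which step was skipped.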
(rdkit202031) tgg@macbook-pro src %

thegodone commented 2 months ago

Found a solution.