When my training set contains a molecule with a single-character SMILES (such as "C" for methane), the encoder fails with IndexError: list index out of range at the last line of the following:
def encode(self, mol_batch):
    set_batch_nodeID(mol_batch, self.vocab)
    root_batch = [mol_tree.nodes[0] for mol_tree in mol_batch]
This happens because mol_tree.nodes is empty. My understanding was that methane should be a single-node graph. "C" is present in my vocabulary, and the same error occurs for other single-character SMILES.
I removed these molecules from my training set since they aren't helpful for my purposes anyway, but I wasn't sure whether this is a bug.
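For anyone hitting the same error, here is a minimal sketch of the workaround I mean: separating out trees with no nodes before they reach encode. Only the `nodes` attribute comes from the code above; `StubTree` and `split_by_nodes` are hypothetical names I made up for illustration, and I am assuming `nodes` behaves like a plain list.

```python
from dataclasses import dataclass, field


@dataclass
class StubTree:
    # Hypothetical stand-in for a molecule tree; only `nodes` mirrors the real object.
    smiles: str
    nodes: list = field(default_factory=list)


def split_by_nodes(mol_batch):
    """Return (encodable, skipped): trees with at least one node, and trees with none."""
    encodable, skipped = [], []
    for mol_tree in mol_batch:
        # An empty `nodes` list is what makes mol_tree.nodes[0] raise IndexError.
        if mol_tree.nodes:
            encodable.append(mol_tree)
        else:
            skipped.append(mol_tree)
    return encodable, skipped


# Toy batch: the node labels are placeholders, not real vocabulary entries.
batch = [StubTree("CCO", nodes=["CC", "CO"]), StubTree("C")]
encodable, skipped = split_by_nodes(batch)
print("skipped:", [t.smiles for t in skipped])
```

Logging the skipped SMILES rather than dropping them silently makes it easier to tell whether only single-atom molecules are affected.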