thomas0809 / MolScribe

Robust Molecular Structure Recognition with Image-to-Graph Generation
MIT License
153 stars 28 forks

I found a bug in the post-processing #16

Open Waterdrop-One opened 9 months ago

Waterdrop-One commented 9 months ago

Thank you for the MolScribe model. It is very powerful and highly accurate. However, during testing I found a bug in the post-processing.

[Screenshot: screenshot-20240118-102735]

In this example, (CH2)5 takes the place of the R group, and two bonds are detected around it [chemistry.py line 434]. As a result, the call on line 435, get_smiles_from_symbol(symbol, mol_w, atom, bonds), returns '(=C([H]))C([H])C([H])C([H])C([H])'. The two single bonds were merged into one double bond, which causes the mol conversion to fail. I am trying to fix this bug but do not have a lead yet.

thomas0809 commented 9 months ago

Nice catch! We have tried to implement a postprocessing algorithm to cover common patterns of abbreviations, but I do think it is challenging to cover all cases. If you manage to design a more principled and robust method, I believe it would be a significant contribution.
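
For illustration only, here is a minimal sketch of one possible direction for the case reported above: when a repeated-unit abbreviation such as (CH2)5 has two detected bonds, attach each bond to a different end of the expanded chain rather than combining them on the first atom. The names `expand_ch2_chain` and `bridge_smiles` are hypothetical and are not part of MolScribe. The sketch assumes the abbreviation has exactly the form (CH2)n and that the two neighbors are available as SMILES fragments, which is not how chemistry.py represents them.

```python
import re

# Matches only abbreviations of the exact form "(CH2)n", e.g. "(CH2)5".
_REPEAT_CH2 = re.compile(r"\(CH2\)(\d+)")


def expand_ch2_chain(symbol):
    """Expand "(CH2)n" into a chain of n carbons with implicit hydrogens."""
    match = _REPEAT_CH2.fullmatch(symbol)
    if match is None:
        raise ValueError(f"unsupported abbreviation: {symbol!r}")
    count = int(match.group(1))
    if count < 1:
        raise ValueError(f"repeat count must be positive: {symbol!r}")
    return "C" * count


def bridge_smiles(symbol, left, right):
    """Join two neighbor fragments through the expanded chain.

    Assumption: `left` ends with the atom bonded to the first carbon and
    `right` starts with the atom bonded to the last carbon, so each of the
    two single bonds lands on a different end of the chain.
    """
    return left + expand_ch2_chain(symbol) + right


print(bridge_smiles("(CH2)5", "O", "N"))  # OCCCCCN
```

A real fix would have to operate on the atom and bond objects that get_smiles_from_symbol receives, and would need to cover other repeated units and cases with more than two bonds.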