coleygroup / rdcanon

SMARTS sanitization
MIT License

Missing primitives matches in `parse_smarts_total` / Lark parser #1

Closed tduigou closed 1 month ago

tduigou commented 4 months ago

Hello,

As far as I can tell, some element symbols, such as "At" and "Ce", are missing from the PRIMITIVE or PRIMITIVE_SINGLE lists defined for the Lark parser (in rdcanon/token_parser.py). One solution would be to add every element symbol to the list of Lark primitives.

Here are instructions to reproduce the bug:

import rdcanon
rdcanon.canon_smarts("[At]")

Here is the (long) traceback:

---------------------------------------------------------------------------
UnexpectedCharacters                      Traceback (most recent call last)
File ~/miniconda3/envs/sig/lib/python3.12/site-packages/lark/lexer.py:673, in ContextualLexer.lex(self, lexer_state, parser_state)
    672 last_token = lexer_state.last_token  # Save last_token. Calling root_lexer.next_token will change this to the wrong token
--> 673 token = self.root_lexer.next_token(lexer_state, parser_state)
    674 raise UnexpectedToken(token, e.allowed, state=parser_state, token_history=[last_token], terminals_by_name=self.root_lexer.terminals_by_name)

File ~/miniconda3/envs/sig/lib/python3.12/site-packages/lark/lexer.py:598, in BasicLexer.next_token(self, lex_state, parser_state)
    597         allowed = {"<END-OF-FILE>"}
--> 598     raise UnexpectedCharacters(lex_state.text, line_ctr.char_pos, line_ctr.line, line_ctr.column,
    599                                allowed=allowed, token_history=lex_state.last_token and [lex_state.last_token],
    600                                state=parser_state, terminals_by_name=self.terminals_by_name)
    602 value, type_ = res

UnexpectedCharacters: No terminal matches 't' in the current parser context, at line 1 col 6

[$([At])]
     ^
Expected one of: 
    * BOND_PRIMITIVE
    * "$("
    * INTEGER
    * NOT
    * RPAR
    * LPAR
    * OPERATOR_PRIMITIVE
    * LSQB
    * PRIMITIVE
    * RSQB

Previous tokens: Token('PRIMITIVE', 'A')

During handling of the above exception, another exception occurred:

UnexpectedCharacters                      Traceback (most recent call last)
Cell In[3], line 1
----> 1 rdcanon.canon_smarts("[At]")

File ~/projects/2024__RetroSynthesis/signature/lib/rdcanon/rdcanon/main.py:1324, in canon_smarts(smarts, mapping, embedding, return_score, v, repl_dict)
   1308 """
   1309 Canonicalizes a SMARTS pattern.
   1310
   (...)
   1321     the top score, and the unmapped canonical SMARTS pattern is returned.
   1322 """
   1323 g = Graph(v)
-> 1324 g.graph_from_smarts(smarts, embedding)
   1325 out = g.recreate_molecule(mapping)
   1327 for k in repl_dict:

File ~/projects/2024__RetroSynthesis/signature/lib/rdcanon/rdcanon/main.py:105, in Graph.graph_from_smarts(self, smarts, embedding)
    101     print()
    102     print("token embeddings")
--> 105 atoms_seq, bonds_seq = parse_smarts_total(smarts)
    106 # print(atoms_seq)
    107 # print(bonds_seq)
    108 # print(len(atoms_seq), len(mol.GetAtoms()))
    109 # print()
    111 for atom in mol.GetAtoms():
    112
    113     # if atom.GetChiralTag() != Chem.rdchem.ChiralType.CHI_UNSPECIFIED:
   (...)
    117
    118     # is it supposed to make sense?

File ~/projects/2024__RetroSynthesis/signature/lib/rdcanon/rdcanon/token_parser.py:1407, in parse_smarts_total(in_smarts)
   1405 def parse_smarts_total(in_smarts):
   1406     # not used currently
-> 1407     parsed = parser.parse("[$(" + in_smarts + ")]")
   1408     atoms_seq, bonds_seq = transformer2.transform(parsed)
   1409     return atoms_seq, bonds_seq

File ~/miniconda3/envs/sig/lib/python3.12/site-packages/lark/lark.py:658, in Lark.parse(self, text, start, on_error)
    640 def parse(self, text: str, start: Optional[str]=None, on_error: 'Optional[Callable[[UnexpectedInput], bool]]'=None) -> 'ParseTree':
    641     """Parse the given text, according to the options provided.
    642
    643     Parameters:
   (...)
    656
    657     """
--> 658     return self.parser.parse(text, start=start, on_error=on_error)

File ~/miniconda3/envs/sig/lib/python3.12/site-packages/lark/parser_frontends.py:104, in ParsingFrontend.parse(self, text, start, on_error)
    102 kw = {} if on_error is None else {'on_error': on_error}
    103 stream = self._make_lexer_thread(text)
--> 104 return self.parser.parse(stream, chosen_start, **kw)

File ~/miniconda3/envs/sig/lib/python3.12/site-packages/lark/parsers/lalr_parser.py:42, in LALR_Parser.parse(self, lexer, start, on_error)
     40 def parse(self, lexer, start, on_error=None):
     41     try:
---> 42         return self.parser.parse(lexer, start)
     43     except UnexpectedInput as e:
     44         if on_error is None:

File ~/miniconda3/envs/sig/lib/python3.12/site-packages/lark/parsers/lalr_parser.py:88, in _Parser.parse(self, lexer, start, value_stack, state_stack, start_interactive)
     86 if start_interactive:
     87     return InteractiveParser(self, parser_state, parser_state.lexer)
---> 88 return self.parse_from_state(parser_state)

File ~/miniconda3/envs/sig/lib/python3.12/site-packages/lark/parsers/lalr_parser.py:111, in _Parser.parse_from_state(self, state, last_token)
    109     except NameError:
    110         pass
--> 111     raise e
    112 except Exception as e:
    113     if self.debug:

File ~/miniconda3/envs/sig/lib/python3.12/site-packages/lark/parsers/lalr_parser.py:100, in _Parser.parse_from_state(self, state, last_token)
     98 try:
     99     token = last_token
--> 100     for token in state.lexer.lex(state):
    101         assert token is not None
    102         state.feed_token(token)

File ~/miniconda3/envs/sig/lib/python3.12/site-packages/lark/lexer.py:676, in ContextualLexer.lex(self, lexer_state, parser_state)
    674     raise UnexpectedToken(token, e.allowed, state=parser_state, token_history=[last_token], terminals_by_name=self.root_lexer.terminals_by_name)
    675 except UnexpectedCharacters:
--> 676     raise e

File ~/miniconda3/envs/sig/lib/python3.12/site-packages/lark/lexer.py:665, in ContextualLexer.lex(self, lexer_state, parser_state)
    663     while True:
    664         lexer = self.lexers[parser_state.position]
--> 665         yield lexer.next_token(lexer_state, parser_state)
    666 except EOFError:
    667     pass

File ~/miniconda3/envs/sig/lib/python3.12/site-packages/lark/lexer.py:598, in BasicLexer.next_token(self, lex_state, parser_state)
    596     if not allowed:
    597         allowed = {"<END-OF-FILE>"}
--> 598     raise UnexpectedCharacters(lex_state.text, line_ctr.char_pos, line_ctr.line, line_ctr.column,
    599                                allowed=allowed, token_history=lex_state.last_token and [lex_state.last_token],
    600                                state=parser_state, terminals_by_name=self.terminals_by_name)
    602 value, type_ = res
    604 ignored = type_ in self.ignore_types

UnexpectedCharacters: No terminal matches 't' in the current parser context, at line 1 col 6

[$([At])]
     ^
Expected one of: 
    * BOND_PRIMITIVE
    * "$("
    * INTEGER
    * NOT
    * RPAR
    * LPAR
    * OPERATOR_PRIMITIVE
    * LSQB
    * PRIMITIVE
    * RSQB

Previous tokens: Token('PRIMITIVE', 'A')
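
The final message is consistent with the lexer consuming `A` as a primitive on its own and then finding no terminal for the following `t`. Here is a minimal standard-library sketch of that failure mode. It does not use Lark or rdcanon; the symbol list `KNOWN_PRIMITIVES` and the function `tokenize` are hypothetical and only stand in for the real grammar:

```python
import re

# Hypothetical, deliberately incomplete symbol list; NOT rdcanon's actual grammar.
KNOWN_PRIMITIVES = ["Cl", "Br", "A", "C", "N", "O"]


def tokenize(text, symbols):
    """Split text into known symbols, raising on the first unmatched character."""
    # Try longer symbols first so that "Cl" is preferred over "C".
    ordered = sorted(map(re.escape, symbols), key=len, reverse=True)
    pattern = re.compile("|".join(ordered))
    tokens, pos = [], 0
    while pos < len(text):
        match = pattern.match(text, pos)
        if match is None:
            raise ValueError(f"No terminal matches {text[pos]!r} at position {pos}")
        tokens.append(match.group())
        pos = match.end()
    return tokens


try:
    tokenize("At", KNOWN_PRIMITIVES)
except ValueError as err:
    print(err)  # "A" is consumed first, then "t" cannot be matched

# Once "At" is in the list, the same input is read as a single token.
print(tokenize("At", KNOWN_PRIMITIVES + ["At"]))
```

In this sketch the fix is simply to extend the symbol list, which mirrors the suggestion of adding the missing elements to the primitives.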

Thanks for your work!

b-mahjour commented 1 month ago

I just updated the grammar to include most of the remaining atom types. Hopefully it now covers the entire periodic table, but feel free to reopen this issue if any primitives are still missing.

It's a bit unwieldy, but you can always add more tokens to the grammar in https://github.com/coleygroup/rdcanon/blob/main/rdcanon/token_parser.py if you don't want to wait for me to get around to it.
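
For anyone adding tokens by hand, a small helper can list which element symbols are still absent and format them as quoted alternatives. This is only a sketch under assumptions: the terminal name `EXTRA_PRIMITIVE` and both helper functions are hypothetical, and the output has to be adapted to however token_parser.py actually lays out its grammar.

```python
# All 118 element symbols in periodic-table order.
ELEMENT_SYMBOLS = """
H He
Li Be B C N O F Ne
Na Mg Al Si P S Cl Ar
K Ca Sc Ti V Cr Mn Fe Co Ni Cu Zn Ga Ge As Se Br Kr
Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe
Cs Ba La Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu
Hf Ta W Re Os Ir Pt Au Hg Tl Pb Bi Po At Rn
Fr Ra Ac Th Pa U Np Pu Am Cm Bk Cf Es Fm Md No Lr
Rf Db Sg Bh Hs Mt Ds Rg Cn Nh Fl Mc Lv Ts Og
""".split()


def missing_symbols(defined):
    """Return element symbols absent from `defined`, in periodic-table order."""
    known = set(defined)
    return [symbol for symbol in ELEMENT_SYMBOLS if symbol not in known]


def terminal_rule(name, symbols):
    """Format symbols as one terminal line of quoted alternatives, longest first."""
    # Longest-first ordering is a precaution so two-letter symbols are not
    # shadowed by one-letter ones.
    ordered = sorted(symbols, key=len, reverse=True)
    return f"{name}: " + " | ".join(f'"{symbol}"' for symbol in ordered)


# Hypothetical starting point: a grammar that knows everything except At and Ce.
already_defined = [s for s in ELEMENT_SYMBOLS if s not in ("At", "Ce")]
print(terminal_rule("EXTRA_PRIMITIVE", missing_symbols(already_defined)))
```

Generating the rule from a single list keeps the grammar and the element table from drifting apart, at the cost of one more step before editing the grammar string.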