TUCAN-nest / TUCAN

A molecular identifier and descriptor for all domains of chemistry.
https://tucan-nest.github.io
GNU General Public License v3.0

Structure generation for C60 from TUCAN string by web application leads to error #92

Closed: schatzsc closed this issue 1 year ago

schatzsc commented 2 years ago

Trying to generate a structure from a TUCAN string with the web application at https://tucan-nest.github.io leads to an error for C60:

Log message

21:16:48: An error occured during the conversion to Molfile. This might be due to an incorrect TUCAN string. Error: PythonError: Traceback (most recent call last):
  File "/lib/python3.10/site-packages/antlr4/Lexer.py", line 137, in nextToken
    ttype = self._interp.match(self._input, self._mode)
  File "/lib/python3.10/site-packages/antlr4/atn/LexerATNSimulator.py", line 104, in match
    return self.execATN(input, dfa.s0)
  File "/lib/python3.10/site-packages/antlr4/atn/LexerATNSimulator.py", line 195, in execATN
    return self.failOrAccept(self.prevAccept, input, s.configs, t)
  File "/lib/python3.10/site-packages/antlr4/atn/LexerATNSimulator.py", line 254, in failOrAccept
    raise LexerNoViableAltException(self.recog, input, self.startIndex, reach)
antlr4.error.Errors.LexerNoViableAltException: LexerNoViableAltException(''')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "", line 30, in tucan_to_molfile
  File "/home/pyodide/tucan/parser/parser.py", line 24, in graph_from_tucan
    tree = parser.tucan()
  File "/home/pyodide/tucan/parser/tucanParser.py", line 1056, in tucan
    self.enterRule(localctx, 0, self.RULE_tucan)
  File "/lib/python3.10/site-packages/antlr4/Parser.py", line 374, in enterRule
    self._ctx.start = self._input.LT(1)
  File "/lib/python3.10/site-packages/antlr4/CommonTokenStream.py", line 62, in LT
    self.lazyInit()
  File "/lib/python3.10/site-packages/antlr4/BufferedTokenStream.py", line 187, in lazyInit
    self.setup()
  File "/lib/python3.10/site-packages/antlr4/BufferedTokenStream.py", line 190, in setup
    self.sync(0)
  File "/lib/python3.10/site-packages/antlr4/BufferedTokenStream.py", line 112, in sync
    fetched = self.fetch(n)
  File "/lib/python3.10/site-packages/antlr4/BufferedTokenStream.py", line 124, in fetch
    t = self.tokenSource.nextToken()
  File "/lib/python3.10/site-packages/antlr4/Lexer.py", line 139, in nextToken
    self.notifyListeners(e) # report error
  File "/lib/python3.10/site-packages/antlr4/Lexer.py", line 294, in notifyListeners
    listener.syntaxError(self, None, self._tokenStartLine, self._tokenStartColumn, msg, e)
  File "/lib/python3.10/site-packages/antlr4/error/ErrorListener.py", line 60, in syntaxError
    delegate.syntaxError(recognizer, offendingSymbol, line, column, msg, e)
  File "/home/pyodide/tucan/parser/parser.py", line 147, in syntaxError
    raise TucanParserException(error_str)
tucan.parser.parser.TucanParserException: line 1:0 token recognition error at: '''
'C60/(1-2)(1-3)(1-4)(2-5)(2-6)(3-7)(3-8)(4-9)(4-10)(5-7)(5-11)(6-12)(6-13)(7-14)(8-15)(8-16)(9-12)(9-17)(10-15)(10-18)(11-19)(11-20)(12-21)(13-19)(13-22)(14-23)(14-24)(15-25)(16-24)(16-26)(17-18)(17-27)(18-28)(19-29)(20-23)(20-30)(21-22)(21-31)(22-32)(23-33)(24-34)(25-26)(25-35)(26-36)(27-31)(27-37)(28-35)(28-38)(29-30)(29-39)(30-40)(31-41)(32-39)(32-42)(33-34)(33-43)(34-44)(35-45)(36-44)(36-46)(37-38)(37-47)(38-48)(39-49)(40-43)(40-50)(41-42)(41-47)(42-51)(43-52)(44-53)(45-46)(45-48)(46-54)(47-55)(48-56)(49-50)(49-51)(50-57)(51-58)(52-53)(52-57)(53-54)(54-59)(55-56)(55-58)(56-59)(57-60)(58-60)(59-60)
^

flange-ipb commented 2 years ago

The parser complains because the TUCAN string starts with a single quote character (').

I could trim whitespace, ' and " before passing the string to the parser. On the other hand, this might suggest to the user that such a TUCAN string is correct. What do you think?
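One middle ground between silently trimming quotes and showing the raw parser traceback would be to detect them up front and say so explicitly. The following is only a sketch of that idea: `check_for_wrapping_quotes` and `QUOTE_CHARS` are hypothetical names, not part of the tucan package, and the sample input is a shortened fragment rather than a full TUCAN string.

```python
from typing import Optional

# Assumption: only the two quote characters mentioned above are of interest.
QUOTE_CHARS = "'\""


def check_for_wrapping_quotes(text: str) -> Optional[str]:
    """Return a hint if the input starts or ends with a quote, else None.

    Hypothetical helper for illustration; the input itself is not modified.
    """
    stripped = text.strip()  # ignore surrounding whitespace for this check
    if not stripped:
        return None
    problems = []
    if stripped[0] in QUOTE_CHARS:
        problems.append(f"starts with {stripped[0]!r}")
    if stripped[-1] in QUOTE_CHARS:
        problems.append(f"ends with {stripped[-1]!r}")
    if not problems:
        return None
    return "Input " + " and ".join(problems) + "; remove the quote character(s)."


# A leading quote left over from copy-and-paste is reported, not removed.
print(check_for_wrapping_quotes("'C60/(1-2)(1-3)"))
# Input without wrapping quotes yields no hint.
print(check_for_wrapping_quotes("C60/(1-2)(1-3)"))
```

With a check like this the user is told what is wrong, and the quoted string is never treated as valid.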

schatzsc commented 2 years ago

Sorry, I did not see the leading ', which was left over from some copy-and-paste. I just don't remember from where anymore.

Trimming leading and trailing blanks would be OK with me, since they are sometimes hard to see. "Internal" blanks should not be trimmed: the TUCAN string has to be continuous, so an internal blank should be considered invalid.

All other non-TUCAN characters or incorrect leading/terminal characters should generate an error.
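The policy described in the two paragraphs above could be sketched roughly as follows. This is an illustration under assumptions, not the project's implementation: `normalize_tucan_input` and `TucanInputError` are hypothetical names, and rejecting other invalid characters (such as a leading ') is left to the parser, which already does that.

```python
class TucanInputError(ValueError):
    """Hypothetical error type for this sketch (not part of the tucan package)."""


def normalize_tucan_input(text: str) -> str:
    """Trim surrounding whitespace and reject whitespace inside the string."""
    # Leading and trailing whitespace, including line terminators, is dropped.
    trimmed = text.strip()
    # Internal whitespace is never repaired; it is reported as an error.
    for index, char in enumerate(trimmed):
        if char.isspace():
            raise TucanInputError(
                f"whitespace inside TUCAN string at position {index}"
            )
    # Everything else is passed on unchanged, so quotes and other stray
    # characters still reach the parser and fail there.
    return trimmed


print(repr(normalize_tucan_input("  C60/(1-2)(1-3)\n")))
try:
    normalize_tucan_input("C60/(1-2) (1-3)")
except TucanInputError as error:
    print(f"rejected: {error}")
```

Keeping the function this narrow means only invisible, easy-to-miss characters are forgiven, which matches the concern that trimming more would make incorrect strings look valid.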

flange-ipb commented 1 year ago

Whitespace characters and line terminators are now trimmed.
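The thread does not show how the trimming was implemented. Assuming it behaves like Python's `str.strip()` with no arguments, the sketch below shows which copy-and-paste leftovers that would and would not cover; `trim` is a hypothetical stand-in and the inputs are shortened fragments.

```python
def trim(text: str) -> str:
    # Assumption: the fix behaves like str.strip() called without arguments.
    return text.strip()


# Spaces, tabs and line terminators at either end are removed.
print(repr(trim(" \tC60/(1-2)\r\n")))
# A no-break space (U+00A0) counts as whitespace in Python and is removed too.
print(repr(trim("\u00a0C60/(1-2)")))
# A zero-width space (U+200B) is not whitespace, so it stays and would
# still reach the parser.
print(repr(trim("\u200bC60/(1-2)")))
# Quote characters are not whitespace either and are left in place.
print(repr(trim("'C60/(1-2)")))
```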