fesch / Structorizer.Desktop

Structorizer is a little tool which you can use to create Nassi-Shneiderman Diagrams (NSD).
https://structorizer.fisch.lu
GNU General Public License v3.0

Permanent syntax tree representation in background #800

Open codemanyak opened 4 years ago

codemanyak commented 4 years ago

A consistent syntax tree representation of all element texts (or at least of the expressions they contain) ought to be provided. This would let us concentrate on a clean and consistent syntax analysis and should also improve performance. Parts of it are ready but need more elaboration, and it must be tested with regard to memory and time complexity. It would avoid the repeated tokenizations and concatenations, as well as the frantic search for a point where certain matches and replacements can take place without spoiling everything that has been transformed before or will have to be transformed afterwards. If this eventually yields a central point in all generators where built-in functions are handled, that will be a big achievement and will allow the requested clear documentation. (At least for a while...) Some of the benefits would be:

There are of course several challenges, too:

Originally posted by @codemanyak in https://github.com/fesch/Structorizer.Desktop/issues/462#issuecomment-343501610

This also relates to several internal issues.

codemanyak commented 7 months ago

Remark: Possibly it was not the best idea to hold parsed lines permanently on the elements, in particular since not every element text can be parsed, and even small modifications in other elements or diagrams can invalidate any syntax tree derived from a former diagram state. A very reasonable compromise seems to be to store the element text as lexically split token lists in which the whitespace isn't mixed in among the tokens but managed separately. This makes a lot of to-and-fro conversions superfluous, preserves the original spacing without affecting token indices, and accelerates parsing a lot. On this occasion, the user-configurable "key phrases" (parser preferences), which may consist of several lexical tokens (like jusqu'à), can be represented by fixed internal tokens. This makes refactoring obsolete, since the internal key will always be the same; only for display and editing do the user-specified keywords have to be shown. A parser preference modification will then only require a drawing refresh (as with controller routine aliases). The task of Executor, Analyser and the code generators will be facilitated a lot: they can make use of (ephemeral) syntax trees where convenient and may concentrate on the expressions embedded in the element text lines.
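
To make the compromise more concrete, here is a minimal sketch of such a token list. It is only an illustration under assumptions, not the actual Structorizer code: the class name `TokenLine`, the methods `internalize` and `render`, and the internal key `<until>` are all hypothetical, and the lexer is deliberately naive (identifiers/numbers versus single characters, so `>=` ends up as two tokens).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: one text line as tokens with separately stored whitespace. */
final class TokenLine {
    /** The lexical tokens; this list never contains whitespace entries. */
    private final List<String> tokens = new ArrayList<>();
    /** gaps.get(i) is the whitespace before token i; the last entry trails the line. */
    private final List<String> gaps = new ArrayList<>();

    TokenLine(String text) {
        int pos = 0;
        final int n = text.length();
        while (true) {
            int start = pos;
            while (pos < n && Character.isWhitespace(text.charAt(pos))) pos++;
            gaps.add(text.substring(start, pos));
            if (pos >= n) break;
            start = pos;
            if (isWordChar(text.charAt(pos))) {
                // identifier or number: a maximal run of word characters
                while (pos < n && isWordChar(text.charAt(pos))) pos++;
            } else {
                pos++; // naive simplification: any other character is its own token
            }
            tokens.add(text.substring(start, pos));
        }
    }

    private static boolean isWordChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    /** Replaces every occurrence of a (possibly multi-token) key phrase by one fixed key. */
    void internalize(String keyPhrase, String internalKey) {
        TokenLine phrase = new TokenLine(keyPhrase);
        int len = phrase.tokens.size();
        if (len == 0) return;
        for (int i = 0; i + len <= tokens.size(); i++) {
            if (!tokens.subList(i, i + len).equals(phrase.tokens)) continue;
            // the whitespace inside the phrase has to match as well
            if (!gaps.subList(i + 1, i + len).equals(phrase.gaps.subList(1, len))) continue;
            for (int k = 1; k < len; k++) {
                tokens.remove(i + 1);
                gaps.remove(i + 1);
            }
            tokens.set(i, internalKey);
        }
    }

    /** Rebuilds the text, showing each internal key as the currently configured keyword. */
    String render(Map<String, String> keywords) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokens.size(); i++) {
            sb.append(gaps.get(i));
            String token = tokens.get(i);
            sb.append(keywords.getOrDefault(token, token));
        }
        sb.append(gaps.get(tokens.size()));
        return sb.toString();
    }

    /** Read-only view of the tokens; the indices are independent of the spacing. */
    List<String> tokens() {
        return Collections.unmodifiableList(tokens);
    }
}
```

With this sketch, a line such as `jusqu'à  x >= 10` would be tokenized once, `internalize("jusqu'\u00e0", "<until>")` would collapse the three tokens of the key phrase into the single internal token, and `render(...)` with a different keyword map would show the new keyword with the original spacing intact. In other words, changing the parser preference would touch only the map passed to `render`, not the stored tokens.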