Open mikapfl opened 2 years ago
I'm not quite sure where to go from here. I see the following solutions:

1. We forbid `/` in comments in GAMS. We should probably add some checks to `codeCheck` or so to enforce this. This can be a template for similar future problems: if stuff like this crops up because GAMS features are used that our regex-based approach is not equipped to handle, we forbid them and enforce that via `codeCheck` (since `codeCheck` is also based on regex, this might sometimes be a bit difficult, but in general it is easier to detect stuff that you don't want to allow than to parse it properly).
2. We replace the regexes with line-by-line parsing in `manipulateConfig`. R is a Turing-complete language, so it is not difficult to track whether you are in one or multiple contexts when parsing a GAMS file line by line in R. This can also be a template for similar future problems: if something gets too difficult in regex, we instead convert to line-by-line parsing in R, which is much more expressive.

Sorry to make a big thing out of this again, but given the amount of work we have had lately (not only connected to rewriting the start scripts) that revolves around parsing GAMS code with regex, I can't help but think we are busy fixing an ultimately unnecessary approach given the other, saner options. Sure, GAMSTRANSFER doesn't support `$setglobal`, but if we can reduce the problem that needs to be solved with regexes to the relatively simple `$setglobal` instead of parsing set and parameter definitions, I would already be quite happy.
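Both options above can be sketched in a few lines. The following is a minimal illustration, not lucode2 code: the GAMS snippet and the helpers `find_slashes_in_comments` and `set_parameter` are hypothetical, and Python is used only for brevity (the same loop ports directly to R with `readLines()`). It assumes comments are lines starting with `*` in column 1 and that the description text itself contains no `/` or `;`.

```python
import re

# Hypothetical GAMS snippet, NOT taken from the real model files.
SAMPLE = '''parameter
  cm_FlexTaxFeedback "switch for flex tax feedback"
*** 0: off / 1: on
*** more documentation
  / 0 /
;
'''

def find_slashes_in_comments(text):
    """Option 1 (codeCheck-style): 1-based numbers of comment lines containing `/`."""
    return [i for i, line in enumerate(text.splitlines(), start=1)
            if line.startswith("*") and "/" in line]

def set_parameter(text, name, value):
    """Option 2: walk the file line by line, tracking definition and comment context."""
    out = []
    in_definition = False  # True between `parameter` and the closing `;`
    seen_name = False      # True once `name` was seen and its data block is pending
    for line in text.splitlines():
        if not line.startswith("*"):  # comment lines are copied through untouched
            if re.match(r"\s*parameters?\b", line, flags=re.IGNORECASE):
                in_definition, seen_name = True, False
            if in_definition:
                if re.search(rf"\b{re.escape(name)}\b", line):
                    seen_name = True
                if seen_name:
                    # Replace the first `/ ... /` data block after the name.
                    new_line, n = re.subn(r"/[^/]*/", f"/ {value} /", line, count=1)
                    if n:
                        line, seen_name = new_line, False
                if ";" in line:
                    in_definition = False
        out.append(line)
    return "\n".join(out) + "\n"

flagged = find_slashes_in_comments(SAMPLE)
fixed = set_parameter(SAMPLE, "cm_FlexTaxFeedback", 1)
print(flagged)
print(fixed)
```

The two boolean flags are the "contexts" mentioned above. Real GAMS files would need more states (quoted strings, `$ontext`/`$offtext` blocks, several parameters per definition), which is easier to add to a loop like this than to a single regex.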
There is a problem with `/` in quoted comments, see here: https://github.com/pik-piam/lucode2/pull/89
If given something like this:

and tasked with changing `cm_FlexTaxFeedback` to 1, `manipulateConfig` will detect the `/` in the comment inside the parameter definition and will change everything between the slashes, so the file ends up as:

i.e. swallowing a good part of the comments.
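To make the failure mode concrete, here is a minimal Python sketch. It does not use the actual `manipulateConfig` regex or the example file from this issue: the GAMS snippet, the pattern, and the helper `naive_set` are made up for illustration, and Python stands in for R only to keep the example short.

```python
import re

# Hypothetical GAMS snippet, NOT the example from this issue.
SAMPLE = '''parameter
  cm_FlexTaxFeedback "switch for flex tax feedback"
*** 0: off / 1: on
*** more documentation
  / 0 /
;
'''

def naive_set(text, name, value):
    """Deliberately naive: treat everything between the first `/` after the
    name and the last `/` before the next `;` as the data block."""
    pattern = rf"({re.escape(name)}[^/;]*/)[^;]*(/)"
    return re.sub(pattern, lambda m: f"{m.group(1)} {value} {m.group(2)}", text)

broken = naive_set(SAMPLE, "cm_FlexTaxFeedback", 1)
print(broken)
```

Because the pattern has no notion of "this `/` is inside a comment", the match starts at the slash in the first comment line. The rest of that comment, the second comment line, and the original data block are all replaced by the new value.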
Within the current framework of `manipulateConfig` this is very hard to solve. `manipulateConfig` uses regular expressions, which work best when a match does not depend on the surrounding context, but to properly detect what is happening we have to detect two contexts: first, the definition context, which in the example goes from `parameter` to the first `;` but can also span multiple definitions, and second, the comment context within the definition context, which spans from `***` to the next end of line. While I think it should be theoretically possible to solve this with Perl-compatible regular expressions (they are far more expressive than plain regular expressions, after all), in practice the regular expressions are already completely unreadable, and adding another context detection to them is practically impossible. For the record, this is the regex that currently detects parameter definitions and does not handle comments properly: