Open jeff-hykin opened 4 years ago
Maybe I'm missing something, but what breaks if __deep_clone__
correctly clones includes?
If _deep_clone__ correctly clones includes I don't think that will break anything, although there might be some performance overhead its probably negligible. I just think that'll be hard to do because cloning includes: [ :thing ]
will have to wait to resolve to a placeholder. Then a deep clone of the placeholder will need to be made, it will need to "remember" actions such as reTagging, then at compile time, resolve itself, deep clone whatever it resolved to, and then apply all the pending actions such as reTagging to the deep clone of the resolve. And that will need to be done recursively since the just-resolved placeholder might also have includes, all of which also need to be retagged
Ok, so the issue is that when using placeholders, reTag doesn't have a chance to retag the placeholders as they are not resolved?
This problem should be independent of includes: as g[:a] = Pattern.new("a").then(g[:b])
when retagged will have the same issue.
Yeah having it remember is definitely the correct result. The simplest solution would be for reTag to detect a placeholder and wrapping with a DeferredRetagPattern, that acts like a placeholder, and when asked to resolve, it first resolves its wrapped pattern, then it reTags the wrapped pattern.
More generally it might be a good idea, with the token system using a placeholder like system, to better integrate placeholders into the patterns. A pattern could provide a method to describe if it is unresolved, and provide a method to resolve itself.
Example change for reTag:
def reTag(args)
__deep_clone__.map! do |s|
+ if s.is_a?(PlaceholderPattern) || s.is_a?(DeferredRetagPattern)
+ next new DeferredRetagPattern(pattern: s, args: args)
+ end
# tags are keep unless `all: false` or `keep: false`, and append is not a string
discard_tag = (args[:all] == false || args[:keep] == false)
Edit: actually DeferredRetagPattern should probably inherit from placeholder as it effectively needs to be treated as one.
when it gets resolved it just returns resolve(@args[:pattern]).reTag(args)
With the other complications, like unit tests on placeholders and challenges I'm having with the old token system, I'm starting to think everything should be deferred. Fundamentally everything has to wait till the end to get group #'s assigned anyways, right now were basically trying to support two methodologies (dynamic and compiled) simultaneously.
A pattern could provide a method to describe if it is unresolved
I think that would be a really good idea. Basically placeholders are anything that is default-unresolved. Any pattern is also unresolved if it contains an unresolved pattern in it's pattern chain or in its includes
.
Some methods like to_tag
could check self.resolved? (and return an error if false)
So basically a deferred-everything system would make everything const before (and maybe even during) the resolution phase. Just save the arguments of PatternBase.new(), have calling/running unit tests be deferred, deep clones are deferred (since the includes/pattern-chain could contain placeholders), chaining uses deep clone and is therefore deferred, reTag's also use deep clone, and are then deferred. Then the resolution phase can do everything in the correct order: explore pattern chains with DFS to find pure (placeholder-less) patterns, basically recursively making perfect deep clones of every pattern, keeping track of patterns that include themselves (at any level), then checking repeats/groups/references, then performing all unit tests and linting.
I think its going to be a lot of work whatever way its done. reTagging patterns that include themselves at multiple levels will be challenging logic
keeping track of patterns that include themselves
The only way this gets resolved is with the recursive nature that the registry allows.
If a pattern includes itself there are 2 choices:
#name
#extracted-#{pattern.hash}
So to formalize the proposal.
A Pattern is Deferred if a pattern is not capable of executing #to_tag
, or #to_r
.
BasePattern
has three methods added to it.
BasePattern#deferred?
this returns if this pattern is deferred. By default, this returns the boolean or of all Pattern arguments #deferred?
method resultsBasePattern#deferred_regex?
some patterns may be able to generate a regex but not the tag, this allows for the use of the regex in some patterns without resolving. By default, this returns the boolean or of all Pattern arguments #deferred_regex?
method results, excluding arguments[:includes]
BasePattern#resolve(resolver)
- resolves deferred patterns using the specified resolver (usually a grammar). By default, this calls #resolve(resolver)
on all Pattern arguments.Any method that manipulates a pattern should, when encountering a deferred pattern, wrap the pattern in a wrapper that executes the method when resolve is called. This may be able to be a generic wrapper.
Undecided:
This is looking like a log of work, so I'm thinking for a 1.0 release lets just pull out reTag
, pull out word_cannot_be_any_of:
, pull out any kind of token-collection system, fix edgecase bugs, improve the error messages, and finalize the names of everything. Compared to the normal tmLanguage, this library already does so much, and its been unreleased for over a year. Getting a weak version of it into other people's hands would be an order of magnitude improvement for them.
We can make sure everything in 1.0 will be future compatible with the proposal being formed here
The situation:
includes
(maybe good/maybe bad)map
calls deep clone and thenmap!
so it can accidentally mutate the includeszeroOrMore
patterns utilizes the `includesreTag
kinda needs to edit the includes because ofzeroOrMore
I think this is resulting in some subtle bugs where retagging doesn't seem to retag because of the includes, and if it did retag the includes, it would be mutating the originals.
This isn't going to be fun to fix, especially considering some includes are placeholders which will basically have to be deep cloned and reTag-ed at compile time, so I'm just going to post it here for dealing with later
maybe an easier first solution is for repeated patterns like
zeroOrMore
to use a new key instead ofincludes
, and that new key issues a warning if it gets a placeholder