Closed Earlopain closed 5 months ago
Right. So the location of any child node is contained is the location of its parent node... except for HEREDOCs. That's the only exception and it makes thinking about code locations a bit strange.
You might be thinking that the following lines should also be part of the source / locations, but you must note that HEREDOCs are very odd, they are not necessarily contiguous, even in terms of lines, e.g.:
foo(<<~'FOO', bar(<<~'TEXT'))
This is FOO
FOO
Hello World!
TEXT
As far as bar(...)
is concerned, the content of the HEREDOC starts 2 lines later!
The implementation of multiline?
simply looks that the location of parent node, and it does start and end on the same line. It's just that its argument spills over to the next lines.
I wouldn't be opposed to adding an argument to multiline?
so it checks the children, maybe like multiline?(including_children: true)
?
As for the rest, what other results did you have in mind?
Yeah, heredocs are always a nice bag of surprises. I don't even want to think about the case you wrote and how to handle that (:
For my initial case, how would I fetch the full block content? I tried a bunch of things but only managed to get to it through its parent, which then also includes other content I don't want.
I would expect to get the 3 lines back without much fuss since all the wonderful things you can do with heredocs don't apply to that case (I think so at least). Having to special-case this every time you want to get a blocks source doesn't sound fun. block.source
is my main use-case, I only added the other printouts to show a bit more.
A bit of context, in rubocop-rspec
the goal of the cop is simply to merge the contents of two blocks into one
One solution would be to not do the auto-correction if any of the descendents involved is a HEREDOC... 😅
If you want to handle it nevertheless, I think (I am not 100% sure though) that a node is always it's location
+ potentially a single range encompassing all its HEREDOCs (potentially disconnected as shown above). Even with nested HEREDOCs, I think they would always have to be contiguous.
If that's so, we could add a function Node#heredoc_location
that would return the location for all the HEREDOCs of a given Node's descendents.
This would still make things a big challenging for you, as you'd have to check if any of the heredocs have the same delimiter...
Yeah, not offering autocorrect may be the simplest solution.
Though I don't really understand, why would the heredoc not be considered part of the block itself? I'm not looking at individual nodes, just the over encompassing block, which to me logically also includes the heredocs.
I would have to check later when I got some time but I feel like there are a bunch of other cops for which this would also be a problem.
Let me repeat that in the general case a node's source code is not continuous if there are heredocs involved.
I'm assuming that by "block", you mean a block
Node like in your example above. There's nothing special about a block
node, so it's easy to construct an example where the whole source code for it is disconnected (i.e. the main "expression part" and the "heredoc part"):
foo {<<~'TEXT'} + 'not part of the block'
This is the TEXT
TEXT
Or even disconnected by lines:
<<~'FOO' + foo {<<~'TEXT'}
This is FOO
FOO
Hello World!
TEXT
What would you do in those cases?
So, as far as rubocop-ast
is concerned, we could consider an API that returns the "full location" range, which could return an array; either the "expression part" + "heredoc part" , or a "full expression part" in case these are contuguous
Now that I look more closely at the example you provided... I'm not sure why you don't use the block's location:
> pp source.ast.location
#<Parser::Source::Map::Collection:0x000000011de2c278
@begin=#<Parser::Source::Range (string) 7...9>,
@end=#<Parser::Source::Range (string) 51...54>,
@expression=#<Parser::Source::Range (string) 0...54>,
If you take the expressions' location, and remove the begin
and end
's location, you'll get the full range of what you want.
What an awful example, I see now what you are talking about. It took me a moment to even discern where the block is. I'm not sure if the api you are talking about would makes sense since you would need to check if this happens anyways. It probably (hopefully) isn't every common code but you'd just push the need for a check to some other place.
I myself would qualify this as cursed syntax. Always learning something new.
If you take the expressions' location, and remove the begin and end's location, you'll get the full range of what you want.
Oops. I'm not really sure, this just didn't cross my mind. It does sound pretty much like what I was trying to do.
Thank you for talking with me about this, I would not have gotten this otherwise. I see now that there is not much that rubocop-ast
can do here.
Sure thing.
It's really the concept of "a node corresponds to a continuous range of characters" that is tricky. Corrolary is that a blocks' location minus begin
and end
correspond more or less to its body's location. These are true "all the time",... except when they aren't.
Good luck!
I ran into https://github.com/rubocop/rubocop-rspec/issues/1913 and tried to work on a fix. But there is something happening that makes this rather hard:
Does this look like expected behaviour? To me the last 4 printouts seem wrong but it's possible I'm missing something.