pharo-project / pharo

Pharo is a dynamic reflective pure object-oriented language supporting live programming inspired by Smalltalk.
http://pharo.org
Other
1.22k stars 356 forks source link

scanTokens: is bogus #5656

Closed Ducasse closed 4 years ago

Ducasse commented 4 years ago

Why scanTokens does not invoke scanToken and return literal tokens.

We should update the comments of this method and introduce a new one.

scanTokens: aStringOrStream | scanner | scanner := self on: aStringOrStream readStream. ^scanner contents collect:[ :t | t value ]

MarcusDenker commented 4 years ago

I think we tried to avoid having scanTokens: as it exposes a very low level aspect of the Parser: the fact that it uses tokens. It was an API of the Parser and thus exposed implementation details that you should not (imagine we would want to switch to a scanner less parser).

Then later it was added back to be backward compatible, but only to the scanner, where it is perfectly ok (as it is clear that you need that scanner if you want to use it)

We can improve the method as needed.

But I think the goal scanTokens: is not literals, it is supposed to return the string cut into pieces of what the scanner scans as tokens.

Ducasse commented 4 years ago

But we need to have access to RBTokens and not their value. In addition we should change the Parser so that we can give it a stream of RBToken instead of the piece of code.

MarcusDenker commented 4 years ago

But this would hard-code the fact that the parser uses Tokens. Do we want that?

guillep commented 4 years ago

Do we want that?

Yes, because we need it to implement another parser, but reusing the scanner. So yes.

MarcusDenker commented 4 years ago

What if that parser would be scannerless?

I just want to avoid to hard code the fact that there is a scanner at all.

guillep commented 4 years ago

That's a theoretical problem we don't have... What if we have a stackless VM?

guillep commented 4 years ago

I just want to avoid to hard code the fact that there is a scanner at all.

I think there are worse things hardcoded. I think we can document the fact this scanner works with Tokens. That's not a bad thing. We are not going to change the scanner soon. And in any case, if we change the scanner we have to change the parser too, so it does not change much.

MarcusDenker commented 4 years ago

But Scanners are often very much implementation details for the Parser. E.g. if we would use SmaCC for creating a Parser, it would come with it's own Scanner and it's own tokens.

guillep commented 4 years ago

Sure, but we could say the same of many things. Like: what about the coupling between slots and the IR? Isn't the IR an internal detail about the compiler? Why do slots or FFI use it?

To me this is just a matter of adding useful hooks. Now, I wont waste any more time in a discussion that to me does not make sense.

Why can't we do a separate parser using the same infrastructure? Yes, it was not originally made to do that. But the constraint posed is rather artificial.