JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.06k stars 5.43k forks source link

Rename parse(::String) #24349

Closed KristofferC closed 6 years ago

KristofferC commented 6 years ago

I keep seeing new users (on Discourse) accidentally using e.g. parse("1") to go from a String to a literal. This is a few hundred times slower than parse(Int, "1") but in most cases, it works. I think that the function to parse source code should be named differently to avoid this. I also think it is ok if it has a slightly uglier name, e.g. parse_sourcecode just to be clear about what it does.


x-ref: see https://github.com/JuliaLang/julia/pull/24455 for the resolution.

yuyichao commented 6 years ago

Or Meta.parse. I really hope we don't keep making functions manipulating/inspecting source code/expressions longer and longer.......................

JeffBezanson commented 6 years ago

+1 for Meta.parse to go with Meta.lower.

Keno commented 6 years ago

parse(Expr, "1")?

yuyichao commented 6 years ago

parse(Expr, "1")?

A little wierd since it's not always an Expr....

StefanKarpinski commented 6 years ago

I really like the idea of parse(Expr, "1") but as it currently stands, it's pretty weird as @yuyichao says. Perhaps we could have a Code type that is just a union of all the types that parsing a Julia string can return and then call parse(Code, "123") for parsing Julia code. Code is a bad name, but you get the idea.

StefanKarpinski commented 6 years ago

@JeffBezanson prefers Meta.parse("123") which then moves it out of the immediate reach of people parsing data types which would be what the Base.parse function is for.

JeffBezanson commented 6 years ago

Or maybe Meta.parsecode ?

KristofferC commented 6 years ago

I like them having different names so you can e.g. import both.

StefanKarpinski commented 6 years ago

Let's just go with @JeffBezanson's suggestion.

ararslan commented 6 years ago

To clarify, Jeff's suggestion of Meta.parsecode

chakravala commented 6 years ago

Keep in mind that sometimes people parse processsed source code that becomes a number. For example, when arbitraty code is processed by a computer algebra system like in Reduce.jl, it might simplify an expression like e^(im*pi) into -1, since this is communicated in the pipe via string, it is necessary to parse the result, and then sometimes parse has to handle a number like parse("-1") and sometimes it has to handle an expression like parse("e^(im*pi)"), so it needs to remain flexible.

chakravala commented 6 years ago

To follow up on what I just said in previuos post:

@KristofferC if all you want is to make Julia users apply the functionality correctly, then perhaps it is not the syntax that needs updating but the documentation that tells people how to use it

@StefanKarpinski I like the idea of a Code-like type, but doesn't the Any already do this?

@JeffBezanson as I said above, perhaps the right move is to educate the users in the documentation instead of changing the syntax

personally, this idea of changing the parse syntax does not really resonate with me..

I am very happy with being able to do str |> this |> that |> parse and changing the syntax might make it more confusing to newcommers and also convolutes the syntax in modules

KristofferC commented 6 years ago

If you previously were using parse(T, str) then you don't change anything, if you used parse(str) you will instead do Meta.parsecode(str). This is just decoupling two things that are different operations, parsing julia source code into some julia type (most often Expr but also types corresponding to literals), and trying to convert a string (not source code) to a requested type.

chakravala commented 6 years ago

Yes, but it seems unnecessary to make give this function call a separate name, and now people won't even have it imported automatically. They are both the same functionality and it makes more sense to have them both named parse in my opinion, because it is the same functionality, just with different arguments. Now I will need to add import Meta.parsecode to my .juliarc.jl file for no reason.

Isn't it a standard feature that should be automatically imported in Julia?

It makes no sense to make people press so many extra keys in the terminal, if they just want to parse something into an expression. You now have to type 3x as many keys to get the same functionality out of the box

So yea, I still don't think this change should be made.

KristofferC commented 6 years ago

The point with this issue is that trying to parse strings into a certain type, and parsing julia source code into an AST are different operations so they probably shouldn't have the same name.

Your argument seems to only touch on:

  1. The length of the function name parsecode
  2. That parsecode will not be exported by default.

Regarding 1) I think the function name is not excesivvely long and 2) parsecode should rarely be used in user code and it seems totally fine to have to import it if you want to use it. In fact, making it a bit more hidden can arguably a good thing!

chakravala commented 6 years ago

This is a programming language, standard features should be accessible and not unnecessarily hidden away. the dump function isn't hidden away either..

I often use parse function in terminal for code experimentation to explore julia language, it definitely should not be hidden.

Isnt multiple dispatch what the julia language always talks about? why create a new function name for something that does essentially the same thing.

Simplicity should be the goal here. One single function name makes more sense than making more names, and if you see users who do not use it in a way you approve of, then try to educate them.

If anything, I would change the name of the parse(T, str) function to something different, and keep the parse(str) the same, since the parse(T, str) is more like a convert(T, str) than a parse.

In fact, why not just turn it into convert(T, str), since that's effectively what it does?

chakravala commented 6 years ago

Which one of these 2 makes more sense to you guys?

julia> parse(BigInt,"1")
1

julia> convert(BigInt,"1")
ERROR: MethodError: Cannot `convert` an object of type String to an object of type BigInt
This may have arisen from a call to the constructor BigInt(...),
since type constructors fall back to convert methods.
Stacktrace:
 [1] macro expansion at ./REPL.jl:97 [inlined]
 [2] (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at ./event.jl:73

Now that I consider it further, I think the parse(T,::String) function should really be renamed to convert, since that is missing at the moment, and it would belong to that name.

The parse(::String) function should remain as is, since it is actually for parsing, unlike the other function we are talking about, which is actually a conversion of string to an object.

JeffBezanson commented 6 years ago

Transforming a string to a more specific data type or structure is generally referred to as "parsing". parse is an appropriate term for converting a digit string to an integer.

In Julia, convert is reserved for moving between different representations of the same abstraction, like two different number types. A string is a sequence of characters, which is not the same kind of thing as a number, so we consider those too different for a convert method. A convert method would also make it easier to casually confuse numbers and their string representations.

However, personally I'd be fine with keeping parse for parsing code (but won't argue strongly for it); I do like the simplicity of it.

KristofferC commented 6 years ago

This is a programming language

Thanks, I wasn't aware.

I often use parse function in terminal for code experimentation to explore julia language, it definitely should not be hidden.

Me too! And no one will take that away from us.

why create a new function name for something that does essentially the same thing.

It doesn't, really.

The parse(::String) function should remain as is, since it is actually for parsing, unlike the other function we are talking about, which is actually a conversion of string to an object.

Thanks for voicing your opinion. It is good to know not everyone agrees with a change. I am sure you understand that, the change might still happen though.

chakravala commented 6 years ago

Sure, I am aware that my perspective doesn't carry much weight in this discussion, which is why it is necessary for me to express it strongly.

If the syntax is changed, at the very least, do not hide it under something like Meta. It's not like the parse(T, str) function should somehow have a higher privilege than the parse(str) funciton, because hiding it away would make it inaccessible in a out of the box Julia session.

It seems that @yuyichao and @JeffBezanson both agree that simplicity is important. Making the name much longer or hiding it away is unnecessary.

It's hard to argue that the 2 functions do very different things, when they both in fact parse a string into a Julia object. The only difference is that one of them is more general and the other allows for a specific target type. Singling out the general case and giving it a different syntax causes dissonance in the language and makes it less consistent and less natural / intuitive to use.

yuyichao commented 6 years ago

As much as I don't really want to keep having longer names for these functions your argument aren't entirely correct.

because hiding it away would make it inaccessible in a out of the box Julia session.

No, it's not hiding away. Meta is accessible.

The only difference is that one of them is more general and the other allows for a specific target type.

And this is wrong. parse is NOT more general. It does NOT do what parse(T, str) do with the correct type and not even a super type of it. Because of this, I do think parse is a slightly better candidate for moving into Meta or renaming than expand since AFAICT no one have ever misused expand for any other similarly named functions.

chakravala commented 6 years ago

Actually, expand is required to expand algebraic expressions, so it makes sense to leave that keyword available for packages to use, so you are incorrect on that aspect.

Then why not move the other parse(T,str) function into Meta too then, if that is such an appealing thing for developers of this language to do. Then you can have Meta.parse and Meta.parsecode, which at least is more consistent and less confusing than having one parse in meta and one parse in the main base scope.

yuyichao commented 6 years ago

Then why not move the other parse(T,str) function into Meta too then,

Because none of them are dealing with code.

Then you can have Meta.parse and Meta.parsecode, which at least is more consistent and less confusing than having one parse in meta and one parse in the main base scope.

And as I said the only reason that make this slightly better than renaming expand is that there is a real confusion (instead of an imaginary one as far as misuse from user) and that these two functions has no relation with each other and don't need any consistency.

chakravala commented 6 years ago

Alright then, I concede that this action is perhaps justifiable, although it still convolutes the syntax in a way that might not be necessary. I recommend thinking it over before making a decision on this, as there might be an even more harmonious syntax that makes the distinction clear while still being elegant.

chakravala commented 6 years ago

SUMMARY: parse should be a function that outputs AST, while convert should be a function that returns a desired type, this is really the distinction at the core of this issue.

Here is another argument for why parse makes sense as a single unified function. There are other packages like Maxima.jl and Reduce.jl and possibly others too that define their own parse functions to parse source code from other programming languages into Julia AST. Users of those packages will want to stick with the plain parse syntax for that functionality. If the parse syntax is made dissonant as proposed in this issue, it's going to make a lot of other functionality and syntax of the Julia ecosystem inconsistent, because the string parser will be defined in Meta and with a different name, while the Maxima and Reduce parsers will be defined just simply as parse.

Changing the parse syntax, it's just very dissonant and unelegant for the langauge on so many levels

Wish more people would see how unelegant making that change would be

Still, I think it would make more sense for convert(T, str) instead of parse(T, str), since the operation is more about conversion, and not about parsing a grammar or syntax..

So you see, if you change the name of the parse function, it directly affects how things are named for other packages that do different kinds of parsing, that all output AST in the end.

So, I think parse function name should be reserved for something that outputs AST, while convert should continue to be used for something that outputs a desired type. This way, no new syntax needs to be invented, and the distinction between parse(T,str) and parse(str) is maintained by renaming it to convert.

So this is what I think might be the most elegant and simple solution, in my humble opinion.

This way, parse is unified by its AST output, and convert is unified by its desired Type output

JeffBezanson commented 6 years ago

I understand that subjectively convert seems to describe this operation, but in julia that function has very particular semantics. If this convert method existed, then this would be allowed:

a = [1, 2, 3]
a[2] = "42"

We don't want to allow that because it's very slow and hides bugs where strings and numbers are confused.

As I said before, parse is a perfectly appropriate word for any operation that constructs an object from a character stream representation. It's not reserved for code ASTs. Furthermore, parsing of integers, dates, etc. is far more common in the world at large than parsing expressions to ASTs.

yuyichao commented 6 years ago

the string parser will be defined in Meta and with a different name, while the Maxima and Reduce parsers will be defined just simply as parse.

That seems to be a good argument for Meta.parse. I've even used JSON.parse as an example previously in a similar way to proof that the same verb parse has many different meanings in different context and there's no need to make them all the same functions. We have the Base parse having two meanings only because both of them are provided by Base and we didn't used to (and are still in the process of) have a stdlib that's not a single giant namespace. I hope the packages you mentioned doesn't export parse (if they do, it's a bug) so it's wrong to say they are just parse if you are talking about the fully qualified name of those functions.

It is crucial to realize the importance of namespacing. Packages are recommended to provide these functions that have domain specific meaning within its own namespace anyway to reduce conflict for the user and there's no need to reserve a name for only one of it's many meanings since there can be as many functions of the same name as you want.

Also, from how you describe the problem, there seems to be some misunderstanding of how things works. For a start, it's wrong to say parse syntax. It's not. It's a name and it's value can be assigned to anything or accessed any way. Putting it in a submodule does not affect it in any way at all. Because of this, what moving parse does not have any effect on the syntax.

chakravala commented 6 years ago

Alright, so parse(T, str) is the one that should stay the same and then you are proposing that the other kind of parse functions should be in their own namespace, like Meta.parse and Reduce.parse.

That would have a certain kind of consistency to it, so would that be something we can agree on?

My main issue is having some sort of consistency that works across more than just the Base julia language, the whole ecosystem needs to be able to have something consistent.

Also, the Reduce.jl parse function does not conflict with the Base parse function, so it is not really a bug, it simply extends it using multiple dispatch, since it does not accept a ::String but takes an ::RExpr as input argument instead.

Also, from how you describe the problem, there seems to be some misunderstanding of how things works. For a start, it's wrong to say parse syntax. It's not.

Well, I am formally studying pure mathematics and not computer science so I wasn't aware you have your own definition, but if you just go by the linguistic definition of syntax,

the way in which linguistic elements (such as words) are put together to form constituents (such as phrases or clauses)

That's the definition of syntax I am using here.

morningkyle commented 6 years ago

I really like the agreement of keep one parse, either in different namespaces as suggested or even in the same namespace.

Even if parse(“1”) is hundreds of times slower than a “smart” parse(Int, “1”) call, it might not be a bad choice in real cases. Because some people need Julia's convenience as other scripts, and some people need the speed as C. And they might even be the same person, just in different use cases. I think deciding a better choice for users by introducing a new function name does not help much here, because: (1) Two different names for a similar concept (parsing strings) breaks the spirit of multi-dispatch. (2) The other less common name (eg. parse_sourcecode/parsecode/) is a pain to remember and hard to discover to new users. And it also breaks the convenience/simplicity of multi-dispatch. (Another example is the max/maximum names, too subtle to make a difference without studying the background first.)

So one parsein different namespaces is a good trade-off/improvement to me, if the trade-off/improvement is needed.

KristofferC commented 6 years ago

I like Meta.parse more than parsecode as well. The drawback is that you can't really import it but Meta is pretty short.

yuyichao commented 6 years ago

const import_name_for_Meta_parse = Meta.parse.

StefanKarpinski commented 6 years ago

I think I lean slightly the other way towards parsecode since it says what it does: it parses code.

KristofferC commented 6 years ago

Drawing a parallel to JSON.parse, even though it parses JSON, parsejson is perhaps not needed, and namespacing with the module name is enough?

StefanKarpinski commented 6 years ago

Yeah, that's a fair point, it does parallel the existence of other parse functions elsewhere.

chakravala commented 6 years ago

Why is it that Meta does not allow you to import?

KristofferC commented 6 years ago
julia> module M
           parse(x::String) = x
       end
M

julia> parse("1")
1

julia> import M.parse
WARNING: ignoring conflicting import of M.parse into Main
chakravala commented 6 years ago

That example only has the conflict because parse(::String) has not been moved to Meta yet, or am I overlooking something? Once it is moved, it will no longer in Main, so it can then be imported

KristofferC commented 6 years ago

parse(T, str) will still exist.

KristofferC commented 6 years ago
julia> module M
           parse(I, guess, we, are, lucky, that, _I, _do, not, conflict, with, anything) = "Or do I???"
       end
M

julia> parse("1")
1

julia> import M.parse
WARNING: ignoring conflicting import of M.parse into Main

Instead of just pulling out facts from nowhere, please take the effort to check what you are saying, especially when it would take you 15 seconds in the REPL.

yuyichao commented 6 years ago

Not doing that is the whole point.

KristofferC commented 6 years ago

Well, now you have defined parse(::String) for everyone again which not to do is the point of the issue........

chakravala commented 6 years ago

Then I suppose there is no way of importing a function within a module without exporting it? Very well then, now the issue is more clear to me, I figured there could be a way around it.

yuyichao commented 6 years ago

Then I suppose there is no way of importing a function within a module without exporting it?

That's totally unrelated and is not the case.

yuyichao commented 6 years ago

And it's only impossible to extend a function without extending it. The name conflict also shouldn't be a big issue, you just need https://github.com/JuliaLang/julia/issues/24349#issuecomment-341447294 before we have https://github.com/JuliaLang/julia/issues/1255

KristofferC commented 6 years ago

Triage label to discuss the name a bit further.

The two competing options are:

StefanKarpinski commented 6 years ago

Triage says Meta.parse.

chakravala commented 6 years ago

And it's only impossible to extend a function without extending it.

Consider this: within the scope of a module Meta do a local import Base.parse that would have the effect of importing parse(T, str) only into the local module namespace, without having the side-effect of exporting this module's parse . then define parse(::String) in the Meta namespace.

Then, if you did import Meta.parse in Main, it would not cause a conflict.

The pre-requesite is having a local import ability that does not automatically export the name into the Main outside of the module. So, theoretically it is feasible.

yuyichao commented 6 years ago

No that's not how anything works in julia. Unlike C++, functions/methods in julia are not a bundle of unrelated definitions that happens to have the same name, when you define a method, you are aways adding that method to a specific object.

All imports are by definition local since that import only affect the namespace where the import statement is in (i.e. current module) but when you are extending a function you are extending that object you imported and it is visible to all users of that object. In another word, what you are saying is theoritically impossible.

chakravala commented 6 years ago

In another word, what you are saying is theoritically impossible.

It only remains impossible until management of the local import namespace is implemented. So not entirely impossible, just not currently possible due to what's missing.

All imports are by definition local since that import only affect the namespace where the import statement is in (i.e. current module) but when you are extending a function you are extending that object you imported and it is visible to all users of that object.

This contradicts itself. The import is not currently local, it seems to be global or not strictly local, since the function that is extended is visible globally or in a larger scope.

yuyichao commented 6 years ago

Your definition of local is wrong. Extending a method is totally independent of namespace.