JacquesCarette / Drasil

Generate all the things (focusing on research software)
https://jacquescarette.github.io/Drasil
BSD 2-Clause "Simplified" License
142 stars 26 forks source link

Change constDecDef implementation in Python #2179

Closed bmaclach closed 3 months ago

bmaclach commented 4 years ago

Currently constDecDef, GOOL's function for generating code for declaring and defining a constant, is implemented in Python by rendering just a plain variable declaration, since the concept of "constants" isn't actually supported by Python.

Since constDecDef is not actually doing what it is seems like it should in Python, we may want to change the implementation. We could:

  1. Have the Python instance of constDecDef throw an error.
  2. Follow the Python convention of naming variables that are supposed to be constant with all capital letters. In this case, the Python renderer would still generate regular variable definitions for constDecDef, but would rename the variable to be all caps.

I think I prefer option 2, because with option 1 we would not be able to generate Python code for any example where we have chosen Const as the ConstantRepresentation. Thoughts, @JacquesCarette and @smiths?

JacquesCarette commented 4 years ago

I agree with option 2. Including the renaming to follow Python conventions.

smiths commented 4 years ago

I like option 2. In fact, I think we should think of option 2 for all programming languages. The convention that constants are in ALL_CAPS seems to be a universal convention.

hrzhuang commented 1 year ago

I think there is a distinction to be made here between "constant" and "variable that is not to be mutated."

Examples of the former include mathematical and scientific constants such as $\pi$ and the Planck's constant, in addition to program-specific constants usually declared at the module level such as the length of a fixed-length array. It is generally idiomatic to name these in ALL_CAPS.

However, in C++ in particular, it is also idiomatic to declare as const any local variable that the programmer does not intend to mutate. Consider discriminant in the C++ code below.

std::pair<double, double> solve_quadratic(double a, double b, double c) {
  const double discriminant = b*b - 4*a*c;
  return { (-b + std::sqrt(discriminant)) / (2*a), (-b - std::sqrt(discriminant)) / (2*a) };
}

It would not be idiomatic to name these in ALL_CAPS in C++ or Python. As far as I'm aware, the idiomatic thing to do in Python is to simply declare an ordinary variable, and just not mutate it.

As I understand it, constDecDef relates to the second kind of "immutable variable," since it is a statement that is local to a function body. The first kind of "true constants" should in general occur at module level, and we would declare them using the constVar method of StateVarSym, as opposed to constDecDef.

So I think constDecDef simply generating an ordinary variable declaration in Python is actually completely idiomatic?

balacij commented 1 year ago

That's an interesting difference between naming conventions of constants in mathematics versus programming contexts you found! I never really thought about that. The SRS documents follow the mathematical/science conventions, while the generated code should follow the programming conventions. In programming, ...

Capital letters are used to emphasize that some variables are constants. Since the code currently works, it's not a 'technical issue,' but rather one psychological -- it just increases differentiability between constant and non-constant variables.

PEP8, the Python standard we want (#3221) to follow, requires constants be named with the snakecase uppercase style in particular. Since Python doesn't actually have constants (nor keywords, as you mentioned for C++), the naming convention is particularly useful (psychologically), albeit useless (technically).

This ticket might also apply to Java (final variables are not immutables -- reflection can be used to alter them). Maybe that should be a separate issue if the Java code isn't already in upper case too?

Since C++ has the const keyword, it's not as needed for differentiation between variables and constants compared to Python. However, I believe uppercase variable names are still conventional in C++, and we should similarly generate with uppercase symbols.

That being said, this ticket only discusses Python, so we only need to focus on Python. C++ (and Java) can be a separate issue.

Aside: a constant is just an immutable variable. A variable is a symbol with respect to a scope that may have a value.

balacij commented 1 year ago

One more (aside) issue: I think we generate constants in Python using a carriage class, which actually breaks PEP8 (see the link above), it should be defined on the module-level instead.

JacquesCarette commented 1 year ago

I guess we might need to differentiate between "forever" constants (like $pi$) and things that are constants within the context of the generated program's run-time. (Phrasing it that way also makes it likely that other 'kinds' of constants will appear, with their own notion of when they exist and are constant.)

Because some languages have conventions for naming certain kinds of constants, we're going to have to teach Drasil about this too, and make sure we treat the various kinds appropriately.

[This is fun! I'd never thought about context-sensitive notions of constant before, but now that the issue has been brought up, it makes sense.]

hrzhuang commented 1 year ago

Thank for @balacij for the clarification! I understand that my notions of "true constant" and "immutable variable" are the same on a technical level, but the distinction I'm trying to make is a semantic one. const is a tool for making a variable immutable, but what is our reason for using it? $\pi$ is const because it has one universally agreed-upon value that never changes, and discriminant in my example above is const because we want to do our best to remove ways in which we can shoot ourselves in the foot (by mutating something that doesn't need to be mutated in this case). Note that nothing about the semantics of discriminant is immutable: we can calculate it using mutation as follows.

double discriminant = b*b;
discriminant -= 4*a*c;

But we apply const to it because the calculation could be done without mutation, and that's generally a good practice.

Another way I look at this is through the lens of "scope of immutability." $\pi$ has the same value everywhere it is used, but discriminant takes on a different value for each invocation of solve_quadratic. I think when PEP 8 refer to "constants," they mean constants like $\pi$ (I take their comment that constants "are usually declared on a module level" as a clue). Another reason I think this is that if you consider how the solve_quadratic function would be implemented in Python, I don't think you would capitalize discriminant based on all of the Python code I have seen, some of which comes from talks given by core Python developers.

balacij commented 1 year ago

Constant QDefinitions are meant to be globally scoped constants, which is why they are expected to use the const keyword where possible, and all capital letters at least wherever else. The pure math drawn up by the SRSs don't really make a distinction. For the most part, all variables solved in the IM scheme are expected to be immutable (to my knowledge) once calculated/only assigned once. I have a ticket I have to write about the design of the solution we pull from the SRSs, I'll make sure to reference this later 😊 @JacquesCarette and @smiths can probably answer this better if they have time. Otherwise, I'm sure your questions will naturally reoccur soon.

Thank you, that's a good example. Regarding the issue of C++ and Python locally defined immutable variables, you're right that it's less common in C++.

Note that we're generating a program modelled from a pure, declarative solution schematic.

B-rando1 commented 6 months ago

I saw this issue has the newcomers category, but I couldn't quite follow where the discussion ended. Is there something in here that I can work on?

balacij commented 6 months ago

@B-rando1 and I discussed this in person, to sum up our discussion:

In the context of OO languages, there is a difference between a "constant" variable and an "immutable variable."

Conceptually, I think it's normal to assume they're the same concept (but that could just be me...) but, in OO, where references often exist, "constant" more so corresponds to (as @hrzhuang mentioned) "cannot be re-assigned," while immutable refers to both "cannot be re-assigned" and "its attributes [and their attributes and so on] cannot be altered." For example, global database references should refer to unique databases for specific purposes (e.g., "only store long-term things here [cold data], and use another for caching information [hot data]") and so we wouldn't want people to re-assign them, but we would want them to have an internal state (e.g., open or closed database connection) and at times that state should be mutable (e.g., if the database connection is closed unexpectedly, we can try to reconnect). So, in the context of OO languages, constant =/= immutable. However, this idea of "constant" is not only found in variables...

For example, when developing an OO program (in a "statically-typed" language), ...

In Java, "final" is (normally) the keyword that means "write once, and now (eager)" but can also mean "no more extension," while in C++ and others, "const" is used to mean "write once, and now (eager)." In both, "write once, and now (eager)" are good for declaring "unique and important" pieces of data that can still be mutated themselves. Personally, I think "final" (or even "immutable") is the more appropriate wording for "write once, and now (eager)" (or, like Rust, where variables are default immutable, using "mut" to mark one as mutable), and "inextensible" or "terminal" should probably be used to indicate that a class definition cannot be extended further. Furthermore, I think "constant" should be reserved for things that are constant in the mathematical sense (known and unchanging), and "mutability" should be reserved for contexts where we permit mutation, which I haven't experienced in mathematics.

Ok, now, returning to the conversation:

  1. As @smiths mentioned, it seems like a universal convention in programming languages -- global constants should all be named in an uppercase format. Furthermore, I think that global variables should refer to immutable data too (global mutatable, but "constant" variables complicate testing, readability, and maintenance a lot). So, in the previous sentence "global constants" means "global, immutable data."
  2. I'm not sure what the names should be, but in GOOL, we should probably be able to refer to each of these concepts (discussed above) individually without overloading "constant."
  3. As @hrzhuang mentioned, local "write once" variables should use the normal lowercase naming schemes (even if the variable refers to immutable data too).

What do we think?

On "next steps," I think we first need to survey what features GOOL has related to this and understand its meanings across supported languages against the above discussion (which I think is very doable now), and also survey where constDecDef and other definitions appear in Drasil.

... we might also consider splitting this ticket into a few :smile:

B-rando1 commented 5 months ago

@balacij you mentioned the possibility of "immutable classes", which would define the shape of immutable data, but you aren't aware of any languages that do this. I've been learning Julia, and by default its structs are immutable (and it sounds like Rust is similar in that all variables are immutable by default). That is, any primitive fields in the struct can only be set once, unless the mutable keyword is added when declaring the struct. The tricky part is that for structs holding other structs, only the reference is immutable, so you can have an immutable struct carrying a mutable struct, as below.

julia> mutable struct Unsafe
       num::Real
       end

julia> struct Safe
       unsafe::Unsafe
       end

julia> a = Safe(Unsafe(5))
Safe(Unsafe(5))

julia> a.unsafe
Unsafe(5)

julia> a.unsafe.num
5

julia> a.unsafe.num = 7
7

The last line would have thrown an error if Unsafe was considered immutable in that context.

Anyway, I guess we might want to consider if we should add an "immutable class" constructor to GOOL (or do it some other way) in order to take advantage of this feature of Julia.

JacquesCarette commented 5 months ago

Anyway, I guess we might want to consider if we should add an "immutable class" constructor to GOOL (or do it some other way) in order to take advantage of this feature of Julia.

We might be forced to. This is how a number of features appeared in GOOL: we added a new language that enforced a PL concept that was, at best, previously optional. The various concepts already mentioned by @balacij (immutable, write-once, etc) end up being made explicit in GOOL because the various target languages have their own rules about these things.

The tricky part is that most languages deal with these concepts to various degree of implicitness (but in incompatible ways), so that in GOOL, we have no choice but to be very explicit.

samm82 commented 5 months ago

Going back to the "original" issue of constants in Python, it seems that there may actually be a way to implement constants ("read-only" values) in Python: https://realpython.com/python-constants/#defining-strict-constants-in-python

balacij commented 5 months ago

I will certainly remember the @property decorator for the future, thanks @samm82! That would be handy if/when we decide to build immutable objects in Python. I don't think it's a very common pattern for Python namespaces/global constants, but it could certainly work for them too.

B-rando1 commented 3 months ago

I did some digging around to see what it would take to auto-capitalize constants in Python. On the surface it seems pretty easy: just change the capitalization for constDecDef and constant, and in constDecDef add an additional check to make sure that the capitalized name doesn't conflict with anything. I got that working in #3858, and it actually works perfect in the GOOL tests.

I ran into a couple larger issues in drasil-code, though.

Is this still a change we want to make? It'll take a little bit of work, but it seems doable.

B-rando1 commented 3 months ago

I thought about it some more and decided to try a bit more. It actually was quite a bit easier to make the necessary changes (see them in #3858) than I expected. I think on the whole this change is for the better.

Some things to note:

JacquesCarette commented 3 months ago

So how close are we to being able to close this?

samm82 commented 3 months ago

While the Python renderer does check to see if making a constant's name uppercase would conflict with any existing variable names, it does not check against variable names declared after the constant. So it's still not perfectly safe. With that said, I think we do have all the tools we need to put in more rigorous checking for name conflicts, which would enable greater safety.

@balacij's comment on #806 suggested that a Stateful renderer would help in generating references in the order used in the document. I wonder if a Stateful renderer could help here as well to be able to deal with potential variable conflicts across ALL variables 🤔

B-rando1 commented 3 months ago

@JacquesCarette the original purpose of this issue (capitalizing constants in Python) will be fulfilled once #3858 is passed. There may be a few things we want to change with the fix (i.e. better name conflict detection, throwing warnings rather than errors, and enabling GOOL to know how it renamed a variable without making assumptions). The fix in its current form might be good enough for now, though - it works with everything we have so far.

As for the 'different kinds of constants' discussion we dug up (summarized well by @balacij's comment above), there is still a lot of work to be done. That discussion is enough for its own issue though, so I can open one in the next day or two.

Should I make #3858 close this issue? There are a few 'rough edges' to the fix that should be improved upon at some point. It might be better to make separate issue(s) for those though, as they seem to have applications beyond this issue.

JacquesCarette commented 3 months ago

Yes, please make #3858 close this issue.