chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org

chpl foo.chpl -> what executable? #6283

Closed mppf closed 6 years ago

mppf commented 7 years ago

Today, if you compile a .chpl file without providing a -o argument, you get a.out.

// foo.chpl
writeln("Hi");
$ ls
foo.chpl
$ chpl foo.chpl

$ ls
a.out    foo.chpl

I think this is an archaic choice (the name 'a.out' refers to the binary format that preceded ELF). It seems better to name the executable foo in this case. A little survey:

Should we change the compiler to emit foo in this case instead of a.out?

mppf commented 7 years ago

When a module name doesn't match the .chpl file name it's in, we could:

When more than one .chpl file is provided, we could:

I think the last of these options is reasonable and principled. Additionally, that choice seems to imply to me that, in the event that the module name does not match the file name, the executable name would be chosen based on the module name rather than the file name. In other words, I'm saying a good straw-man here is:

bradcray commented 7 years ago

Yeah, I think if we move away from ./a.out, using the "main module" name (whether set with the flag or inferred using the compiler's heuristics) over the filename is the right move to make. For most cases in which the user doesn't name their module explicitly, that should give non-surprising behavior, and in all other cases, I think it's the most reasonable / least surprising thing.

I was thinking that we actually started out here at the beginning of time and then moved to ./a.out at some point later, but I'm not seeing any evidence of that in the commit log with a quick look. Maybe it was before we were using SCM or it was an idea that didn't find its audience at the time.

jeffhammond commented 7 years ago

This change might have negative consequences if programmers have files with the targeted name that are not the binary. If you play it safe and never overwrite an existing file, then you can't update existing binaries with make. If you don't play it safe, you may destroy any number of files that happen to have the name of the module. While this might be a poor choice by the programmer, I can imagine a reasonable pattern where foo is a script that invokes a.out with some pre and post operations (I know of a quantum chemistry code like this, although the actual binary is not a.out but foo.x).

I can also argue that anyone whose workflow relies on the generation of a.out deserves a PDP-11 😜

ty1027 commented 7 years ago

Personally, I slightly prefer "a.out" because I have used it for a very long time (and my various tools are customized around it). But if the compiler allows using an arbitrary executable name via option (e.g., -o a.out), I could keep using "a.out" if necessary (or switch to the new style of using the file name), so either is OK for me 👍

bradcray commented 7 years ago

if the compiler allows using an arbitrary executable name via option (e.g., -o a.out)

It does (and if we went this route, would definitely continue to), and like all 'chpl' arguments, it has an environment variable equivalent. So if you were willing to set the environment variable to "a.out" in your .dotfiles/configuration files, you wouldn't even need to throw any extra flags.

mppf commented 7 years ago

A note from Jason Riedy on the mailing list (included here for completeness):

As a data point from a non-github user, both go and ghc also behave this way. I was surprised by go given its heritage, so its change is telling.

Using something other than the file name may be a pain for makefile rules.

ben-albrecht commented 6 years ago

I found this example to be surprising:

// ABC.chpl

module A { }

module B { }

module C {
  proc main() { }
}

I expected the binary to be named ABC after the filename, but instead I got C, which was named after the main module.

Is this the intended behavior, or should I file an issue?

bradcray commented 6 years ago

This is the intended behavior. The binary is named after the main module -- the only reason it typically seems to be named after the file is that when there is no explicitly declared module in a file, the filename becomes the module name. I believe the rationale for this is (a) it's completely unambiguous, (b) large programs are composed of many files, so which one would you choose?, and (c) why did you call your main module C if you didn't want to name your binary that?
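To make the rule concrete, here is a toy Python model of it. It is only a sketch of the behavior described in this thread, not how the chpl compiler implements it; the helper name executable_name and its parameters are made up for illustration.

```python
from pathlib import Path
from typing import Optional


def executable_name(source_file: str, main_module: Optional[str] = None) -> str:
    """Toy model of the naming rule described above (hypothetical helper).

    main_module is the name of the explicitly declared module that
    contains main(), or None when the file declares no module, in which
    case the file name (minus its extension) serves as the module name.
    """
    if main_module is not None:
        # An explicitly declared main module decides the binary name.
        return main_module
    # No explicit module: the file name becomes the module name.
    return Path(source_file).stem


print(executable_name("foo.chpl"))       # foo
print(executable_name("ABC.chpl", "C"))  # C
```

The second call mirrors the ABC.chpl example: the file name plays no role once the main module is declared explicitly.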

(Real men hardcode the output filename to be a.out using CHPL_ settings of course... :) )

ben-albrecht commented 6 years ago

I see this was discussed earlier, now that I've reread the comment more carefully:

For most cases in which the user doesn't name their module explicitly, that should give non-surprising behavior, and in all other cases, I think it's the most reasonable / least surprising thing.

I think this was more surprising for me from the perspective of compiling code I have not looked at (a test) and trying to run the executable (named something different than the filename). Not sure if this is an important use-case to consider, but just another data point, I suppose.

mppf commented 6 years ago

@ben-albrecht - it might be reasonable to have a warning in that case, but when I've run into this error I have also been momentarily surprised but the impact was only a few seconds confusion.

If we find we are compiling and running things a lot, maybe we should add a --execute compilation flag that runs the program after compiling it. Maybe that's an alternative way to address the issue you ran into.

ben-albrecht commented 6 years ago

it might be reasonable to have a warning in that case

That sounds reasonable, unless there is a case where this pattern is useful - then it becomes an annoyance for anyone using it. Is this a pattern we believe is never viable?

but when I've run into this error I have also been momentarily surprised but the impact was only a few seconds confusion.

I would describe my surprise as more than momentary. Though, it was compounded by the fact that all the tests in the directory were compiling to the same binary name, because they were testing module visibility.

bradcray commented 6 years ago

A modification to the current behavior that has been discussed is whether we should name the binary not after the main module itself, but rather after the file that contains the main module. Thus:

MyProgram.chpl:

module MyModule {
  proc main() { ... }
}
$ chpl MyProgram.chpl

would yield a binary MyProgram rather than MyModule. This may take care of some of the cases of confusion that have been reported or hypothesized in the current approach. One question about it might be what the behavior would be if/when we add a Chapel-level equivalent of include. Also it's not entirely clear to me whether this approach is better than the current approach once the user learns the rules. Is there a precedent for it?
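The two candidate rules can be compared side by side with a small Python sketch. This is a toy model under stated assumptions, not compiler code: the function names and the dictionary that maps each file to the modules it declares are hypothetical.

```python
from pathlib import Path
from typing import Dict, List


def name_after_main_module(main_module: str) -> str:
    """Current behavior as described in this thread: use the module name."""
    return main_module


def name_after_containing_file(files: Dict[str, List[str]], main_module: str) -> str:
    """Proposed alternative: use the file that declares the main module."""
    for path, modules in files.items():
        if main_module in modules:
            return Path(path).stem
    raise ValueError(f"no file declares module {main_module!r}")


# Hypothetical input: file name -> modules declared in that file.
files = {"MyProgram.chpl": ["MyModule"]}

print(name_after_main_module("MyModule"))             # MyModule
print(name_after_containing_file(files, "MyModule"))  # MyProgram
```

Under the alternative, several test files that each declare a main module with the same name would no longer produce the same binary name, since each binary would follow its own file name.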