sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.43k stars 478 forks source link

Remove the "-r" option in calling gap in the pexpect interface #27878

Open kiwifb opened 5 years ago

kiwifb commented 5 years ago

Currently the pexpect interface to gap call gap with the "-r" option. From sage/interfaces/gap.py

gap_cmd = "gap -r"

This is a remnant of gap 4.8 and under where it enabled gap to be started in a quieter mode if I am not mistaken. There is no visible effect in starting gap with or without "-r" in gap 4.10 and over.

fbissey@moonloop ~ $ gap
 ┌───────┐   GAP 4.10.1 of 23-Feb-2019
 │  GAP  │   https://www.gap-system.org
 └───────┘   Architecture: amd64
 Configuration:  gmp 6.1.2, readline
 Loading the library and packages ...
 Packages:   AClib 1.3.1, Alnuth 3.1.0, AtlasRep 1.5.1, AutoDoc 2019.02.22, AutPGrp 1.10, CRISP 1.4.4, Cryst 4.1.18, CrystCat 1.1.8, CTblLib 1.2.2, FactInt 1.6.2, FGA 1.4.0, 
             GAPDoc 1.6.2, IO 4.5.4, IRREDSOL 1.4, LAGUNA 3.9.2, Polenta 1.3.8, Polycyclic 2.14, PrimGrp 3.3.2, RadiRoot 2.8, ResClasses 4.7.1, SmallGrp 1.3, Sophus 1.24, 
             TomLib 1.2.7, TransGrp 2.0.4, utils 0.61
 Try '??help' for help. See also '?copyright', '?cite' and '?authors'
gap> 
fbissey@moonloop ~ $ gap -r
 ┌───────┐   GAP 4.10.1 of 23-Feb-2019
 │  GAP  │   https://www.gap-system.org
 └───────┘   Architecture: amd64
 Configuration:  gmp 6.1.2, readline
 Loading the library and packages ...
 Packages:   AClib 1.3.1, Alnuth 3.1.0, AtlasRep 1.5.1, AutoDoc 2019.02.22, AutPGrp 1.10, CRISP 1.4.4, Cryst 4.1.18, CrystCat 1.1.8, CTblLib 1.2.2, FactInt 1.6.2, FGA 1.4.0, 
             GAPDoc 1.6.2, IO 4.5.4, IRREDSOL 1.4, LAGUNA 3.9.2, Polenta 1.3.8, Polycyclic 2.14, PrimGrp 3.3.2, RadiRoot 2.8, ResClasses 4.7.1, SmallGrp 1.3, Sophus 1.24, 
             TomLib 1.2.7, TransGrp 2.0.4, utils 0.61
 Try '??help' for help. See also '?copyright', '?cite' and '?authors'
gap> 

The only effect is more subtle. ~/.gap and its equivalent on non linux system is not added to the list of paths where gap looks for packages. This means in turn that the option ignore_dot_gap in the function all_installed_packages (in sage/tests/gap_packages.py) is meaningless when a pexpect gap instance is selected. And a doctest introduced in #27681 will fail if you have something in ~/.gap.

Component: interfaces

Author: François Bissey

Branch/Commit: u/fbissey/gap_r @ 474b365

Reviewer: Steven Trogdon

Issue created by migration from https://trac.sagemath.org/ticket/27878

kiwifb commented 5 years ago

Description changed:

--- 
+++ 
@@ -30,4 +30,4 @@
  Try '??help' for help. See also '?copyright', '?cite' and '?authors'
 gap> 

-The only effect is more subtle. ~/.gap and its equivalent on non linux system is not added to the list of paths where gap looks for packages. This means in turn that the option ignore_dot_gap in the function all_installed_packages is meaningless when a pexpect gap instance is selected. And a doctest introduced in #27681 will fail if you have something in ~/.gap. +The only effect is more subtle. ~/.gap and its equivalent on non linux system is not added to the list of paths where gap looks for packages. This means in turn that the option ignore_dot_gap in the function all_installed_packages (in sage/tests/gap_packages.py) is meaningless when a pexpect gap instance is selected. And a doctest introduced in #27681 will fail if you have something in ~/.gap.

kiwifb commented 5 years ago

Commit: 8958e4d

kiwifb commented 5 years ago

Branch: u/fbissey/gap_r

kiwifb commented 5 years ago
comment:2

Trivial change.


New commits:

8958e4dremove extraneous -r from gap_cmd
kiwifb commented 5 years ago

Author: François Bissey

strogdon commented 5 years ago

Reviewer: Steven Trogdon

strogdon commented 5 years ago
comment:3

Here I have

$ ls -R ~/.gap
/home/volj/27/strogdon/.gap:
pkg

/home/volj/27/strogdon/.gap/pkg:
AtlasRep

/home/volj/27/strogdon/.gap/pkg/AtlasRep:
datagens  dataword

/home/volj/27/strogdon/.gap/pkg/AtlasRep/datagens:

/home/volj/27/strogdon/.gap/pkg/AtlasRep/dataword:

The doctest ./sage -t --long src/sage/tests/gap_packages.py, with this branch, passes which would fail without the fix.

Hopefully the positive review will prompt someone to notice if there is any side effect.

dimpase commented 5 years ago
comment:4
$ gap -h
...
 -r                           disable/enable user GAP root dir
                              GAPInfo.UserGapRoot

also

  The  root  directories  are  specified via one or several of the -l paths command line options, see 3.1. Furthermore, by default GAP automatically prepends a user
  specific GAP root directory to the list; this can be avoided by calling GAP with the -r option. The name of this user specific directory depends on your operating
  system, it can be found in GAPInfo.UserGapRoot. This directory can be used to tell GAP about personal preferences, to always load some additional code, to install
  additional packages, or to overwrite some GAP files. See 3.2 for more information how to do this.
embray commented 5 years ago
comment:5

I don't quite understand the intention behind this change. When using the pexpect interface to GAP, which much of Sage still (sadly) relies on (I am making progress fixing that) we want the -r flag because we don't want random stuff a user might have in their ~/.gap interfering with Sage functionality.

embray commented 5 years ago
comment:6

ISTM the problem is not that we pass -r in the pexpect interface, but that we aren't passing -r to the libgap interface.

kiwifb commented 5 years ago
comment:7

My problem with putting -r is that we are removing some capabilities from gap. Of course for other packages where the user can make changes the custom space is moved, usually by environment variables, to .sage. And we aren't quite able to do that with gap. Anyway the big concern is for protect the user from themselves and limit our debugging. I am not in favor of protecting the user from themselves to that degree, but we should have a way to a safe mode for debugging..

nthiery commented 5 years ago
comment:8

Indeed, as a user, I would be annoyed if I could not use from Sage the GAP packages I installed in my gap directory (e.g. by GAP's PackageManager). The GAP devs have decided to include the user's .gap directory by default; I would tend to follow that decision to have a GAP that's as close as possible from a normal gap.

That being said, maybe we should indeed set -r whenever running Sage's test, to make them less context-dependent? With the annoying side effect that behavior may change between tests and normal usage.

vbraun commented 5 years ago

Changed branch from u/fbissey/gap_r to 8958e4d

vbraun commented 5 years ago

Changed commit from 8958e4d to none

embray commented 5 years ago
comment:12

As I wrote on the other ticket (#27913):

I believe the opposite should be the case: By default the GAP interfaces in Sage should ignore a user .gap because it could potentially interfere with the functionality of Sage itself.

The problem is there is a tension here between Sage embedding GAP for its own uses, and users of Sage wanting to use GAP from within Sage and have it work the same as if they used GAP directly.

I think for the latter case, users should initialize their own GAP via the pexpect interface, and perhaps we should change things so that it's not necessarily the same GAP that Sage is using internally.

In the latter case I've also made very good progress getting the pexpect interface out of Sage's internal use cases (in this case for permutation groups) and porting it to libgap. I'll have some tickets for that soon; I've just been very otherwise occupied.

embray commented 5 years ago
comment:13

In any case, while this clearly needs to be discussed further the right answer is definitely not to just disable use of -r completely, as that is only likely to break more things when testing.

I agree with nthiery that one way or other ignoring the user's .gap should definitely be the default when running the test suite. The rest is debatable.

nthiery commented 5 years ago
comment:14

I used the occasion of being in St Andrews to discuss the matter with GAP developers (namely Alex Konovolov and Michael Torpey). When running GAP tests, they disable the .gap directory with -r and suggest, as we discussed, to do the same in Sage. Otherwise, they recommend to not set the option -r by default. Here is their rationale:

Cheers,

kiwifb commented 5 years ago
comment:15

We may want to have an option to inject -r for diagnostics. Which to my mind is one of its use. It gives you a rescue mode. If things works once you enter that mode then you know you messed up your .gap folder.

gap is behaving a bit differently than say R where you have to explicitly load packages rather than them being loaded, mostly, on presence. This is more of a question for upstream I guess.

nthiery commented 5 years ago
comment:16

Replying to @kiwifb:

We may want to have an option to inject -r for diagnostics. Which to my mind is one of its use. It gives you a rescue mode. If things works once you enter that mode then you know you messed up your .gap folder.

Indeed! I forgot to mention that the GAP dev recommended that too.

gap is behaving a bit differently than say R where you have to explicitly load packages rather than them being loaded, mostly, on presence. This is more of a question for upstream I guess.

Good point. Same for Python packages: nothing happens until you explicitly import them. I'll chat with Michael Torpey tomorrow about when GAP packages installed with PackageManager shall be autoloaded or not.

nthiery commented 5 years ago
comment:17

Feedback from Alex and Michael: it used to be that GAP package authors could control whether their package was loaded automatically, and many if not most did. Nowadays, the normal behavior is not to autoload (like in e.g. Python or R), unless GAP core or some other autoloaded package requires it. So most recent packages don't autoload. But many old packages still autoload, for backward compatibility.

kiwifb commented 5 years ago
comment:18

That's nice. So it we can expect things to settle down slowly on that front. We just have to bear it for a little while. So now we need a mechanism to insert -r on request. There is already some mechanism for some other options I think, so it should just be modeled on them.

kiwifb commented 5 years ago
comment:19

OK wanting some feedback on this. So I still make the default to be run without -r but I introduce an option no_dotgap which default to False to enable it. I also fixed the doctest that triggered the issue in the first place so that ~/.gap is always ignored in that test.


New commits:

8958e4dremove extraneous -r from gap_cmd
cab2211Add an option to restore -r and document it
474b365Make the all_installed_package test ignore .gap content.
kiwifb commented 5 years ago

Changed branch from 8958e4d to u/fbissey/gap_r

kiwifb commented 5 years ago

Commit: 474b365

nthiery commented 5 years ago
comment:20

Thanks François! This sounds very reasonable to me. There remains to force -r whenever running doctests, as discussed above.

For the option configuration, I am hesitating between two approaches. With the current commits, we abstract away from the user the specifics of how to tell gap to ignore .gap. This has some merit.

An alternative would be to provide a mean for the user to configure the arguments to the gap command; something like gap_options = '-r', '-t', .... And to put in the documentation an example: "if you want GAP to ignore the .gap directory, you can use its -r option". This has the merit of being more flexible, and to empower GAP users that may already know of this option or others.

What do you think?

kiwifb commented 5 years ago
comment:21

I am not sure. First we are only talking about the pexpect interface in this ticket, Erik already mentioned there is currently no way to force libgap to ignore ~/.gap. In the pexpect interface we already feed gap many options to get exactly what we want and a quiet interface - I was looking at the code in details earlier today. So dabbling with options is really reserved to people who know what they are doing.

Lastly, I introduced an option because this is something we want to have some programmatic control over. But it could have been achieved some different way. The documentation just over the one I introduced for no_dotgap reads

Use this code to change which GAP interpreter is run. E.g.,
::
       import sage.interfaces.gap
       sage.interfaces.gap.gap_cmd = "/usr/local/bin/gap"

This could be used to pass any options to gap

sage.interfaces.gap.gap_cmd = "gap -r -X -Y -myweirdoption"

rather than a gap in a different path.

embray commented 5 years ago
comment:22

Principle of less surprise: keep GAP in Sage as similar as possible to GAP outside of Sage.

You have to be clear here what you mean by "GAP in Sage". The GAP interpreter included in the Sage distribution should work as similarly as possibly to a GAP distributed by the GAP project. I.e. when you run gap directly (I'll call this case A).

Then there's the use of GAP within the Sage library itself which as I've mentioned has competing use-cases: the GAP interface in Sage used directly by users as an interface to GAP without leaving the Sage interpreter (B), and then there's GAP used directly by Sage for internal use (C).

This principle applies to A and B, but emphatically not for C (even if there are ostensibly benefits to doing so, such as packages that provide alternative implementations of some algorithms that happen to be used by Sage; if that capability exists it should be enabled explicitly through Sage).

I think that the gap global made available to users in Sage should not be the same as that used internally by Sage. I think a lot of problems like this can be resolved cleanly by not crossing these use-cases where it isn't necessary to.

embray commented 5 years ago
comment:23

If I had my way I would also be more aggressive about controlling exactly what GAP packages are loaded by the GAP interpreter used by Sage, because my experience has been that random GAP packages being loaded can lead to very difficult to trace bugs or differences in results.

embray commented 5 years ago
comment:24

Replying to @nthiery:

Feedback from Alex and Michael: it used to be that GAP package authors could control whether their package was loaded automatically, and many if not most did. Nowadays, the normal behavior is not to autoload (like in e.g. Python or R), unless GAP core or some other autoloaded package requires it. So most recent packages don't autoload. But many old packages still autoload, for backward compatibility.

Yep--in particular autoloading of the xgap package has been a real problem for Sage's pexpect interface.

embray commented 5 years ago
comment:25
diff --git a/src/sage/interfaces/gap.py b/src/sage/interfaces/gap.py
index a7a9fab..43c957b 100644
--- a/src/sage/interfaces/gap.py
+++ b/src/sage/interfaces/gap.py
@@ -150,6 +150,12 @@ Use this code to change which GAP interpreter is run. E.g.,
        import sage.interfaces.gap
        sage.interfaces.gap.gap_cmd = "/usr/local/bin/gap"

+Disabling the use of ~/.gap - no personal packages or settings used.
+
+::
+       import sage.interfaces.gap
+       sage.interfaces.gap.no_dotgap = True
+

I might be missing something (or maybe this is only a proposal?) but this doesn't look right.

The instructions above would only create a global variable no_dotgap in the sage.interfaces.gap module, which isn't used. Instead, one would call something like:

sage: g = Gap(no_dotgap=True)

to initialize a GAP interface with that option set.

I'm also not sure about the name no_dotgap. I've been trying (and sometimes it's hard) to avoid predicates with a negative proposition, which leads to confusing double-negatives (no_dotgap=False to enable the .gap directory). I would instead call it something like enable_dotgap=True (and enable it by default). Therefore any Gap() initialized by a user would use .gap which I absolutely agree is the desirable default behavior.

Only the GAP interface used internally for Sage would set enable_dotgap=False).

For libgap we're in a bit of a bind, but I think that the most important use case for the libgap interface in Sage is for interfacing with GAP internally (to replace the pexpect interface as much as possible), so this should disable .gap by default.

kiwifb commented 5 years ago
comment:26

Replying to @embray:

Replying to @nthiery:

Feedback from Alex and Michael: it used to be that GAP package authors could control whether their package was loaded automatically, and many if not most did. Nowadays, the normal behavior is not to autoload (like in e.g. Python or R), unless GAP core or some other autoloaded package requires it. So most recent packages don't autoload. But many old packages still autoload, for backward compatibility.

Yep--in particular autoloading of the xgap package has been a real problem for Sage's pexpect interface.

Now that's a painful flashback from before you started being involved with sage Erik. It took me a while to figure out what was happening to my sage-on-gentoo install where I had installed the whole gap distribution.

I'll concede that your point (C) sounds reasonable but has the after taste of the package you customise specifically for sage (or some other software).

Off course we want to move all of the use of point (C) to libgap I would imagine , which leaves the pexpect interface at your point (B). But having upstream giving you the tools to get to (C) would be nice.

I'll note that gap packages have the concept of recommended packages. Those are not dependencies but considered helpful but automatically loaded on presence which is harmful in our case (xgap is recommended in several packages).

kiwifb commented 5 years ago
comment:27

Replying to @embray:

diff --git a/src/sage/interfaces/gap.py b/src/sage/interfaces/gap.py
index a7a9fab..43c957b 100644
--- a/src/sage/interfaces/gap.py
+++ b/src/sage/interfaces/gap.py
@@ -150,6 +150,12 @@ Use this code to change which GAP interpreter is run. E.g.,
        import sage.interfaces.gap
        sage.interfaces.gap.gap_cmd = "/usr/local/bin/gap"

+Disabling the use of ~/.gap - no personal packages or settings used.
+
+::
+       import sage.interfaces.gap
+       sage.interfaces.gap.no_dotgap = True
+

I might be missing something (or maybe this is only a proposal?) but this doesn't look right.

The instructions above would only create a global variable no_dotgap in the sage.interfaces.gap module, which isn't used. Instead, one would call something like:

sage: g = Gap(no_dotgap=True)

to initialize a GAP interface with that option set.

I'm also not sure about the name no_dotgap. I've been trying (and sometimes it's hard) to avoid predicates with a negative proposition, which leads to confusing double-negatives (no_dotgap=False to enable the .gap directory). I would instead call it something like enable_dotgap=True (and enable it by default). Therefore any Gap() initialized by a user would use .gap which I absolutely agree is the desirable default behavior.

Only the GAP interface used internally for Sage would set enable_dotgap=False).

For libgap we're in a bit of a bind, but I think that the most important use case for the libgap interface in Sage is for interfacing with GAP internally (to replace the pexpect interface as much as possible), so this should disable .gap by default.

Proposal at this stage. I must say I have been a bit confused and modeled my documentation on the wrong model now that I look back at it.

The name. At first I was going for use_dotgap (similar to enable_dotgap) but I had to use it negatively which I didn't like at all. What about disable_dotgap?

kiwifb commented 5 years ago
comment:28

I am trying to answer too fast. I see you are trying to avoid negative settings and I can understand that. I guess we just have to bear the negative use internally.

embray commented 5 years ago
comment:29

I am testing an alternative based on my comments.

embray commented 5 years ago
comment:30

Replying to @kiwifb:

I am trying to answer too fast. I see you are trying to avoid negative settings and I can understand that. I guess we just have to bear the negative use internally.

Yeah; the sense of the command-line parameter is negative, so this can be documented internally; but that doesn't mean the user-interface in Sage needs to be as well :)

embray commented 5 years ago
comment:31

Replying to @kiwifb:

Replying to @embray:

Replying to @nthiery:

Feedback from Alex and Michael: it used to be that GAP package authors could control whether their package was loaded automatically, and many if not most did. Nowadays, the normal behavior is not to autoload (like in e.g. Python or R), unless GAP core or some other autoloaded package requires it. So most recent packages don't autoload. But many old packages still autoload, for backward compatibility.

Yep--in particular autoloading of the xgap package has been a real problem for Sage's pexpect interface.

Now that's a painful flashback from before you started being involved with sage Erik. It took me a while to figure out what was happening to my sage-on-gentoo install where I had installed the whole gap distribution.

I'll concede that your point (C) sounds reasonable but has the after taste of the package you customise specifically for sage (or some other software).

Off course we want to move all of the use of point (C) to libgap I would imagine , which leaves the pexpect interface at your point (B). But having upstream giving you the tools to get to (C) would be nice.

Indeed it would be nice; but it's admittedly not easy. I've been watching the efforts over the last few years to make multiple parallel embedded Python interpreters possible from libpython, and while it mostly works now it was not a trivial effort. I think it's maybe a little easier for GAP than it was for Python, especially with HPC-GAP, where most of the important interpreter globals have been moved into thread-local variables. But it's still non-trivial, especially when you involve things like C extension modules :/

It's something I would gladly help work on if I had the time to do so, which I don't (and I doubt most of the GAP developers do either, even if it would likely be a desirable goal).

embray commented 5 years ago
comment:32

The other complication to this that I haven't addressed yet is that GAP should probably also load a different workspace (if any) for when it's running with or without the .gap directory loaded, or this could lead to even more confusing inconsistencies :/

Perhaps the arguments passed when starting GAP should be hashed into the hash in the workspace filename, or something? I'm not confident about this part as I still don't fully understand how GAP workspaces work (just enough to know that they can definitely be influenced heavily by what pakages and globals are loaded).

embray commented 5 years ago
comment:33

Replying to @embray:

Principle of less surprise: keep GAP in Sage as similar as possible to GAP outside of Sage.

I think that the gap global made available to users in Sage should not be the same as that used internally by Sage. I think a lot of problems like this can be resolved cleanly by not crossing these use-cases where it isn't necessary to.

I see now, after experimenting with this, that there is a technical challenge to it (I still think it is the right approach in principle). The problem is that although it is possible to have multiple Gap interpreter instances, much of the code in Sage makes inconsistent assumptions that you are only using the "main" instance, currently the global variable named gap.

For example, SageObjects can have both a _gap_ method, and a _gap_init_ method. The former returns the object's GAP representation wrapped in a GapElement. It can take an optional Gap instance as an argument: The GAP interpreter instance in which to create the element. The latter, _gap_init_, only returns a string of GAP code to be evaluated in the GAP interpreter to instantiate the object. This is consistent with how other interfaces work.

The problem (or one of them) is that _gap_init_ does not take an optional Gap instance, and just always assumes you're using the "main" gap instance. This can lead to all kinds of inconsistencies, as sometimes the _gap_init_ methods themselves query the Gap instance (e.g. what is the current $sage variable number) for details on how to evaluate the object. So if you use one Gap instance in the _gap_init_ call, but then pass the result of that call to a different Gap instance it is not guaranteed to work.

For this reason, among others, I would propose that we keep the current status quo w.r.t. disabling .gap by default for now, but add the enable_dotgap option, so that it's easy to create a new Gap instance that uses it. Then change the default behavior only once further refactoring is done to maintain better separation between the (B) and (C) use cases.

kiwifb commented 5 years ago
comment:34

All right. I think it is also important to have the doctest that started it all to be fixed quickly.

nthiery commented 5 years ago
comment:35

Replying to @embray:

For libgap we're in a bit of a bind, but I think that the most important use case for the libgap interface in Sage is for interfacing with GAP internally (to replace the pexpect interface as much as possible), so this should disable .gap by default.

In my research code (and in packages such as sage-semigroups), I really need to be using libgap (rather than a pexpect interface) for efficiency. I believe that's a common use case.

I see the rationale for aiming two separate gap instances, an incorruptible one for internal use, and a more flexible one for direct access by users. I don't know whether that's technically feasible at this stage though with libgap. Also we still want to be able to pass gap objects efficiently from internal code to user code.

dimpase commented 5 years ago
comment:36

the sooneer a complete switch to libgap happens in sagelib, the better.

I don't quite understand what "pass gap objects efficiently from internal code to user code" means here. IMHO this all fine in libgap, no?

nthiery commented 5 years ago
comment:37

the sooneer a complete switch to libgap happens in sagelib, the better.

Yes, indeed!

I don't quite understand what "pass gap objects efficiently from internal code to user code" means here. IMHO this all fine in libgap, no?

Currently yes, indeed. Now assume we were to have two separate instances of libgap, one for sagelib and one user-facing, to protect the former from corruption by e.g. user packages; then, it may become costly to move one object from one instance to the other.

dimpase commented 5 years ago
comment:38

I'm not sure many instances of libgap loaded as python extensions are technically feasible; this seem to mean that one would have to mangle names in the dlls to avoid clashes etc.

What might be feasible are several libgap workspaces.

embray commented 5 years ago
comment:39

Tickets still needing working or clarification should be moved to the next release milestone at the soonest (please feel free to revert if you think the ticket is close to being resolved).

kiwifb commented 5 years ago
comment:40

I am opening a new ticket to fix the doctest in gap_packages under all scenario #28233. Let's keep this ticket for the serious stuff about how we are supposed to run gap.