sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.43k stars 479 forks source link

deprecate pexpect interface to GAP #26963

Open embray opened 5 years ago

embray commented 5 years ago

Once #26902 is complete, and Sage itself has no internal dependency on the pexpect interface to GAP, it should probably be deprecated entirely.

Completing work on #26902 will likely expose additional, previously unknown deficiencies in the libgap interface, but once those are all resolved there will be little (if any?) advantage to using the pexpect interface.


Here's a proposed strategy for deprecating the gap pexpect interface (henceforth referred to as "gapx" to avoid confusion) and eventually making the libgap interface the default gap interface:

  1. Rename the global variable gap to gapx. I chose the name gapx because it is reminiscent of "gap ex" and also "gap peXpect", but not "xgap" because that is too easily confused with the xgap interface.

  2. The gap global will, for now, still be the pexpect interface, but will produce a deprecation warning stating that in the next version gap will become the libgap interface, and also link to a code porting guide.

  3. After this deprecation period, change the gap global to be the libgap interface (keeping gapx for the pexpect interface). Leave libgap as an alias for gap but don't deprecate it.

?. Do we do anything with the _gap_ and _libgap_ magic methods? I think they should be renamed similarly _gap_ -> _gapx_ but this could be harder to manage.

Depends on #26902

Component: interfaces

Keywords: gap libgap

Issue created by migration from https://trac.sagemath.org/ticket/26963

simon-king-jena commented 5 years ago
comment:1

Note that there may be non-library code relying on the pexpect GAP interface. The current "official" version of p_group_cohomology is an example.

For the next version 3.1, I am moving from pexpect to libgap. Actually I don't recall my original motivation for that change. However, from experience, I can tell (where GAP denotes that software and gap denotes the pexpect interface to GAP):

Concerning string evaluation: A complex piece of code in Sage would often involve tight loops in which myriads of temporary pexpect elements are created, manipulated, and deleted. Creation and deletion costs time; it is better to have one pexpect element tmp and change the underlying data that are stored in GAP/Singular/... in the variable called tmp.name(). In addition, manipulation of said temporary objects is automatically translated into string evaluation --- and very often a carefully written string-to-be-evaluated is better than what is automatically created.

With libgap, efficiency is no more a reason for using string evaluation. That's yet another reason why I think libgap is superior over gap. However, the work involved when switching to libgap may be a reason for other authors of third-party software to avoid the switch and keep using pexpect.

EDIT, as I forgot the conclusion: Please do not remove pexpect too quickly.

embray commented 5 years ago
comment:2

Replying to @simon-king-jena:

Note that there may be non-library code relying on the pexpect GAP interface. The current "official" version of p_group_cohomology is an example.

That's why we have a deprecation process. It would be interesting to look at that package and see in what way it relies on GAP though.

For the next version 3.1, I am moving from pexpect to libgap. Actually I don't recall my original motivation for that change. However, from experience, I can tell (where GAP denotes that software and gap denotes the pexpect interface to GAP):

Oh, well okay :)

  • Overall, libgap is vastly superior to gap. So, generally I'd recommend switching from gap to libgap.
  • One issue in libgap is the evaluation of long strings (which is when pexpect resorts to using temporary files to transfer the string). To see the positive aspect of that issue: It encourages people to not use string evaluation for GAP interface constructions.

You'd have to give an example.

  • Switching from gap to libgap can be a lot of tedious work. Problems include:
    • gap and GAP are 1-based (bad), libgap is 0-based (good). Changing from bad to good is a pain in the neck.

Yep, that's probably the most annoying challenge. I'm not sure how deep that problem goes.

  • libgap.eval is closer to gap.__call__ than to gap.eval; libgap.__call__ is closer to gap.eval than to gap.__call__.

That's an arbitrary design "decision" though that seems to have been made without much reason. It doesn't need to be that way.

  • It is my experience with pexpect that in many cases string evaluation is much more efficient than a "pythonic" usage of the interface; more on that below. When changing to libgap, all these string evaluations need to be completely rewritten.

Concerning string evaluation: A complex piece of code in Sage would often involve tight loops in which myriads of temporary pexpect elements are created, manipulated, and deleted. Creation and deletion costs time; it is better to have one pexpect element tmp and change the underlying data that are stored in GAP/Singular/... in the variable called tmp.name(). In addition, manipulation of said temporary objects is automatically translated into string evaluation --- and very often a carefully written string-to-be-evaluated is better than what is automatically created.

I'm not sure what you're talking about here and would need to see a concrete example. It sounds like you're describing a problem with the pexpect interface, in which case it's a moot point if it's going to be deprecated. But I'm probably not understanding you.

With libgap, efficiency is no more a reason for using string evaluation. That's yet another reason why I think libgap is superior over gap. However, the work involved when switching to libgap may be a reason for other authors of third-party software to avoid the switch and keep using pexpect.

Again, hence, a deprecation period. I don't suspect it would be a massive overhaul for most uses but that's why I ask: I don't know for sure and would need to see examples.

EDIT, as I forgot the conclusion: Please do not remove pexpect too quickly.

The title of this ticket is "deprecate", not "remove". A follow-up "remove" ticket would only be made once deprecation is complete, and even then removal would not be for at least a year.

dimpase commented 5 years ago
comment:3

oops, sorry, ignore this one.

simon-king-jena commented 5 years ago
comment:4

Replying to @embray:

Replying to @simon-king-jena:

Note that there may be non-library code relying on the pexpect GAP interface. The current "official" version of p_group_cohomology is an example.

That's why we have a deprecation process. It would be interesting to look at that package and see in what way it relies on GAP though.

See #26001, if you are interested. Note that the group cohomology package uses gap mainly when creating initial data of a cohomology ring and initial data of an induced homomorphism. So, the main part of the computations doesn't rely on GAP. The actual work is done with my code and with Singular (currently via pexpect).

  • One issue in libgap is the evaluation of long strings (which is when pexpect resorts to using temporary files to transfer the string). To see the positive aspect of that issue: It encourages people to not use string evaluation for GAP interface constructions.

You'd have to give an example.

That's interesting. I just tried to reconstruct it from an example that I met during my cohomology computations. But I find that even the evaluation of a string of length 1974754 is very quick. Have there been some recent improvement, in the past four months?

  • libgap.eval is closer to gap.__call__ than to gap.eval; libgap.__call__ is closer to gap.eval than to gap.__call__.

That's an arbitrary design "decision" though that seems to have been made without much reason. It doesn't need to be that way.

Now that the decision was made, it shouldn't be changed back, IMHO.

I'm not sure what you're talking about here and would need to see a concrete example. It sounds like you're describing a problem with the pexpect interface, in which case it's a moot point if it's going to be deprecated. But I'm probably not understanding you.

Indeed I was talking about ways of coding that used to be needed to overcome shortcomings of the pexpect interface. libgap doesn't have these shortcomings. However, it isn't a moot point: Translating the old work-around-pexpect-flaws code into make-the-best-out-of-a-C-library code is quite tedious, and more tedious than changing from 1-based to 0-based lists.

The point only becomes moot since apparently you are not going to rip out pexpect from the next release.

embray commented 5 years ago
comment:5

Replying to @simon-king-jena:

Replying to @embray:

  • libgap.eval is closer to gap.__call__ than to gap.eval; libgap.__call__ is closer to gap.eval than to gap.__call__.

That's an arbitrary design "decision" though that seems to have been made without much reason. It doesn't need to be that way.

Now that the decision was made, it shouldn't be changed back, IMHO.

That of course assumes an actual "decision" was made, which might be generous :) I'd rather any and all library interfaces in Sage be changed to actually have some consistency in their behavior. There are lots of nooks and crannies like this is Sage where, over the years, care has not been taken to ensure consistency in design. Better to fix it and for all, and also document some standard behavior that other interfaces should follow. This was done better, incidentally, in the case of the pexpect interfaces.

Of course, doing this in a way that doesn't break backwards compatibility is tricky, and will require care. I'm not sure what the best answer is.

embray commented 5 years ago
comment:6

Looking at the p_group_cohomology sources (is there really no better way to get them then downloading the tarball through Sage??) I can see it would be a fair bit of work to do the conversion, though most of it should be doable mechanically. There's a lot of other stuff in there that could be written much more cleanly. Seems like it would be a good opportunity.

simon-king-jena commented 5 years ago
comment:7

Replying to @embray:

Looking at the p_group_cohomology sources (is there really no better way to get them then downloading the tarball through Sage??) I can see it would be a fair bit of work to do the conversion, though most of it should be doable mechanically. There's a lot of other stuff in there that could be written much more cleanly. Seems like it would be a good opportunity.

As I have mentioned on various occasions, I already have changed p_group_cohomology from pexpect gap to libgap. See #26001, which according to Volker has a merge conflict (he didn't tell what it conflicts with).

p_group_cohomology is a genuine SageMath package and cannot be used independently of SageMath. So, till today I didn't bother to link to the source tarball from my homepages, but I suppose that the package documentation can be found by google or duckduckgo.

Anyway, my home pages provide links to the documentation and (since today) also to the source tarball of the package (version 3.1, hence, already with libgap).

simon-king-jena commented 5 years ago
comment:8

Replying to @embray:

I can see it would be a fair bit of work to do the conversion, though most of it should be doable mechanically.

Not unless you have an excellent knowledge of regular expressions. Most of the transition was doable manually in two days or so.

embray commented 5 years ago
comment:9

I know how to search. I just meant the source tarball wasn't readily available anywhere else. Being an open source project I don't know why the repository isn't either.

embray commented 5 years ago
comment:10

Replying to @simon-king-jena:

Replying to @embray:

I can see it would be a fair bit of work to do the conversion, though most of it should be doable mechanically.

Not unless you have an excellent knowledge of regular expressions. Most of the transition was doable manually in two days or so.

I do. For changes that require more than a regexp the use of macros is also possible. So if you've already done the translation I don't know what the problem is.

simon-king-jena commented 5 years ago
comment:11

Replying to @embray:

I know how to search. I just meant the source tarball wasn't readily available anywhere else. Being an open source project I don't know why the repository isn't either.

Simply I don't know how to make the repository public without paying for it.

simon-king-jena commented 5 years ago
comment:12

Replying to @embray:

So if you've already done the translation I don't know what the problem is.

In comment:1, I think I made clear enough that I did not ask "how to change p_group_cohomology from gap to libgap". I was merely using p_group_cohomology as an example for non-library SageMath code.

So, "what is the problem"? The problem from my perspective was that in the past most of p_group_cohomology's versions had to cope with backward incompatible changes in the SageMath API (where I define API to include all class definitions in .pxd files, all import locations and names in SageMath, and the infrastructure to build packages) and by changes in Singular.

It has happened that I developed a new version coping with changes in SageMath, and while the new version was being reviewed new changes in SageMath forced me to change the package-being-reviewed yet again.

Frankly, about one or two years ago I was so annoyed that I was very close to drop SageMath.

So, the request that I was making in comment:1 boils down to: When doing non-trivial changes in Sage it isn't enough to fix the Sage library code. Keep in mind that there are people building code on top of SageMath (I suppose I am not the only developer who does). Make the transition easy for people and keep SageMath something that people can stably rely on --- or you'll lose them.

embray commented 5 years ago
comment:13

I'm sorry you had that experience but that's software development. Welcome to my life. Of course there are some legitimate issues you bring up here and I personally agree with everything you write with some nitpicks:

1) Sage has not traditionally had an "API", and your definition thereof is slightly arbitrary because nobody else defined Sage's API in that way. IMO (and this is also arbitrary but at least typical) an API for a Python package comprises all documented, non-underscored classes, functions, modules, etc. There is a bad habit with a lot of methods in Sage not being internal when they should have been.

2) There tends to be poor planning in the development of Sage in general: I believe this stems from a lack of leadership and communication.

3) Most of the problems you discuss have to do with proper care not taken in deprecation and backwards compatibility. The problems you mention with Singular probably has less to do with Sage itself--in terms of the library--but rather the development model the Sage community uses of having its own package manager, essentially, and depending strictly, and only strictly, on precise versions of its dependencies. The problems you had probably could have been mitigated by maintaining better backwards compatibility for older versions of Singular and not forcing you to upgrade. It also would help if Sage had more minor releases that included bug fixes and new features without making major breaking changes. I have argued this since I first came on to the project and its release managers seem to disagree, but they're wrong.

We're making this same mistake over again with GAP, but right now there isn't time to do it any other way. Of course, to some extent Sage can't be held responsible for changes to GAP that are not backwards compatible: Only the interfaces in Sage that use GAP are Sage's responsibility to keep working.

4) All that said, this is why we have deprecation periods. This is normal. You can't can't continue to provide support for old, difficult to maintain code forever, especially with a mostly volunteer project, especially when there is a superior replacement.

simon-king-jena commented 5 years ago
comment:14

Replying to @embray:

3) ... It also would help if Sage had more minor releases that included bug fixes and new features without making major breaking changes.

+1. When I started work in Sage (around 10 years ago), I had the impression that there is a realistic chance that when I contributed a bug fix it could be found in a release just 14 days later. I found that very encouraging.

4) All that said, this is why we have deprecation periods. This is normal. You can't can't continue to provide support for old, difficult to maintain code forever, especially with a mostly volunteer project, especially when there is a superior replacement.

Also +1. In older versions of p_group_cohomology, I was trying to maintain compatibility for several old Singular versions. I.e., I detected the version and used different code branches for each version, working around the bugs that I knew of in the different versions. But that was with the old spkg and thus I could envision people installing the latest p_group_cohomology version in an old Sage version.

With the new Sage package structure, one can be virtually sure that a user cannot install the package in a Sage version older than the one for which the package was reviewed.

dimpase commented 5 years ago
comment:15

Replying to @simon-king-jena:

Replying to @embray:

I know how to search. I just meant the source tarball wasn't readily available anywhere else. Being an open source project I don't know why the repository isn't either.

Simply I don't know how to make the repository public without paying for it.

Github and gitlab are just for this, by default public and free. Create a git repo, with your code on github, say, make a release of the code (this is pushing 1 button :-)) (and you can attach tarballs with other/more stuff to releases, too)

embray commented 3 years ago

Changed keywords from none to gap libgap

embray commented 3 years ago
comment:16

This ticket went off the rails a bit last time it was discussed. AFAICT Simon's p_group_cohomology now fully supports the libgap interface, and deprecating the pexpect interface will not have any major impact on it.

I don't know how much other code is out there, if any, which makes heavy use of either interface. Nevertheless, I can add to the documentation a small guide for porting from pexpect gap to libgap--Simon's experience with this provided a number of helpful examples.

I also don't propose removing the pexpect interface for now, just a roadmap for changing the gap global in Sage to use the libgap interface.

embray commented 3 years ago

Description changed:

--- 
+++ 
@@ -1,3 +1,17 @@
 Once #26902 is complete, and Sage itself has no internal dependency on the pexpect interface to GAP, it should probably be deprecated entirely.

 Completing work on #26902 will likely expose additional, previously unknown deficiencies in the libgap interface, but once those are all resolved there will be little (if any?) advantage to using the pexpect interface.
+
+---
+
+Here's a proposed strategy for deprecating the gap pexpect interface (henceforth referred to as "gapx" to avoid confusion) and eventually making the libgap interface the default gap interface:
+
+1. Rename the global variable `gap` to `gapx`.  I chose the name `gapx` because it is reminiscent of "gap ex" and also "gap peXpect", but not "xgap" because that is too easily confused with the xgap interface.
+
+2. The `gap` global will, for now, still be the pexpect interface, but will produce a deprecation warning stating that in the next version `gap` will become the `libgap` interface, and also link to a code porting guide.
+
+3. After this deprecation period, change the `gap` global to be the libgap interface (keeping `gapx` for the pexpect interface).  Leave `libgap` as an alias for `gap` but don't deprecate it.
+
+?. Do we do anything with the `_gap_` and `_libgap_` magic methods?  I think they should be renamed similarly `_gap_` -> `_gapx_` but this could be harder to manage.
+
+
simon-king-jena commented 3 years ago
comment:17

Replying to @embray:

This ticket went off the rails a bit last time it was discussed. AFAICT Simon's p_group_cohomology now fully supports the libgap interface, and deprecating the pexpect interface will not have any major impact on it.

That's right, I switched to libgap. So, p_group_cohomology is no obstacle against a deprecation of gap-via-pexpect.