sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.47k stars 487 forks source link

Chart: Implement pickling via __getstate__/__setstate__ #31901

Open mkoeppe opened 3 years ago

mkoeppe commented 3 years ago

In this ticket, we implement the pickling protocol using __getstate__ and __setstate__

Depends on #31894

CC: @egourgoulhon @tscrim @mjungmath

Component: manifolds

Work Issues: redo on top of #31894

Branch/Commit: u/mkoeppe/chartno_longer_use_uniquerepresentation_implementgetstate___setstate @ 30a5d0b

Issue created by migration from https://trac.sagemath.org/ticket/31901

mkoeppe commented 3 years ago

Description changed:

--- 
+++ 
@@ -1,4 +1,7 @@
-Chart is missing `__classcall__` to normalize init arguments.
+Because of the method `add_restrictions` (which changes `self` by adding codomain restrictions), `Chart` needs to be considered as a mutable class. The current pickling based on `UniqueRepresentation` therefore cannot work correctly. 

-Perhaps all string parsing in `_init_coordinates` should already be done in `__classcall__`.
+In this ticket,
+- we make it possible to pass the codomain restrictions already when initializing the chart
+- we implement the pickling protocol using `__getstate__` and `__setstate__`
+- we implement the mutability protocol
egourgoulhon commented 3 years ago

Replying to @mkoeppe:

In this ticket,

  • we make it possible to pass the codomain restrictions already when initializing the chart

This would be nice! In the early stages of the manifold project, I did not manage to do it simply, i.e. without changing Sage's preparser, since

sage: D = Manifold(2, 'D')                                                                          
sage: X.<x,y> = D.chart(restrictions=x^2 + y^2 < 1)                                                 

would generate NameError: name 'y' is not defined. Of course, one could replace x^2 + y^2 < 1 by the string "x^2 + y^2 < 1", but this is not very user-friendly. Do you already have some solution in mind?

egourgoulhon commented 3 years ago
comment:3

Some thought: if UniqueRepresentation is abandoned for charts, it may be desirable to maintain WithEqualityById (one of the two mother classes of UniqueRepresentation) to have a fast equality operator. Indeed charts are heavily used as dictionary keys: for coordinates of manifold points, for coordinate expressions of scalar fields (the latter being used, among other things, to store the components of tensor fields), for coordinate expressions of continuous maps between manifolds (the keys being pairs of charts in that case), etc. Certainly WithEqualityById is the fastest equality operator and thus provides a fast access to the above dictionaries.

mkoeppe commented 3 years ago
comment:4

Replying to @egourgoulhon:

Replying to @mkoeppe:

sage: D = Manifold(2, 'D')                                                                          
sage: X.<x,y> = D.chart(restrictions=x^2 + y^2 < 1)                                                 

would generate NameError: name 'y' is not defined. Of course, one could replace x^2 + y^2 < 1 by the string "x^2 + y^2 < 1", but this is not very user-friendly. Do you already have some solution in mind?

Ah! That's of course a problem, I didn't realize this

mkoeppe commented 3 years ago

Description changed:

--- 
+++ 
@@ -1,7 +1,6 @@
 Because of the method `add_restrictions` (which changes `self` by adding codomain restrictions), `Chart` needs to be considered as a mutable class. The current pickling based on `UniqueRepresentation` therefore cannot work correctly. 

 In this ticket,
-- we make it possible to pass the codomain restrictions already when initializing the chart
 - we implement the pickling protocol using `__getstate__` and `__setstate__`
 - we implement the mutability protocol
mkoeppe commented 3 years ago
comment:6

Maybe a chart should become immutable "upon first use"?

mkoeppe commented 3 years ago

Branch: u/mkoeppe/chartno_longer_use_uniquerepresentation_implementgetstate___setstate

mkoeppe commented 3 years ago

New commits:

5215e93Chart: WIP WithEqualityById instead of UniqueRepresentation; disambiguate _restrictions -> _coordinate_restrictions
mkoeppe commented 3 years ago

Description changed:

--- 
+++ 
@@ -4,3 +4,5 @@
 - we implement the pickling protocol using `__getstate__` and `__setstate__`
 - we implement the mutability protocol

+This is also preparation for #31894.
+
mkoeppe commented 3 years ago

Commit: 5215e93

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

facad97ContinuousMap.preimage: Handle identity_map specially
c2ecf3eScalarField.preimage: Handle the case of the zero scalar field
3b1c428ManifoldSubsetPullback: Make codomain_subset required 2nd init arg; fix pycodestyle/pyflakes warnings
d321b93ContinuousMap: Return domain if the map's codomain is contained in the given subset
07aba9eManifoldSubsetPullback.is_closed: Preimage of finite sets is closed
653c651src/sage/manifolds/subsets/pullback.py: Fix docstring markup
0f2e018Merge #31688
e3e38caChart.__init__: Move code that attaches self to the domain to TopologicalManifold.chart
3581c75Chart.function_ring: Cache via @cached_method, not UniqueRepresentation
869599dChart: WIP equality/immutability/pickling
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from 5215e93 to 869599d

mkoeppe commented 3 years ago

Dependencies: #31688

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from 869599d to d071d56

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

d071d56TopologicalSubmanifold: Adjust to renamed chart attribute _coord_restrictions
mkoeppe commented 3 years ago
comment:12

The last obstacle is this code which tries to hash a chart when it is still mutable.

        # The null and one functions of the coordinates:
        # Expression in self of the zero and one scalar fields of open sets
        # containing the domain of self:
        for dom in self.open_supersets():
            dom._zero_scalar_field._express[chart] = chart.function_ring().zero()
            dom._one_scalar_field._express[chart] = chart.function_ring().one()
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from d071d56 to 7b822b4

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

7b822b4Fixup
mkoeppe commented 3 years ago
comment:14

Replying to @mkoeppe:

Maybe a chart should become immutable "upon first use"?

This is the approach that I have been trying out in this branch. But it does not work in examples like this one:

            sage: M = Manifold(2, 'M', structure='topological')
            sage: X.<x,y> = M.chart()
            sage: U = M.open_subset('U', coord_def={X: [x>1, y<pi]})

because X in coord_def is unhashable, leading to an error.

So perhaps an API change is needed, after all...

egourgoulhon commented 3 years ago
comment:15

In the current branch, Chart inherits from WithEqualityById but redefines __eq__ (in a rather complicated and possibly CPU time consuming way). Doesn't this defeat the purpose? In particular in view of comment:3.

In the code of __hash__, isn't

if not self.is_immutable():
    raise TypeError('Charts are unhashable until set_immutable() has been called')

an unnecessary limitation? More generally, why do you want to make Chart mutable? It is logically immutable: a chart is entirely defined by its domain and its coordinates (i.e. a tuple of symbolic variables). Coordinate restrictions, which can be added only after __init__ via add_restrictions in the current implementation, do not (mathematically) modify the object: there cannot be two charts with the same domain and the same coordinates, but with different coordinate restrictions.

egourgoulhon commented 3 years ago
comment:16

A possible option to get rid of the 2-stages construction of a chart (i.e. to get rid of the requirement of invoking add_restriction after __init__) could be to add the keyword argument restrictions to Chart.__init__, which can be either

sage: H = Manifold(2, 'H')  # half-disk
sage: X.<x,y> = H.chart(r"x y:(0,+oo)", restrictions="x^2 + y^2 < 1")

or equivalently

sage: X.<x,y> = H.chart(restrictions="[x^2 + y^2 < 1, y>0]")

Then we could get rid of the method add_restriction and consider that the chart is a fully immutable object (of course, technically speaking, the sheafy attributes can be muted).

mkoeppe commented 3 years ago
comment:17

Replying to @egourgoulhon:

In the current branch, Chart inherits from WithEqualityById but redefines __eq__ (in a rather complicated and possibly CPU time consuming way). Doesn't this defeat the purpose?

Yes, this part is not finished. I started with WithEqualityById as per your suggestion, but I wanted to implement actual pickling (not just passing the pickling-related testsuite) as a step to actual pickling of manifolds. Then the identity and unique-representation tricks are not sufficient any more, and one needs to implement full comparison.

One can still have a fast path for equality using id, and fast lookup in sets etc. via a fast hash.

mkoeppe commented 3 years ago
comment:18

Replying to @egourgoulhon:

More generally, why do you want to make Chart mutable?

I don't really, it was just an attempt to keep the current API (initialization, followed by add_restrictions) but implementing pickling.

mkoeppe commented 3 years ago
comment:19

When discussing a possible redesign of chart initialization, I think a relevant issue is the current use of the symbolic assumptions facility for the coordinates.

Are two charts with a different tuple of variables allowed to share some variables? If yes, does the shared variable have to have the same global assumptions and periodicity?

Also I note that creating a RealChart puts global assumptions on the variables of the chart. So creating a subchart will affect simplification of expressions that relate to computation on the original chart. This should probably be revised.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

4b930ffMerge tag '9.4.beta4' into t/31688/pullbacks_of_manifold_subsets_under_continuous_maps
96e1fbfMerge branch 't/31688/pullbacks_of_manifold_subsets_under_continuous_maps' into t/31901/chart__no_longer_use_uniquerepresentation__implement___getstate_____setstate__
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from 7b822b4 to 96e1fbf

egourgoulhon commented 3 years ago
comment:21

Replying to @mkoeppe:

When discussing a possible redesign of chart initialization, I think a relevant issue is the current use of the symbolic assumptions facility for the coordinates.

Are two charts with a different tuple of variables allowed to share some variables?

Yes, definitely. This is a desirable feature.

If yes, does the shared variable have to have the same global assumptions and periodicity?

No (see below).

Also I note that creating a RealChart puts global assumptions on the variables of the chart. So creating a subchart will affect simplification of expressions that relate to computation on the original chart. This should probably be revised.

Yes, absolutely. This is something I have in mind for a long time but did not find the time to work on it. The assumptions should not be global but chart-wise. The way Sage's symbolic calculus is implemented, I guess this means that they should be switched on (via assume) before each calculus at the chart function level and switched off once the calculus is done (via forget).

A connected question is how to pass these chart-based assumptions to SymPy, in case the latter has been chosen as the default symbolic backend on the manifold (via M.set_calculus_method('sympy')) or on a given chart X (via X.calculus_method().set('sympy')). For the time being, calculus with SymPy is possible but the assumptions are not taken into account on the SymPy side...

mkoeppe commented 3 years ago
comment:22

Replying to @egourgoulhon:

Replying to @mkoeppe:

When discussing a possible redesign of chart initialization, I think a relevant issue is the current use of the symbolic assumptions facility for the coordinates.

Are two charts with a different tuple of variables allowed to share some variables?

Yes, definitely. This is a desirable feature.

If yes, does the shared variable have to have the same global assumptions and periodicity?

No (see below).

OK. Then I think there is little point in trying to provide better syntax for creating the variables than asking the user to do var('x, y, z') before creating a chart with coordinate restrictions.

Then the restrictions can just be passed as symbolic inequalities to the constructor. (I don't think going the string route would be an improvement.)

We could deprecate add_restrictions then.

mkoeppe commented 3 years ago
comment:23

Replying to @egourgoulhon:

Replying to @mkoeppe:

Also I note that creating a RealChart puts global assumptions on the variables of the chart. [...]

Yes, absolutely. This is something I have in mind for a long time but did not find the time to work on it. The assumptions should not be global but chart-wise. The way Sage's symbolic calculus is implemented, I guess this means that they should be switched on (via assume) before each calculus at the chart function level and switched off once the calculus is done (via forget).

There is already a context manager assuming for that. We could create it at initialization and invoke it using with whenever computations are done with this chart.

The tricky bit is when two charts are involved with overlapping variables. Then some renaming may have to be done first. A possible route is through #32089.

mkoeppe commented 3 years ago
comment:24

I have opened #32102 (Chart: Add init argument coord_restrictions, deprecate method add_restrictions)

mkoeppe commented 3 years ago
comment:25

Replying to @egourgoulhon:

A connected question is how to pass these chart-based assumptions to SymPy, in case the latter has been chosen as the default symbolic backend on the manifold (via M.set_calculus_method('sympy')) or on a given chart X (via X.calculus_method().set('sympy')). For the time being, calculus with SymPy is possible but the assumptions are not taken into account on the SymPy side...

I suppose we can go through a new method assuming of CalculusMethod that dispatches in the same way as the simplify method does.

egourgoulhon commented 3 years ago
comment:26

Replying to @mkoeppe:

OK. Then I think there is little point in trying to provide better syntax for creating the variables than asking the user to do var('x, y, z') before creating a chart with coordinate restrictions.

The issue here is that on real manifolds (by far the most used ones), the symbolic variables are created with the extra parameter domain='real' (cf. RealChart._init_coordinates), so the user must actually do var('x y z', domain='real'), which is not very intuitive. For some reason, it is not equivalent to assume(x, 'real'), which is performed in RealChart._init_coordinates as well. Both proved to be necessary for some simplifications.

Then the restrictions can just be passed as symbolic inequalities to the constructor. (I don't think going the string route would be an improvement.)

I don't like strings either, but I would prefer this to a 2-stage declaration involving var. Anyway, let us continue the discussion on #32102.

We could deprecate add_restrictions then.

Yes.

egourgoulhon commented 3 years ago
comment:27

Replying to @mkoeppe:

Replying to @egourgoulhon:

Replying to @mkoeppe:

Also I note that creating a RealChart puts global assumptions on the variables of the chart. [...]

Yes, absolutely. This is something I have in mind for a long time but did not find the time to work on it. The assumptions should not be global but chart-wise. The way Sage's symbolic calculus is implemented, I guess this means that they should be switched on (via assume) before each calculus at the chart function level and switched off once the calculus is done (via forget).

There is already a context manager assuming for that. We could create it at initialization and invoke it using with whenever computations are done with this chart.

The tricky bit is when two charts are involved with overlapping variables. Then some renaming may have to be done first.

I don't see why: two computations on two distinct charts can set different assumptions on the same variable. There is no ambiguity as long as assumptions are tied to charts.

mkoeppe commented 3 years ago
comment:28

Replying to @egourgoulhon:

Replying to @mkoeppe:

When discussing a possible redesign of chart initialization, I think a relevant issue is the current use of the symbolic assumptions facility for the coordinates.

Are two charts with a different tuple of variables allowed to share some variables?

Yes, definitely. This is a desirable feature.

A follow-up question on this:

sage: M = Manifold(2, 'M', structure='topological')
....: U = M.open_subset('U')
....: V = M.open_subset('V')
sage: XU = U.chart(names=("x", "y"))
sage: XV = V.chart(names=("x", "y"))
sage: M.atlas()
[Chart (U, (x, y)), Chart (V, (x, y))]
sage: M.top_charts()
[Chart (U, (x, y))]

Clearly something went wrong here. Should there have been a declaration of something on M regarding the intention to declare charts with the coordinate pair (x,y) on subsets?

mkoeppe commented 3 years ago
comment:29

Replying to @egourgoulhon:

Replying to @mkoeppe: I think there is little point in trying to provide better syntax for creating the variables than asking the user to do var('x, y, z') before creating a chart with coordinate restrictions.

The issue here is that on real manifolds (by far the most used ones), the symbolic variables are created with the extra parameter domain='real' (cf. RealChart._init_coordinates), so the user must actually do var('x y z', domain='real'), which is not very intuitive.

How about something like M.var('x y z') then?

mkoeppe commented 3 years ago

Changed dependencies from #31688 to #31688, #32102

egourgoulhon commented 3 years ago
comment:31

Replying to @mkoeppe:

A follow-up question on this:

sage: M = Manifold(2, 'M', structure='topological')
....: U = M.open_subset('U')
....: V = M.open_subset('V')
sage: XU = U.chart(names=("x", "y"))
sage: XV = V.chart(names=("x", "y"))
sage: M.atlas()
[Chart (U, (x, y)), Chart (V, (x, y))]
sage: M.top_charts()
[Chart (U, (x, y))]

Clearly something went wrong here.

Good catch!

Should there have been a declaration of something on M regarding the intention to declare charts with the coordinate pair (x,y) on subsets?

No, the above is actually a bug in Chart.__init__. I've opened #32112 for it and submitted a fix there.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from 96e1fbf to d2e3ff7

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

4e316e9Fix bug in Chart.__init__ regarding the determination of top charts (Trac #32112)
d2e3ff7Merge #32112
mkoeppe commented 3 years ago

Changed dependencies from #31688, #32102 to #31688, #32112, #32102

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from d2e3ff7 to a328e91

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

a328e91Merge #32112
mkoeppe commented 3 years ago

Work Issues: redo without mutability on top of #32102

mkoeppe commented 3 years ago
comment:36

Replying to @egourgoulhon:

creating a RealChart puts global assumptions on the variables of the chart. So creating a subchart will affect simplification of expressions that relate to computation on the original chart.

The assumptions should not be global but chart-wise.

I've opened #32120 (Chart-wise assumptions) for this.

mkoeppe commented 3 years ago
comment:37

Replying to @egourgoulhon:

Replying to @mkoeppe:

The tricky bit is when two charts are involved with overlapping variables. Then some renaming may have to be done first.

I don't see why: two computations on two distinct charts can set different assumptions on the same variable. There is no ambiguity as long as assumptions are tied to charts.

I was thinking of computations with coordinate changes here, which involve two charts simultaneously.

mkoeppe commented 3 years ago
comment:38

Replying to @egourgoulhon:

A connected question is how to pass these chart-based assumptions to SymPy, in case the latter has been chosen as the default symbolic backend on the manifold (via M.set_calculus_method('sympy')) or on a given chart X (via X.calculus_method().set('sympy')). For the time being, calculus with SymPy is possible but the assumptions are not taken into account on the SymPy side...

We already have a meta-ticket for this, #31958 (Use the SymPy assumptions facility). I was surprised that SymPy's assumptions are not really well connected to SymPy's sets (for which I have been building some glue code in #31926).

mkoeppe commented 3 years ago
comment:39

Given that #32102 is making mutability unnecessary, it seems we can keep UniqueRepresentation for charts for now (until we see another reason to drop it - such as unbounded memory) and pickle through that.

I'm setting the ticket to "sage-wishlist" for this reason.

egourgoulhon commented 3 years ago
comment:40

Replying to @mkoeppe:

Replying to @egourgoulhon:

Replying to @mkoeppe:

The tricky bit is when two charts are involved with overlapping variables. Then some renaming may have to be done first.

I don't see why: two computations on two distinct charts can set different assumptions on the same variable. There is no ambiguity as long as assumptions are tied to charts.

I was thinking of computations with coordinate changes here, which involve two charts simultaneously.

There should not by any issue with coordinate changes: a coordinate change X1 --> X2 involves only a set of chart functions based on X1 (stored as a MultiCoordFunction in the attribute CoordChange._transf). So only assumptions relative to X1 will be invoked in calculus involving the coordinate change X1 --> X2 (such as the Jacobian matrix). An exception is the computation of the inverse in CoordChange.inverse. There only assumptions on X2 should matter. However, if X1 and X2 share the same variables, the inverse is trivial. To summarize, I don't think there is a case where both sets of assumptions are required simultaneously in coordinate changes.

mkoeppe commented 3 years ago
comment:41

Thanks for the explanation, that's good news.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from a328e91 to 30a5d0b