rodluger / starry

Tools for mapping stars and planets.
https://starry.readthedocs.io
MIT License

Set a prior that maps are physical/non-negative? #300

Open eas342 opened 2 years ago

eas342 commented 2 years ago

Is your feature request related to a problem? Please describe. When fitting lightcurves to spherical harmonics in some experiments, significant fractions of the posterior maps are unphysical, with minimum intensities that are negative.

Describe the solution you'd like It would be nice if there were a way to put a prior on a map being non-negative. Is there a way to do this already? It looks like it could be done with pymc3 priors using Luger et al. (2018) equation 40 for ell = 1, but maybe not so easily for higher orders. It looks like earlier starry versions had .is_physical(), but I can't find that in newer starry versions.

Describe alternatives you've considered Perhaps one can check the samples after the fact to see if they are physical by rendering them on a coarse grid (a sketch of this follows below). Or go back to earlier versions of starry that use .is_physical() in emcee with the prior function.
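A minimal sketch of that post-hoc check, assuming the posterior draws of the Ylm coefficients are collected in a hypothetical array `y_samples` of shape `(nsamples, Ny)` with the constant y00 = 1 term at index 0:

```python
import numpy as np
import starry

starry.config.lazy = False  # evaluate maps numerically rather than symbolically

map = starry.Map(ydeg=2)
physical = np.zeros(len(y_samples), dtype=bool)
for i, y in enumerate(y_samples):
    map[1:, :] = y[1:]  # set all l >= 1 coefficients from this sample
    img = map.render(projection="rect", res=100)  # render on a coarse grid
    physical[i] = np.nanmin(img) >= 0  # flag samples with negative intensity
```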


eas342 commented 2 years ago

I may have found a way to do this using Potential in pymc3 with the following lines inside the model.

```python
import numpy as np
import pymc3 as pm
import starry

b_map = starry.Map(ydeg=self.degree)  # self.degree is the Ylm degree of the fit

# Add a constraint that the map should be physical
map_evaluate = b_map.render(projection='rect', res=100)
# Number of points that are less than zero
num_bad = pm.math.sum(pm.math.lt(map_evaluate, 0))
# Check whether there are any "bad" points less than zero
badmap_check = pm.math.gt(num_bad, 0)
# Set the log probability to negative infinity if there are bad points, otherwise to 0
switch = pm.math.switch(badmap_check, -np.inf, 0)
# Assign a potential to avoid these maps
nonneg_map = pm.Potential('nonneg_map', switch)
```
rodluger commented 2 years ago

@eas342 Sorry for my slow reply. In recent versions of starry there's a minimize method that returns the value of the minimum intensity across the map. You could simply check whether that's less than zero, which would reproduce the behavior of is_physical in previous versions (a sketch of this check follows below), but that's quite inefficient. A better way, similar to what you suggested, can sometimes be to sample in the pixel intensities themselves. Then you can specify a pm.Uniform prior on those pixels directly. This is likely faster and easier for NUTS to sample than the example you provided. To compute the flux, simply apply the spherical harmonic transform matrix A to the vector of pixel intensities as follows:

    import theano.tensor as tt  # for tt.dot

    # The fourth return value is the pixels -> Ylm transform matrix
    _, _, _, A, _, _ = map.get_pixel_transforms(oversample=2)
    npix = A.shape[1]
    # A uniform prior on the pixels keeps every intensity non-negative
    p = pm.Uniform("p", lower=0.0, upper=1.0, shape=(npix,))
    # Transform the pixel values to spherical harmonic coefficients
    y = tt.dot(A, p)

Examples of this can be found here: https://starry.readthedocs.io/en/latest/notebooks/PixelSampling/
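A minimal sketch of the minimize-based check mentioned above, assuming minimize returns the latitude, longitude, and value of the global intensity minimum (check the docs for your starry version):

```python
import starry

starry.config.lazy = False  # work with numerical values

map = starry.Map(ydeg=2)
map[1, 0] = 0.5  # an example coefficient
# minimize is assumed to return the location and value of the intensity minimum
lat, lon, min_intensity = map.minimize()
is_physical = min_intensity >= 0  # reproduces the old is_physical check
```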

Let me know if this helps!

eas342 commented 2 years ago

Ah, thanks for the tutorial on pixel sampling! I should have seen that earlier.

eas342 commented 2 years ago

How about enforcing that the limb darkening is physical? I attempted a method similar to the one above with the potential:

```python
import numpy as np
import pymc3 as pm

# Make sure that the limb darkening law is physical
is_physical = pm.math.eq(star_map.limbdark_is_physical(), 1)
# Set the log probability to negative infinity when the law is unphysical, otherwise to 0
switch = pm.math.switch(is_physical, 0, -np.inf)
# Assign a potential to avoid these maps
physical_LD = pm.Potential('physical_ld', switch)
```

However, I get a NotImplementedError with theano version 1.1.2:

```python
--> 398 map_soln = pmx.optimize(start=start)

File ~/miniconda3/envs/pymc3_env/lib/python3.9/site-packages/pymc3_ext/optim.py:91, in optimize(start, vars, return_info, verbose, progress, maxeval, model, **kwargs)
---> 91 wrapper = ModelWrapper(start=start, vars=vars, model=model)

File ~/miniconda3/envs/pymc3_env/lib/python3.9/site-packages/pymc3_ext/optim.py:208, in ModelWrapper.__init__(self, start, vars, model)
    206 # Pre-compile the theano model and gradient
    207 nlp = -model.logpt
--> 208 grad = theano.grad(nlp, vars, disconnected_inputs="ignore")

File ~/miniconda3/envs/pymc3_env/lib/python3.9/site-packages/theano/gradient.py:639, in grad(cost, wrt, consider_constant, disconnected_inputs, add_names, known_grads, return_disconnected, null_gradients)
--> 639 rval = _populate_grad_dict(var_to_app_to_idx, grad_dict, wrt, cost_name)

[... repeated access_grad_cache / access_term_cache frames in theano/gradient.py elided ...]

File ~/miniconda3/envs/pymc3_env/lib/python3.9/site-packages/theano/graph/op.py:325, in Op.L_op(self, inputs, outputs, output_grads)
--> 325 return self.grad(inputs, output_grads)

File ~/miniconda3/envs/pymc3_env/lib/python3.9/site-packages/theano/graph/op.py:299, in Op.grad(self, inputs, output_grads)
--> 299 raise NotImplementedError()

NotImplementedError:
```
dfm commented 2 years ago

The conceptual problem with this approach is that PyMC needs to work on a differentiable logp function, with support everywhere: (-inf, inf) in all the parameters. This prior doesn't have that behavior since it introduces a part of parameter space with zero support. I'm not entirely sure where the theano issue is coming from, but it's probably not high priority to track down. I know that @rodluger and I have chatted about ideas for generally physical limb darkening parameterizations, but I don't remember if he came up with a robust proposal!
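Not from this thread, but one common workaround consistent with this requirement is to soften the hard -inf switch into a penalty that keeps the log probability finite everywhere. A hedged sketch, where the penalty strength `lam` is a hypothetical tuning parameter:

```python
import pymc3 as pm
import starry
import theano.tensor as tt

lam = 1e4  # hypothetical penalty strength; must be tuned

with pm.Model() as model:
    b_map = starry.Map(ydeg=2)
    # Put free parameters on the l >= 1 coefficients
    b_map[1:, :] = pm.Normal("y", mu=0.0, sigma=0.1, shape=b_map.Ny - 1)
    map_evaluate = b_map.render(projection="rect", res=100)
    # Penalize negative intensities in proportion to how negative they are;
    # unlike the hard -inf switch, this keeps the logp finite everywhere
    negativity = tt.sum(tt.maximum(0.0, -map_evaluate))
    pm.Potential("nonneg_penalty", -lam * negativity)
```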

eas342 commented 2 years ago

Thank you @dfm. I found that this switch function worked decently for eclipse mapping, constraining the spherical harmonic coefficients to give a non-negative map, and got similar (and in some cases better) results compared to pixel sampling. Maybe it was violating some assumptions of PyMC and I just got lucky.

Perhaps someone has worked out higher-dimensional sampling in physical space, like the Kipping 2013 re-parameterization. The conclusions there mention extending it to the case "when N parameters are mutually constrained by N + 1 non-parallel boundary conditions", leading to tetrahedral and hyper-tetrahedral sampling.
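For reference, the two-parameter (quadratic limb darkening) case of that re-parameterization is simple to write down. A hedged sketch, with the final assignment to a starry map left as a comment since the surrounding model isn't shown here:

```python
import pymc3 as pm

with pm.Model() as model:
    # Kipping (2013): sampling q1, q2 uniformly on [0, 1] guarantees a
    # physical (non-negative, monotonically decreasing) intensity profile
    q1 = pm.Uniform("q1", lower=0.0, upper=1.0)
    q2 = pm.Uniform("q2", lower=0.0, upper=1.0)
    u1 = pm.Deterministic("u1", 2.0 * pm.math.sqrt(q1) * q2)
    u2 = pm.Deterministic("u2", pm.math.sqrt(q1) * (1.0 - 2.0 * q2))
    # star_map[1] = u1  # assign to the starry limb darkening coefficients
    # star_map[2] = u2
```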

dfm commented 2 years ago

Working out that reparameterization would definitely be the best way to go about this, if possible. I haven't looked into this, but it would be cool to see!

And yeah, unfortunately those kinds of -inf log priors are going to break all sorts of things, giving incorrect results and poor performance, unless you get very lucky!

mark-hammond commented 1 year ago

Should the solutions described above still work when the map coefficients and amplitude are marginalised, like in https://starry.readthedocs.io/en/latest/notebooks/EclipsingBinary_FullSolution/ ?

I have applied a positive pixel constraint to the example in https://starry.readthedocs.io/en/latest/notebooks/EclipsingBinary_PyMC3/ (following the example in https://starry.readthedocs.io/en/latest/notebooks/PixelSampling/), which is doable because sys.flux is explicitly included in the pymc3 model in that case.

However, for the case in https://starry.readthedocs.io/en/latest/notebooks/EclipsingBinary_FullSolution/ the flux isn't explicitly accessible and so you can't plug in a prior on the pixels in the same way as before.

Is there another way of applying the positive map prior in this case? Thanks for any help!