quartiq / rayopt

Python optics and lens design, raytracing
GNU Lesser General Public License v3.0

Optimizing a Doublet: unclear how some parameters work #34

Open polyfractal opened 4 years ago

polyfractal commented 4 years ago

Hi all! First, thanks for creating RayOpt, it looks like a wonderful library! I'm looking forward to playing around with it.

I had a few questions around how the optimizer works. Full disclosure, I'm a relative newbie at optics in general, so some of my confusion might be stemming from that :)

I grabbed the optimization code from the triplet example and tried to simplify it down to a long, air-spaced doublet. After playing around with parameters for a while, I got it to a point where it "works" (ok'ish spot diagram, etc)... but I'm not sure what all the various components actually do so it's hard to say how well it is working :) I'm happy to polish this up into an example optimizer demo for the notebook repo, complete with comments

My full script is here, and a few questions about how it works below:

variables.extend(PathVariable(s, (i, "curvature"), (-1/10, 1/10))
                 for i in (1, 2, 3, 4))

How do the "bounds" work here? The docs for scipy.minimize suggest this is the extent the variables can reach, but in practice this seems to be something closer to step size during optimization? E.g. the ROC of a surface is not limited to [-0.1,0.1] when using the above parameters

def get(s):
    s.update()
    s.paraxial.focal_length_solve(500)
    s.paraxial.propagate(start=-2)
    s.paraxial.refocus()
    s.paraxial.propagate(start=-1)
    s.paraxial.resize()
    s.paraxial.aberrations()

In the get() function, what is the purpose of s.paraxial.propagate(start=-2) and s.paraxial.propagate(start=-1)? Skimming the source, it seems these might be the surfaces to start propagating from... but I don't understand why negative numbers are being used? I suspect I'm entirely incorrect about what these are for.

More generally, is there a high-level explanation of why the sequence is focal-length -> prop -> refocus -> prop -> resize -> aberr ?

operands = [
    FuncOp(s, get),
    #FuncOp(s, lambda s: s[-1].offset[2:], min=60),
    #FuncOp(s, lambda s: s.edge_thickness()[1:], min=2),
    #FuncOp(s, lambda s: np.diff(s.track), min=2),
    FuncOp(s, lambda s: s.paraxial.transverse3[:, 5:].sum(0)/.5, min=-1, max=1),
    FuncOp(s, lambda s: s.paraxial.transverse3[:, 3:5].sum(0)/1, min=-1, max=1),
    FuncOp(s, lambda s: s.paraxial.transverse3[:, :3].sum(0)/.03, min=-1, max=1),
    #FuncOp(s, lambda s: s.paraxial.transverse3[:, (0, 1, 2, 5, 6)].sum(0), weight=1),
    GeomOp(s, rays=16, weight=1),
    #PolyOp(s, weight=1),
]

My understanding is that the operands define the function that we are trying to minimize. So the get() ensures we are hitting the correct focal length, the commented-out lines for offset and thickness can be used to control details about the surfaces, the s.paraxial.transverse3 conditions have something to do with minimizing aberrations and spot size, and I'm not sure what the GeomOp does.

Is there an easy explanation of what the s.paraxial.transverse3 conditions do? I tried to dig around in the code but wasn't able to figure it out. Similarly, I'm not sure what the commented-out sum(0) condition does, or the PolyOp.

Thanks for the help! As I said, happy to take this and help put together an "optimizer 101" example for other clueless newbies like myself :)

jordens commented 4 years ago

I had a few questions around how the optimizer works. Full disclosure, I'm a relative newbie at optics in general, so some of my confusion might be stemming from that :)

I'm sorry to disappoint you. The optimization code is very raw. I don't think it ever went beyond playing with it on one or two designs. And it never yielded a design that I would have called "optimal" or one that materialized, AFAIK. There are probably many bugs and errors in there. And maybe the overall design of the optimization is also bad.

I grabbed the optimization code from the triplet example and tried to simplify it down to a long, air-spaced doublet. After playing around with parameters for a while, I got it to a point where it "works" (ok'ish spot diagram, etc)... but I'm not sure what all the various components actually do so it's hard to say how well it is working :) I'm happy to polish this up into an example optimizer demo for the notebook repo, complete with comments

Please feel free to do that. But you may have to dig into the code a lot to convince yourself that it is correct.

How do the "bounds" work here? The docs for scipy.minimize suggest this is the extent the variables can reach, but in practice this seems to be something closer to step size during optimization? E.g. the ROC of a surface is not limited to [-0.1,0.1] when using the above parameters

There are certainly pitfalls with bounds: how they are implemented, and whether they mean that the final result will obey them or that each function evaluation will also obey them. It is likely that the bounds also affect step size. inf as a bound might also be a bad idea (as compared to None). But I don't think that the main impact of bounds is the step size.
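For what it's worth, here is a tiny scipy-only sketch (no RayOpt involved) of how `scipy.optimize.minimize` itself treats box bounds with L-BFGS-B: the iterates are kept inside the box and the result lands on the boundary, rather than the bounds acting as a step size. If the ROC escapes [-0.1, 0.1] in practice, the cause is likely in how the optimizer wrapper passes (or doesn't pass) these bounds on.

```python
# Toy check (not RayOpt code): with scipy's L-BFGS-B, box bounds constrain
# the iterates themselves, not just the final answer.
from scipy.optimize import minimize

evals = []

def f(x):
    evals.append(float(x[0]))
    return (x[0] - 5.0) ** 2  # unconstrained minimum at x = 5

res = minimize(f, x0=[0.0], method="L-BFGS-B", bounds=[(-0.1, 0.1)])
print(res.x[0])               # clamped to the upper bound, 0.1
print(min(evals), max(evals)) # evaluations stayed (essentially) in the box
```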

def get(s):
    s.update()
    s.paraxial.focal_length_solve(500)
    s.paraxial.propagate(start=-2)
    s.paraxial.refocus()
    s.paraxial.propagate(start=-1)
    s.paraxial.resize()
    s.paraxial.aberrations()

In the get() function, what is the purpose of s.paraxial.propagate(start=-2) and s.paraxial.propagate(start=-1)? Skimming the source, it seems these might be the surfaces to start propagating from... but I don't understand why negative numbers are being used? I suspect I'm entirely incorrect about what these are for.

More generally, is there a high-level explanation of why the sequence is focal-length -> prop -> refocus -> prop -> resize -> aberr ?

get() updates the system after setting the current variables. In this case:

  1. do some miscellaneous housekeeping, including a paraxial ray trace
  2. solve the penultimate surface to achieve the desired focal length
  3. then you need to re-trace the paraxial trace from that penultimate (index -2) surface on
  4. move the last surface into focus
  5. retrace the last (index -1) surface
  6. resize the elements so that the paraxial rays are not clipped
  7. update aberration coefficients
operands = [
    FuncOp(s, get),
    #FuncOp(s, lambda s: s[-1].offset[2:], min=60),
    #FuncOp(s, lambda s: s.edge_thickness()[1:], min=2),
    #FuncOp(s, lambda s: np.diff(s.track), min=2),
    FuncOp(s, lambda s: s.paraxial.transverse3[:, 5:].sum(0)/.5, min=-1, max=1),
    FuncOp(s, lambda s: s.paraxial.transverse3[:, 3:5].sum(0)/1, min=-1, max=1),
    FuncOp(s, lambda s: s.paraxial.transverse3[:, :3].sum(0)/.03, min=-1, max=1),
    #FuncOp(s, lambda s: s.paraxial.transverse3[:, (0, 1, 2, 5, 6)].sum(0), weight=1),
    GeomOp(s, rays=16, weight=1),
    #PolyOp(s, weight=1),
]

My understanding is that the operands define the function that we are trying to minimize. So the get() ensures we are hitting the correct focal length, the commented-out lines for offset and thickness can be used to control details about the surfaces, the s.paraxial.transverse3 conditions have something to do with minimizing aberrations and spot size, and I'm not sure what the GeomOp does.

Is there an easy explanation of what the s.paraxial.transverse3 conditions do? I tried to dig around in the code but wasn't able to figure it out. Similarly, I'm not sure what the commented-out sum(0) condition does, or the PolyOp.

paraxial.transverse3 are the primary (third order) transverse aberration coefficients. See one of the books in the literature section of the README for their meaning and definition.

The primary aberrations are additive (each surface contributes linearly). sum(0) is the sum of the aberration coefficients over the first dimension (the surface index).
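A minimal numpy illustration of that axis convention (the coefficient values here are made up, purely to show what .sum(0) does):

```python
# Shape convention sketch: rows = surfaces, columns = aberration types.
# The numbers below are invented, not from a real system.
import numpy as np

transverse3 = np.array([
    [0.02, -0.01, 0.005],    # contributions of surface 1
    [-0.015, 0.02, -0.002],  # contributions of surface 2
])
# Summing over axis 0 adds up each column over all surfaces,
# giving one total per aberration coefficient:
total = transverse3.sum(0)
print(total)  # [0.005 0.01  0.003]
```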

They all only require the ParaxialTrace that's intrinsic to each system (System.paraxial).

PolyOp is an operand based on arbitrary orders of aberrations. The math is described in the papers referenced in PolyTrace, which is a raytrace based on multivariate polynomials (in field and pupil coordinates) of arbitrary order. The theory behind it is compelling since it properly separates higher order aberrations, which are increasingly hard to correct.

GeomOp is an operand that uses a GeometricTrace to compute geometric ray positions and minimize the geometric spot size.
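As a sketch of the idea (not RayOpt's actual code), a geometric merit like this boils down to penalizing the spread of the ray intersection points around their centroid, e.g. an RMS spot radius:

```python
# Generic RMS spot-size sketch (not RayOpt's implementation): given the
# transverse (x, y) intersections of traced rays with the image plane,
# the RMS spot radius is the root-mean-square distance from the centroid.
import numpy as np

def rms_spot_radius(xy):
    """xy: (nrays, 2) array of ray intersections with the image plane."""
    centroid = xy.mean(axis=0)
    r2 = ((xy - centroid) ** 2).sum(axis=1)  # squared distance per ray
    return np.sqrt(r2.mean())

rays = np.array([[0.0, 0.0], [0.01, 0.0], [-0.01, 0.0], [0.0, 0.02]])
print(rms_spot_radius(rays))
```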

Thanks for the help! As I said, happy to take this and help put together an "optimizer 101" example for other clueless newbies like myself :)

I suspect that the optimizer itself needs work still before such an example would be possible. And that work would probably also require someone with more experience in designing and optimizing lenses than I have.

In general lens optimization is an ill-posed problem (and this code may well be a mostly useless attempt): way too many parameters and constraints, many non-linear bounds, discrete variables for the material choice. It would also require global optimization. This is why I looked into ASA as well at some point. But Bayesian (#21) optimization might be much better suited in fact.

polyfractal commented 4 years ago

Thanks for the quick and comprehensive reply! Soaking it in and doing some reading on parts I don't understand. Appreciate the help, it's fun to tinker around with this but I'm clearly in over my head heh :)

paraxial.transverse3 are the primary (third order) transverse abberation coefficients.

Ah, ok, so [spherical aberration, coma, astigmatism, petzval, distortion], in that order? Does that mean the above snippet translates to:

s.paraxial.transverse3[:, 5:].sum(0)/.5   # distortion
s.paraxial.transverse3[:, 3:5].sum(0)/1   # astig,field curv,distortion
s.paraxial.transverse3[:, :3].sum(0)/.03  # spherical, coma, astig

?

But Bayesian (#21) optimization might be much better suited in fact.

So I was intrigued by this library, and had a go at implementing it "over" Rayopt (tried to integrate it into the library, but python isn't my primary language so ran into some difficulties). The gist is here, although I've tweaked it a bunch since making that gist trying to optimize the optimizer. I ran it over some doublets, and also some triplets to see how it would fare. Definitely sensitive to starting parameters and exploration vs exploitation settings, but seems to do a decent enough job :)

I did notice that it tends to minimize the green wavelength, but not so much red/blue. I suspect that means something is wrong with my GeomOp? I saw that rays_point accepts a wavelength parameter, so I tried to include that, shoot rays for r/g/b, and let those influence the score, but I haven't noticed a difference. E.g. I tried shooting r/g/b, as well as just shooting one wavelength. Neither seemed to have a noticeable impact... I suspect it isn't doing what I think it is doing :)

class GeomOp(Operand):
    def __init__(self, system, rays=13, *args, **kwargs):
        super(GeomOp, self).__init__(system, *args, **kwargs)
        self.t_r = [GeometricTrace(self.system) for f in self.system.fields]
        self.t_g = [GeometricTrace(self.system) for f in self.system.fields]
        self.t_b = [GeometricTrace(self.system) for f in self.system.fields]
        self.rays = rays

    def get(self):
        for t_r, t_g, t_b, f in zip(self.t_r, self.t_g, self.t_b,  self.system.fields):        
            t_r.rays_point((0, f), wavelength=656.e-9, nrays=self.rays,
                         distribution="radau",
                         clip=False, filter=False)
            t_g.rays_point((0, f), wavelength=546.1e-9, nrays=self.rays,
                         distribution="radau",
                         clip=False, filter=False)
            t_b.rays_point((0, f), wavelength=455e-9, nrays=self.rays,
                         distribution="radau",
                         clip=False, filter=False)

        v_r = np.concatenate([
                (t.y[-1, 1:, :2] - t.y[-1, t.ref, :2])/np.sqrt(t.y.shape[1])
                for t in self.t_r])
        v_g = np.concatenate([
                (t.y[-1, 1:, :2] - t.y[-1, t.ref, :2])/np.sqrt(t.y.shape[1])
                for t in self.t_g])
        v_b = np.concatenate([
                (t.y[-1, 1:, :2] - t.y[-1, t.ref, :2])/np.sqrt(t.y.shape[1])
                for t in self.t_b])

        v = np.concatenate((v_r, v_g, v_b))
        return np.where(np.isfinite(v), v, 10).ravel()

Some charts from a triplet. This was an overnight run with pretty heavy exploration so the green spot is basically perfect :laughing: (image attached)

And from a quicker run where I was trying to optimize a doublet:

(image attached)

mjhoptics commented 4 years ago

You have uncorrected axial color. This can only be fixed by picking the correct combination of glass materials. This is done by choosing a crown (e.g. Schott N-BK7) and a flint (N-F2). The larger the difference in glass V-number (a measure of dispersion), the better. Once you get axial color under control, you'll be left with secondary color, which requires special glass types (meaning expensive) to correct. Hope this helps. Mike Hayford
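To make the crown/flint logic concrete, here is a back-of-envelope sketch using approximate Schott catalog indices (the exact values vary by melt and catalog edition): the Abbe number V = (n_d - 1)/(n_F - n_C) quantifies dispersion, and the classic thin-lens achromat condition splits the doublet's power between the two glasses in proportion to their V-numbers.

```python
# Approximate catalog indices (d, F, C lines); treat these as illustrative.
nd_bk7, nF_bk7, nC_bk7 = 1.5168, 1.52238, 1.51432  # Schott N-BK7
nd_f2, nF_f2, nC_f2 = 1.6200, 1.63208, 1.61503     # Schott F2

# Abbe V-number: higher means less dispersive.
V_bk7 = (nd_bk7 - 1) / (nF_bk7 - nC_bk7)  # roughly 64
V_f2 = (nd_f2 - 1) / (nF_f2 - nC_f2)      # roughly 36

# Thin-lens achromat: split total power phi so axial color cancels:
#   phi1/phi = V1/(V1 - V2),  phi2/phi = -V2/(V1 - V2)
phi = 1 / 100.0  # e.g. 100 mm focal length
phi_crown = phi * V_bk7 / (V_bk7 - V_f2)   # positive crown element
phi_flint = -phi * V_f2 / (V_bk7 - V_f2)   # negative flint element
print(round(V_bk7, 1), round(V_f2, 1))
```

The larger the V-number gap, the gentler each element's power needs to be, which is why a big crown/flint difference helps.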


polyfractal commented 4 years ago

Hi @mjhoptics, thanks! That's my fault, I forgot to mention that the materials are set to PMMA (n = ~1.49) and Polycarbonate (n = ~1.58)... this was spawned by playing with plastic lenses hence acrylic/polycarbonate. I did try BK7/F2 earlier and had similar issues.

But! I think I might have spotted some bugs in my code, due to unfamiliarity with python. Will play around and see if I can suss out the issue :)