problem in LIME raytracer

migueldvb commented 6 years ago

In our research group we have identified a problem in the LIME raytracer causing the central pixel fluxes in a ~1/r^2 distribution to be overestimated. For the same model with different pixel sizes but the same physical scale the fluxes inside the central 0.2'' square region differ considerably (calculated by the sum statistic): the image with finer pixel sampling has 3x less flux than the other one. This could explain the discrepancy we're seeing in the visibility curves when comparing LIME results with ALMA observations. In order to solve this issue we need to modify the ray tracer to allow finer sampling in the central pixels and tried the two alternatives below without success.

There is an antialias option in the documentation, which performs stochastic supersampling. We have tried with values of 10 and 100, but this did not fully resolve the problem. It seems that this algorithm is currently not implemented in the raytracing code.

We have also modified the value of maxNumRaysPerPixel in the raytrace function from 20 to 0 to indicate that there is no maximum number of rays per pixel. However this did not solve the problem either.

mcordiner commented 6 years ago

The problem arises when there is a strongly varying number density inside a given pixel (in this case, the central pixel(s). Using log(r) grid sampling results in a much higher density of grid points for the central, high-density regions. Given that the LIME raytracer sampling is chosen based on the cell density (one for each cell, up to the limit given by maxNumRaysPerPixel), the resulting average flux in the central pixel becomes biased by the high-density region towards r=0.

I wrote a fix for raytrace.c that implements a regular cartesian sampling of the central pixel(s) (similar to what RATRAN can do), which seems to solve the problem. I'd be happy to share this fix if anyone wants to try it.

allegroLeiden commented 6 years ago

Sorry not to reply to this until now. LIME support in the immediate future is problematic. I'm nearly at the end of my contract at Leiden and have had to devote my time to other things for some time. I am about to spend a few weeks on LIME, to try to fix some of the outstanding issues and to finalize the whole python/CASA interface additions. @mcordiner that sounds like a useful alternative but priority for me is to address #261 and then merge my python PR.

allegroLeiden commented 6 years ago

I've wondered for some time if the 'roughness' visible in the central pixels means that the sampling is too sparse there. Traditional LIME used a cockamamie algorithm bodged up to avoid the slowness of the rejection method of choosing random point locations to follow a prescribed density function. This could easily result in undersampling in the centre and is also poorly adapted to models which are not dominated by a central density maximum. par->samplingAlgorithm=1 uses my alternate algorithm, which is fast but which also follows the prescribed point density fairly exactly; however there is no guarantee either that this gives a good sampling density at all places. The only way to really decide on this is if one could look at the maximum optical depth (MOD) per link/edge between grid points. Ideally this ought to be of order unity. MOD<<1 means the grid points are denser than they need to be; MOD>>1 means they are too sparse. I have had plans to write a diagnostic for this but this has run against the same wall as with the rest of my wish list for LIME: lack of available man-hours.

mcordiner commented 6 years ago

In my opinion, the new raytracing algorithm could be improved by taking into account the physical (sky-projected) area associated with each ray, and then weighting the resulting pixel flux accordingly. To implement this (assuming a one-ray-per-grid-cell approach), you'd just need to know the sky-projected area of each cell.

Also, I guess it wouldn't be that hard to implement an option for RATRAN-style raytracing, that should give good results for the case of models dominated by a single maximum at the image centre, and allow easier benchmarking against that code.

allegroLeiden commented 6 years ago

I just had a deeper look at this and I confess I am a bit puzzled. What happens in raytrace is as follows:

All internal grid points are projected to the image plane. (Some others are added on the locus of the circular projected model rim.) This forms the set of rays.
Each ray is followed through the model and the flux at each channel calculated for it.
Each ray's flux is added to the image pixel the ray passes through. (For every channel.) The tally of the rays passing through each pixel is kept.
Pixels with more than 1 ray have their cumulative values divided by the number of rays passing through that pixel. I.e., this is just an average value. All other pixels have their values obtained by interpolation.

The average performed is I suppose a biased one, but if the model properties vary markedly within a pixel then any averaging scheme (e.g. sampling on a cartesian grid within the pixel, or ditto random- or quasi-random sampling) will be biased in that it will not reflect such variations well.

Mathematically one should not expect that the flux found by summing all pixels should be independent of the pixel sizes. A trivial inequality illustrating why not is

(a+b+c+d+e)/5 != [(a+b+c)/3 + (d+e)/2]/2

The code which implements the algorithm is not very safe, in that the tally of rays per pixel and the summing of flux are performed in different places. I'll make this more robust and see if that makes a difference.

The thing I notice, at least with the example model, is not so much over-estimation as the relatively non-uniform nature of the solution across the image. This leads to a 'speckled' appearance of the central pixels, i.e. those for which averaging is performed. I can't see any solution to this apart from increasing the number of grid points. I'll try this as well.

allegroLeiden commented 6 years ago

BTW @migueldvb you should have set maxNumRaysPerPixel to a value <0 to flag the desire for no upper bound.

allegroLeiden commented 6 years ago

I upload 4 images, which were made as follows:

a1: standard model.
a2: standard model, image at twice the resolution.
b1: standard model, but with 8 times as many grid points (thus average edge lengths half that of standard model).
b2: same as b1 but with twice the resolution.

All are displayed with the same brightness scale. All show the +500 m/s channel.

image_a1 image_a2 image_b1 image_b2

allegroLeiden commented 6 years ago

My take-home:

The in-pixel sampling gives quite a reasonable representation of the actual solution, in that there are no marked or unexpected results from doubling the resolution.
Halving the average distance between grid points does, as expected, remove a great deal of the spatial inhomogeneity of the solution.

So I can't really see an issue. If any of you have a model which displays one, I'd be happy to take a look at it.

mcordiner commented 6 years ago

I agree that the current raytracing implementation works well in many cases, as long as there aren't any sharp features (containing lots of grid points) with sizes much less than a single image pixel. Your above test images appear quite smoothly varying. However, in our problem case, there is a strong density increase towards the image center, that is well sampled by the logarithmic radial grid, but poorly sampled in terms of image pixels. This is actually for a model of a solar system body, so the size of the central density enhancement is ~ 1km, however, I can imagine protostellar envelopes with ~1/r*2 density profiles where a similar situation could occur, but of course, on a much larger size scale (and at at similarly greater distance). It's impractical for us to have images 10^5x10^5 pixels in size, so the central regions need a robust supersampling algorithm, or else their fluxes get severely overestimated.

The attached image shows a zoom of the central portion of the model, and the statistics are taken inside the magenta box. The box is the same physical size (in arcseconds) in both images, but the image on the right has 10 times smaller pixels sizes. The region on the left has a total flux (sum) 1.4 times the total flux of the image on the right, which is quite a severe discrepancy. I can email the model file if you're interested in having a look.

figure_1

allegroLeiden commented 6 years ago

Ok I think I have a better appreciation of the problem now. You want to have more accurate flux estimates, which I guess is only reasonable. I can think of 3 candidate algorithms for improving matters, 2 suggested by @mcordiner and 1 which occurred to me:

Cartesian grid.
Weighting by cell area.
2^D-tree sampling.

The 1st is probably the simplest to implement, but with this one would risk undersampling the flux, since it is easy to imagine a situation in which a sharp maximum falls between points. However the point about comparing to RATRAN is well made.

Weighting by cell area: these would have to be Voronoi cells. I guess we can get these without too much extra graft since qhull has to be called anyway for the sparse-pixel interpolation, although we get the Delaunay duals for this. Personally I am very leery of having anything more to do with qhull whatsoever, but since we already have exposure to this package, I suppose in for a penny, in for a pound. However, another objection to this scheme is that flux does not necessarily follow grid point density.

The 2^D-tree thing is already coded up in Lime (in module tree_random.c) as an alternative to the original manky hard-wired grid-and-smooth algorithm for selecting grid points. I tried to make it general, not hard-wired for D=3. It can be used not only to quickly generate random points following an arbitrary distribution but in principle also to estimate the integral of that distribution. This method would probably provide the most accurate fluxes per pixel of the three, although decent accuracy would probably require more samples (thus running time) than most people are willing to afford. In other words it may be too much gun for the bear.

Now the crunch is that I am very reluctant to do any more substantial work on Lime. I've just merged a giant PR with fixes to the CASA interface, and since that was what they mostly hired me to do, and since I have had very little time to pursue my own research interests during the last 3.5 years, I am looking forward now to at last being able to give them some attention during the small amount of time I have left in this contract. So if one of you wants to submit a PR for a changed algorithm in the way you desire, I'll have a look at it, and merge it if the code meets reasonable standards. That's about the fairest I can say.

christianbrinch commented 6 years ago

A historical note: LIME used to have a RATRAN style raytracer in the early versions. In fact, the raytracer was a simple translation of Michiels SKY code. It worked mostly fine, but we encountered the exact same problem as the one described here, namely, that the center pixel(s) were off in some cases. SKY improved on this by introducing oversampling, which helped but did not solve the issue. Kees Dullemond had experienced the same issue with the raytracer in RADMC-3D and he tried to solve it with a clever subsampling scheme, which unfortunately could become extremely slow. Kees and I devised a solution which uses gridpoint based raytracing rather than the old raster based raytracing. A variation of this method (which is described in this paper: http://adsabs.harvard.edu/abs/2014MNRAS.440.3285B, see fig 6 for an illustration of the problem: there is a hole in the center of the raster image of the disk because the four central rays miss the peak.) was implemented in LIME some 5-6 years ago by me and iirc later on also in RADMC-3D by Kees. It works quite well, so I don't know why Michiel decided to go back to the old raster method. It is slower, it does not conserve the flux properly and it has aliasing issues.

It is true that grid point based raytracing rely critically on proper sampling of the model. However, it will give you an exact mapping of the flux associated with the sampling of the model - pixel based raytracing will not. The idea is that if you sample the model properly, the flux will only change linearly between gridpoints and therefore, by triangulating the rays, it is possible to interpolate the flux between the rays (not generally possible for pixels) and achieve arbitrary image resolution, even with relatively few rays.

mcordiner commented 6 years ago

To me, a decent approach would be to continue with the current algorithm, but do a weighted average over the rays per pixel based on the projected area of each grid cell (instead of the current, unweighted average). However, I'm not at all familiar with what Qhull can do. If it can return the sky-projected cell areas that would be ideal, or alternatively, we can calculate them using the actual cell areas and their normal vectors, if those are available?

Christian - which version of LIME implements the Brinch and Dullemond method? If it works well, surely it would be a good idea to make it available again (perhaps as an option in the latest version).

christianbrinch commented 6 years ago

Certainly 1.4 and 1.5. Not sure about 1.6. Probably not.

allegroLeiden commented 6 years ago

Direct generation of visibilities a la Brinch and Dullemond is a good idea, but AFAIR this was never fully implemented in LIME.

allegroLeiden commented 6 years ago

Just looked at the code. In the version of 1.4 which I got from Jes Jorgensen, a sky-plane grid image is produced. There's no direct generation of visibilities. The algorithm was essentially as follows:

Rays are passed through each grid point and projected onto the image plane.
Qhull is called a second time, in 2D, for these projected points.
A sky-plane i.e. orthogonal gridded image is constructed via what looks like an interpolation scheme.

This algorithm was taken offline in version 1.5, although the code was kept, since it seemed a useful approach which however needed some work. A version of the algorithm, with simpler and more transparent interpolation, and with separate treatment of low-density vs. high-density pixels, was reinstated in 1.6. This has been retained with only minor changes since that time.

christianbrinch commented 6 years ago

We never actually implemented the visibility generation part. However, the algorithm worked as following (if my memory serves me correctly):

Choose 1 ray per grid point projected onto the image plane. This ensures that many rays are traced through high density regions (which is often the center) and few through low density regions.
- Lloyd smooth the distribution of rays, ie. rays do not pass exactly through each grid point, but the distribution of rays follow the distribution of points.
- Ray trace. It is fast because because you trace 25-50 thousand rays (typical number of grid points in a LIME model, whereas for a 512^2 pixel image with 4 times oversampling you hit 1 million rays
- Qhull is called to form the Delaunay graph to give you the triangular supported patches that the Brinch and Dullemond scheme works on. Under the (good) assumption that flux only changes linearly over these patches, you now have an analytic expression for the flux in the image plane.
- In order to get a FITS file (which after all is what most people want) you need to map the Delaunay graph onto a raster image. There are probably many ways in which this can be done. I implemented two different methods (although only one was offered to the user). One of these is Sheppards method which is an inverse distance weighting scheme, akin to the way one would map SPH data onto a regular grid. This works very nicely and gives very smooth looking images, but it is slow, so I opted not to use it. The other method is just a simple interpolation scheme. Iirc I just sampled the Delaunay patches with the pixel center. One could probably add some sort of anti-aliasing scheme here with very limited performance penalty.

Moreover, it is easy to assess the error one makes by mapping the fluxes onto the pixel grid. Integrating the Delaunay graph gives an exact(!) value for the total flux associated with the model (that is, the 0'th Fourier component). I remember I did some tests where I compared this value to the total fluxes in the raster images made with the traditional RATRAN-style raytracer. The values would always differ significantly for typical pixel resolutions (although it would improve by going to a ridiculous number of pixels and oversampling with an equally ridiculous performance penalty).

allegroLeiden commented 6 years ago

This is essentially the algorithm preserved in present Lime, but only for pixels for which there are fewer than ~2 rays per pixel. I have no beef with it, it is a good and quick algorithm. However the present issue, as I understand it, concerns pixels for which there are >>1 rays per pixel. There is no way to sensibly interpolate a value for these, you have to average. I like @mcordiner's suggestion of weighting by the 2D Voronoi area, it is the best approach I have heard for these dense pixels, but I'm not going to code it up myself. I am free of Lime shackles, going to dive into some of my own long put off research. ;)

christianbrinch commented 6 years ago

It is true that interpolation is no good when you have many patches per pixel. Sheppards method would be better in that case.

I dug up this figure from an ancient presentation. It shows (left) the traditional raster, (middle) a triangulated image based on the (jiggled) pixel centers, and (right) a triangulated image with adaptive resolution. The point here is that the number of rays is the same in all three versions, but the fidelity improves dramatically in the right image. (about the jagged outer edge: this figure was made in early 2010 with a very premature version of the algorithm.)

mcordiner commented 6 years ago

Thanks for the interesting discussion.

Recently, I've taken to producing an overly-large image, then generating the visibilities using CASA for comparison to ALMA visibility data. Obviously this is highly inefficient and error prone, so I'd be very interested in using the Brinch & Dullemond method to generate them directly. How much would it take to get that working?

allegroLeiden commented 6 years ago

I don't think much. The starting values, the locations and values at each set of facet vertices, are already available in raytrace(). Feed these into the expression for the Fourier transform and away you go. The most laborious part would be coding up an interface to read the (u,v) coords at which you wanted visibilities, then write the visibilities. Got a student who can code a bit @mcordiner ?

christianbrinch commented 6 years ago

I think it would take a lot, I am afraid. If you want to generate visibilities that compares directly to your measured visibilities, you need to supply LIME with the exact (u,v)-settings of your run and possibly also the source position on the sky during observations and what not. The code to calculate the Fourier components is simple to implement and it is available here: https://github.com/christianbrinch/trixels (in Fortran though, but easily translatable), however, in order to be able to compete with CASA, you have to pick the visibilities very accurately. Then again, maybe it is not so hard...

EDIT: Notice that there are three different implementations in that library based on three different references. The default uses the algorithm from McInturff and Simon, 1991

ryanaloomis commented 6 years ago

For what it's worth, I wrote some visibility sampling that interfaces natively with CASA measurement sets, using the traditional pixelated image fourier transform algorithm implemented in MIRIAD. It's much faster than the CASA visibility generation, and produces identical results.

The file-handling portions of the code that read-in and write-out measurement sets might be useful if you want to implement something similar for direct visibility generation.

ryanaloomis commented 6 years ago

Looks like the link to the code didn't send properly... https://github.com/AstroChem/vis_sample

mcordiner commented 6 years ago

Sorry, I'm a bit confused - I thought the Brinch & Dullemond method was a Fourier method, yet from reading the above explanations, it seems that no Fourier transform was involved in the 1.4 implementation...?

Anyway, it sounds like getting fluxes as a function of u,v wouldn't be that hard. This would actually be very useful for my current application so I might have a go. Christian, can you clarify the meaning of nc,nt,nd,X,Y,Z,TR in your trixels/Fourier subroutine?

mcordiner commented 6 years ago

Thanks @ryanaloomis, could be useful! I was never very happy about having to fire up CASA to do this, and taking the relevant uv points directly from the ms is a clever idea.

christianbrinch commented 6 years ago

Let me see... In this code, I needed to plot the resulting Fourier transforms in order to compare with FFT, so I needed to mimic the FFT. ND is the number of pixels per dimension in the (u,v)-plane. X, Y are (x,y) in the image plane and Z is the intensity. TR is the array that holds the triangular patches. nt is the number of triangles and nc... possibly not used?

EDIT: In essence, (1,...,ND;1,...,ND) are the (u,v) points, in this case chosen to be equal to what FFT provides.

christianbrinch commented 6 years ago

Btw, yes, Brinch and Dullemond is a Fourier method, but it relies on a set of triangulated intensity points. We prep'ed LIME by doing the "trixel" image part, but never got to the Fourier transformation part. Money ran out, I guess...

lime-rt / lime

problem in LIME raytracer #266