Closed brandy125 closed 4 years ago
You can use triple back-ticks to indicate code fragments, and get IDL syntax highlighting:
```idl
print, "hello world"
gives
```idl
print, "hello world"
Thank you slayoo!
hi @brandy125 , if I do num=10
and then
a=randomn(seed,num,num)
a(num * 0.2:num * 0.4,num * 0.2:num * 0.4)=-2 ;this points should be interpolated with trigrid
tmp=where(a ne -2)
ind=array_indices(a,tmp)
triangulate,ind(0,0:num-1),ind(1,0:num-1),triangles
with IDL, it says:
IDL> triangulate,ind(0,0:num-1),ind(1,0:num-1),triangles
% TRIANGULATE: Points are co-linear, no solution.
% Execution halted at: $MAIN$
So the example needs to be reviewed.
There are 2 special points to keep in mind: even is seed is the same (say seed=33) the random generator of GDL, while still the same "mersenne twsiter" generator as IDL's, is a parallell opensource optimized version that does not produce the random numbers in the same order.
To mimic exactly IDL's random values (if necessary) one has to use the (IDL compatible) /ran1 flag of the random[n|u] commands, like in a=randomn(33,num,num,/ran1)
.
The second point is known, and should have been cured since (my fault), there is no filtering of duplicate points in GDL's implementation, and the triangulation code (incidentally, the very same as IDL's) absolutely dislikes identical points in a triangulation mesh.
This simple thing works:
; Make 50 normal x, y points:
x = RANDOMN(seed, 50)
y = RANDOMN(seed, 50)
; Make the Gaussian:
z = EXP(-(x^2 + y^2))
PLOT, x, y,psym=1
TRIANGULATE, x, y, tr, b
CONTOUR, TRIGRID(x, y, z, tr),/noeras
wait,2
SURFACE, TRIGRID(x, y, z, tr)
Hi Gilles,
thank you for your comments.
I should have been more clear in my description and in principal avoid using random numbers if they are not necessary for an example. What I wanted to show is that a) the triangles result from GDL is different from IDL (it is bigger and has strange triangles) and b) the triangles result from GDL causes NAN in the trigrid results
Here is a more simple version that shows what I wanted to show:
num=5
a=fltarr(num,num)+1
a(2,2)=-2 ;this point should be interpolated with trigrid
tmp=where(a ne -2)
ind=array_indices(a,tmp)
triangulate,ind(0,*),ind(1,*),triangles
tmp2=trigrid(ind(0,*), ind(1,*), a(tmp), triangles,[1l,1l],[0l,0l,num-1l,num-1l])
print,a
GDL> print,a
1.00000 1.00000 1.00000 1.00000 1.00000
1.00000 1.00000 1.00000 1.00000 1.00000
1.00000 1.00000 -2.00000 1.00000 1.00000
1.00000 1.00000 1.00000 1.00000 1.00000
1.00000 1.00000 1.00000 1.00000 1.00000
GDL> print,tmp2
-NaN -NaN -NaN -NaN -NaN
-NaN -NaN -NaN -NaN -NaN
-NaN -NaN -NaN -NaN -NaN
-NaN -NaN -NaN -NaN -NaN
-NaN -NaN -NaN -NaN -NaN
Since your example is working as expected, what could be the error that causes the NAN in my deterministic example? ...and is there a chance to repair it?
Thanks,
After I have learned from @slayoo how to post code correctly I reviewed and corrected my first post and saw that it contains errors that lead to the behaviour that @GillesDuvert explained in his post.
Let me post a more simple example avoiding random numbers here:
1) IDL:
num=3
a=fltarr(num,num)+1
a(1,1)=-2 ;this points should be interpolated with trigrid
tmp=where(a ne -2)
ind=array_indices(a,tmp)
triangulate,ind(0,*),ind(1,*),triangles
tmp2=trigrid(ind(0,*), ind(1,*), a(tmp), triangles,[1l,1l],[0l,0l,num-1l,num-1l])
IDL> print,a
1.00000 1.00000 1.00000
1.00000 -2.00000 1.00000
1.00000 1.00000 1.00000
IDL> print,tmp2
1.00000 1.00000 1.00000
1.00000 1.00000 1.00000
1.00000 1.00000 1.00000
2) GDL:
num=3
a=fltarr(num,num)+1
a(1,1)=-2 ;this points should be interpolated with trigrid
tmp=where(a ne -2)
ind=array_indices(a,tmp)
triangulate,ind(0,*),ind(1,*),triangles
tmp2=trigrid(ind(0,*), ind(1,*), a(tmp), triangles,[1l,1l],[0l,0l,num-1l,num-1l])
GDL> print,a
1.00000 1.00000 1.00000
1.00000 -2.00000 1.00000
1.00000 1.00000 1.00000
GDL> print,tmp2
-NaN -NaN -NaN
-NaN -NaN -NaN
-NaN -NaN -NaN
As I said in my first post, using the triangulate result from IDL for the trigrid function in GDL it works, so the problem is the triangulate procedure.
OK @brandy125 , with that we'll find the culprit!
the problem is around lines 573 in "triangulation.cpp"
in "C" term, what should happen when denominator is zéro ? And also when the numerator is zéro too ?
we may have also a problem for B when delx10 is zéro.
I just add two comment lines here
cout <<(delx21 * delz10 - delx10 * delz21) << endl;
cout <<(delx21 * dely10 - delx10 * dely21)<< " "<< C << " " << B << " " << A << endl;
and on the example provided by @brandy125 I got:
GDL> @test_brandy125.pro
% Compiled module: ARRAY_INDICES.
2 2 3 3
-0
-0 -nan -nan -nan
-0
-1 0 0 1
-0
-2 0 0 1
0
-1 -0 0 1
-0
-2 0 0 1
-0
-1 0 0 1
-0
-1 0 -nan -nan
CQFD (as we say in french ;)
Related: #536
Triangulate() was using the code of Renka in a too direct way. IDL is much more acceptant of duplicate and colinear points AND preserves the data. A patch will be preseneted ASAP. In particular null trianges (with zero area) should be properly eliminated. Thanks for reminding us this was a job half-done.
patched by #807 Please open a new issue in case of trouble.
Thank you for the patch. It is much better now but still not as in IDL. The last element of my example still is NAN:
num=3 a=fltarr(num,num)+1 a(1,1)=-2 ;this points should be interpolated with trigrid tmp=where(a ne -2) ind=array_indices(a,tmp) triangulate,ind(0,),ind(1,),triangles tmp2=trigrid(ind(0,), ind(1,), a(tmp), triangles,[1l,1l],[0l,0l,num-1l,num-1l]) GDL> print,a 1.00000 1.00000 1.00000 1.00000 -2.00000 1.00000 1.00000 1.00000 1.00000 GDL> print,tmp2 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 -NaN
Any chance to fix this as well?
On 1. Jul 2020, at 06:15, GillesDuvert notifications@github.com wrote:
patched by #807 https://github.com/gnudatalanguage/gdl/pull/807 Please open a new issue in case of trouble.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/gnudatalanguage/gdl/issues/741#issuecomment-652298772, or unsubscribe https://github.com/notifications/unsubscribe-auth/AOC5K6FOK3XJIOH73HB7BILRZL5BNANCNFSM4MPJAYXQ.
Hi @brandy125 thanks for the reply. Indeed there is still a bug due to the 3 first point colinear, the indexes are perturbed. Easy patch.
Pushed #810
merged #810 please test and very many thanks!
Many thanks back to you for working on this! Unfortunately it works only for n=3. In case of n=4 it produces already many NAN:
bash$ ./gdl
GDL - GNU Data Language, Version 1.0.0-rc.3 git
GDL> num=3
GDL> a=fltarr(num,num)+1
GDL> a(1,1)=-2 ;this points should be interpolated with trigrid
GDL> tmp=where(a ne -2)
GDL> ind=array_indices(a,tmp)
GDL> triangulate,ind(0,*),ind(1,*),triangles
GDL> tmp2=trigrid(ind(0,*), ind(1,*), a(tmp), triangles,[1l,1l],[0l,0l,num-1l,num-1l])
GDL> print,a
1.00000 1.00000 1.00000
1.00000 -2.00000 1.00000
1.00000 1.00000 1.00000
GDL> print,tmp2
1.00000 1.00000 1.00000
1.00000 1.00000 1.00000
1.00000 1.00000 1.00000
GDL>
GDL> num=4
GDL> a=fltarr(num,num)+1
GDL> a(1,1)=-2 ;this points should be interpolated with trigrid
GDL> tmp=where(a ne -2)
GDL> ind=array_indices(a,tmp)
GDL> triangulate,ind(0,*),ind(1,*),triangles
GDL> tmp2=trigrid(ind(0,*), ind(1,*), a(tmp), triangles,[1l,1l],[0l,0l,num-1l,num-1l])
GDL> print,a
1.00000 1.00000 1.00000 1.00000
1.00000 -2.00000 1.00000 1.00000
1.00000 1.00000 1.00000 1.00000
1.00000 1.00000 1.00000 1.00000
GDL> print,tmp2
1.00000 1.00000 -NaN -NaN
1.00000 -NaN -NaN -NaN
-NaN -NaN -NaN -NaN
-NaN -NaN -NaN -NaN
GDL> ```
Could you please have a look again to the code?
@brandy125 :blush: it is indeed difficult to persuade the legacy "tripack" code to behave correctly in presence of N (n=4 here) aligned points at the beginning of the sequence... I will have to change the algorithm more profoundly. Stay tuned.
@brandy125 this should do it I hope.
reopen,was closed automatically
@GillesDuvert it's perfect! Thank you so much!
Let's keep it open until we have an automated test to avoid regression
@slayoo you sleep sometimes? :smile:
you mean, adding test cases to test_triangulate.pro ?
you mean, adding test cases to _testtriangulate.pro ?
I mean adding something somewhere that causes "make check" to fail with GDL compiled before the fix, and passes when the fix is in :)
... and something that will fail in case any future changes in the GDL code would make the examples mentioned above fail again.
in other terms, add the example above, that should give 1.0 everywhere. Not quite the complete story, but close.
I have a question regarding speed of trigrid and triangulate. I was testing the speed for triangulate and trigrid using the example above and compared IDL with GDL and got following results:
Here the program:
function tritest,num,dur
t=systime(1)
a=fltarr(num,num)+1
a(num/2,num/2)=-2 ;this points should be interpolated with trigrid
tmp=where(a ne -2)
ind=array_indices(a,tmp)
triangulate,ind(0,*),ind(1,*),triangles
tmp2=trigrid(ind(0,*), ind(1,*), a(tmp), triangles,[1l,1l],[0l,0l,num-1l,num-1l])
return,systime(1)-t
end
Seems that GDL is around 100-1000 times slower than IDL (and even more for bigger arrays). Is there a chance to improve the speed of triangulate and trigrid? Basically trigrid is very slow for bigger arrays.
@brandy125 IDL uses the divide-and-conquer method more recent and notoriously more efficient than the one (Renka) implemented in GDL. As usual our main goal is to provide compatibility and accurate results, then optimisation comes next. However there is still a margin of progression without using another algorithm. with this code:
(removed as wrong)
I get:
(idem, but summary below still holds)
In summary, triangulation code is optimized for non-periodic positions, grid positions being the worst possible case. I would argue your example is not representative of a 'normal' useage of TRIANGULATE. Still, IDL does better for random positions (because of the divide-and-conquer method), but GDL should handle gridded positions more, eh, carefully. There is plenty of room for improvement.
@GillesDuvert I agree that compatibility and accurate results is first priority.
If you have an idea how I can speed up some interpolations that I have to do on big regular arrays, please let me know. That scenario with a 1000x1000 regular image with only few points to interpolate is something that I would need. Is there a chance to mix the indices to speed it up?
Hello, I'm afraid there is not much else to do, unless you are satisfied by the results of
a[num/2,num/2]=!values.f_nan ; signal it is a bad value
b=median(a,3)
that works well on the example of course since there is no noise and only one pixel is blanked.
in fact the benchmark code above was wrong in terms of size of the problem. The good one is
pro benchmark_triangulate
for num=2,6,2 do begin
size=10LL^num-1
x=randomu(seed, size, /double, /ran1)
n=n_elements(x)
y=randomu(seed, size, /double, /ran1)
clock=tic("random "+strtrim(n,2)+" pts")
triangulate,x,y ,tr, b,conn=c
toc,clock
endfor
for num=2,6,2 do begin
size=sqrt(10LL^num)
a=fltarr(size,size)+1
a(size/2,size/2)=-2 ;this points should be interpolated with trigrid
tmp=where(a ne -2)
ind=double(array_indices(a,tmp))
x=reform(ind(0,*)) & y=reform(ind(1,*))
n=n_elements(x)
clock=tic("grid "+strtrim(n,2)+" pts")
triangulate,x,y ,tr, b,conn=c
toc,clock
endfor
for num=2,6,2 do begin
size=sqrt(10LL^num)
a=fltarr(size,size)+1
a(size/2,size/2)=-2 ;this points should be interpolated with trigrid
tmp=where(a ne -2)
ind=double(array_indices(a,tmp))
x=double(reform(ind[0,*])) & y=double(reform(ind[1,*]))
n=n_elements(x)
p=randomu(seed,n,/doub)-0.5d & x+=p*1e-6
p=randomu(seed,n,/doub)-0.5d & y+=p*1e-6
clock=tic("grid+rnd "+strtrim(n,2)+ " pts")
triangulate,x,y ,tr, b,conn=c
toc,clock
endfor
end
IDL benchmark is:
IDL> benchmark_triangulate
% Time elapsed random 99 pts: 0.00021982193 seconds.
% Time elapsed random 9999 pts: 0.017893076 seconds.
% Time elapsed random 999999 pts: 2.0584621 seconds.
% Time elapsed grid 99 pts: 0.0026829243 seconds.
% Time elapsed grid 9999 pts: 0.015437126 seconds.
% Time elapsed grid 999999 pts: 11.099156 seconds.
% Time elapsed grid+rnd 99 pts: 0.0055999756 seconds.
% Time elapsed grid+rnd 9999 pts: 0.010852098 seconds.
% Time elapsed grid+rnd 999999 pts: 1.8512330 seconds.
Which shows first that IDL is (as always) excellent in terms of energy saving and thus much greener than Python :wink: Second, adding some randomness makes wonders Third, GDL results are simply awful and GDL user should not try to run this benchmark, it will never end.
@GillesDuvert thanks for your effort. Actually I want to use the program for big size arrays in the order of 30000 x 30000 points. I am already dividing the array in sub-arrays but as you can imagine the time for interpolating this in GDL is "infinite" as in IDL already takes some hours to do the job.
Actually my big problem is not the triangulate program. Here I use a c-Program from "Jonathan Richard Shewchuk". Maybe you know this program already. If not, have a look to https://github.com/libigl/triangle. This program is only 4 times slower than IDL so it serves for my purpose.
Where I am stuck is the trigrid program. Here I have no alternative and it takes even much longer in GDL compared to triangulate, both compared to IDL.
Do you think there is any chance to speed up trigrid?
@brandy125 this is entirely another story, as I was sure that triangulate was the culprit. I was aware of Jonathan Richard Shewchuk's triangle code, I'm even surprised it is slower than IDL's. I did not use it because the Renka code was providing both 2d and spherical gridding with the same 'interface'. It should be noted that triangulation benefits from a pre-sorting of the data points, that IDL does. Perhaps hence its success. TRIGRID is very simply written and speedup can certainly be expected IMHO by parallelizing and using a divide&conquer methods. At the moment it is a simple loop. It's really a subject for a trainee, simple problem with great potential. @alaingdl , any idea? But then, there is no clear benchmark of TRIGRID showing slower than IDL's.
@brandy125 you should explain why a median filtering would not be sufficient (it's almost instantaneous, compared). Besides, as you know the regions where you want an interpolation, just triangulate the small patch of points around the 'bad' values.
@GillesDuvert regarding median filtering, the holes that I have to fill up are up to let's say 1-20 pixels in radius. A median filter of that size would change the data where are no holes. The example above with 1 pixel was only for simplifying the code to show here. If there is a chance to speed up by pre-sorting, could you give me an example of how to do? Hope that anybody has an idea of speeding up trigrid. It is much worse in speed than triangulation but gives very nice results.
@GillesDuvert regarding interpolate just the regions that I want to interpolate, I have holes from 1 to approx 1000 pixels in radius. In IDL I use label_region to sort them in size and do just the work where it needs to be done. In GDL I did not find this procedure. Is there any replacement for label_region?
I'm afraid label-region is not yet in GDL :frowning_face:
The benchmarks are so ridiculous, I'm looking into a new triangulation algorithm that would leave IDL very very far behind...
@GillesDuvert if you succeed in that ....I would be very very happy
@brandy125 already git clone -b triangulation git@github.com:GillesDuvert/gdl.git
is on par with IDL...
@GillesDuvert how the hell did you do that? I am impressed and very happy! Congratulation!
There are two issues that I observed:
1) The program crashes from time to time with segfault: Error in `gdl_trigrid': corrupted size vs. prev_size: 0x000000000221ea10 2) The program produces (quite) wrong results still while the results from the old program were quite identical to IDL
But we are on the right way! Speed is no issue any more.
Hi, I believe speed can be improved a little more. Essentially, the trick is to avoid costly computations and select adequate ranges of pixels . Perhaps too adequate, since you seem to get wrong results. My tests are probably too naive. Would you provide an example of wrong result? Segfault crashes we'll see to them afterwards.
@brandy125 I've a fast and seemingly accurate version now at git clone -b triangulation git@github.com:GillesDuvert/gdl.git
.
Could tou try it? Thanks.
@GillesDuvert Thanks again for the improved version.
It seems that it still has a problem when the number of elements in each dimension is not equal. I prepared a small program that shows this:
num1=128L & num2=64L
a=(indgen(num1*num2))^2 ; the wrapping (indgen instead of findgen) is desired here
b=reform(a,num1,num2)
c=float(b) & x=where(abs(b)/float(max(abs(b))) lt 0.5) & c(x)=-99999
tmp=where(c ne -99999)
ind=array_indices(c,tmp)
triangulate,ind(0,*),ind(1,*),triangles
tmp2=trigrid(ind(0,*), ind(1,*), c(tmp), triangles,[1l,1l],[0l,0l,num1-1l,num2-1l])
window,1,xsize=num1,ysize=num2
tv,bytscl(c,-1e5,1e5)
window,2,xsize=num1,ysize=num2
tv,bytscl(tmp2,-1e5,1e5)
The two images in the left column show the output of IDL and the images in the right column the results from GDL. The two images on the top show the data to be interpolated and below the results.
If the dimensions are equal the results are identical but as soon as one dimension is bigger, the results are wrong. Could you check this please?
Silly of me, wrong indexes... Pls try again. Thank you for your patience and for your clever testing! Roundoff errors may still be a problem, although I was not able to pinpoint a clear case yet...
Sorry. Just found cases where 'delaunator' makes crazy triangles, so trigrid fails. Too bad, as 'delaunator' was faster than anything else.
@GillesDuvert I was testing the new version and was impressed about the speed and accuracy. It was even faster than IDL. When I tried the program on a bigger dataset I found that sometimes it makes crazy things as you said. Do you think you can fix this?
The routine is the C++ version of a javaScript program and people there are aware of this bug, which arises when points are exactly aligned, which is your case. (Every triangulation program has a bad time with colinear points...) Certainly not a problem in the long run, it will be cured, and in the meantime adding a small random offset (say, 10^-6) to all points positions is the solution. I will fix this one way or another but in the meantime, you probably can work with the present version, just adding randomu(seed, ... ,/double)*1D-6 to the grid positions.
My fault, I was using a source from a clone instead of the real thing, that was updated to cure th problem. With the new version (git clone -b triangulation...
) the delaunator bugs with colinear points should have disappeared.
If everything is OK I'll submit a PR for GDL when I've added a few missing things.
I tested this new version and for my dataset it produces worse results. Could you please test as well?
trigrid is super fast now and produces good results. The problem is triangulation.
The triangulate procedure generates different results compared to IDL and thus causes trigrid to produce arrays full of NAN instead of interpolated values. If we use the triangulate results from IDL then trigrid works as expected.
Here comes three experiments. 1) is using IDL, 2) is using GDL and 3) is using GDL trigrid with IDL triangulation result:
1) IDL:
IDL> print,a 0.0763313 1.30794 0.213978 1.79363 0.807151 -1.08356 -2.00000 -2.00000 1.23840 0.0354341 -2.25286 -2.00000 -2.00000 0.343340 -1.05093 0.352284 -0.358769 -0.176580 -0.654924 0.177781 1.31091 -0.996503 0.420700 0.781461 0.0256880 IDL> print,tmp2 0.0763313 1.30794 0.213978 1.79363 0.807151 -1.08356 -0.309574 0.464411 1.23840 0.0354341 -2.25286 -0.906525 0.439814 0.343340 -1.05093 0.352284 -0.358769 -0.176580 -0.654924 0.177781 1.31091 -0.996503 0.420700 0.781461 0.0256880
2) GDL:
GDL> print,a -0.813149 -1.54935 -0.414475 0.479349 -0.437172 0.289173 -2.00000 -2.00000 1.05989 0.00470387 -0.801169 -2.00000 -2.00000 -0.962254 0.366669 1.24327 0.475119 0.722479 -0.811135 -0.251257 1.81459 0.752898 1.09253 0.622585 -0.339028 GDL> print,tmp2 -NaN -NaN -NaN -NaN -NaN -NaN -NaN -NaN -NaN -NaN -NaN -NaN -NaN -NaN -NaN -NaN -NaN -NaN -NaN -NaN -NaN -NaN -NaN -NaN -NaN
3) GDL with triangles from IDL:
GDL> print,a 0.788906 0.375375 0.134277 -0.605974 -0.553893 0.0434618 -2.00000 -2.00000 1.49244 -0.118560 -1.28662 -2.00000 -2.00000 0.923461 1.61877 0.108984 -0.0982694 -0.389905 -2.69578 0.946032 -0.208400 -1.42433 -1.31319 -1.29373 -0.118738 GDL> print,tmp2 0.788906 0.375375 0.134277 -0.605974 -0.553893 0.0434618 0.526453 1.00944 1.49244 -0.118560 -1.28662 -0.294766 0.697083 0.923461 1.61877 0.108984 -0.0982694 -0.389905 -2.69578 0.946032 -0.208400 -1.42433 -1.31319 -1.29373 -0.118738
Is this behaviour known?
Sorry for the long post and the frequent updates to it.