haifengl / smile

Statistical Machine Intelligence & Learning Engine
https://haifengl.github.io

KrigingInterpolation2D interpolation got INCORRECT HUGE value while dimension (data points) reach 8000 #615

Closed zengxizhou closed 3 years ago

zengxizhou commented 3 years ago

My email: zengxiz@yahoo.com Thank you very much for your help.

Case 1: good result, 26.391575268331053

    min and max values: 3.0==>50.5
    min and max longitube: -88.0383==>-82.6822
    min and max latitube: 27.2616==>32.833
    arrayLength=7000
    value[targetArrayLength-1]=27.5
    lon[targetArrayLength-1]=-87.381
    lat[targetArrayLength-1]=29.5112
    interpolated value= 26.391575268331053

Case 2: bad result, huge value 9.2026976532600294E17

    min and max values: 3.0==>50.5
    min and max longitube: -88.039==>-82.6822
    min and max latitube: 27.2616==>32.833
    arrayLength=8000
    interpolated value= 9.2026976532600294E17

    System.out.println(" min and max values: " + valueMin + "==>" + valueMax);
    System.out.println(" min and max longitube: " + lonMin + "==>" + lonMax);
    System.out.println(" min and max latitube: " + latMin + "==>" + latMax);
    KrigingInterpolation2D krigingInterpolation = new KrigingInterpolation2D(lon, lat, value);
    ......
    System.out.println("interpolated value= " + krigingInterpolation.interpolate(-85.327, 30.699));


    <dependency>
        <groupId>com.github.haifengl</groupId>
        <artifactId>smile-interpolation</artifactId>
        <version>2.5.3</version>
    </dependency>
    <dependency>
        <groupId>com.github.haifengl</groupId>
        <artifactId>smile-core</artifactId>
        <version>2.5.3</version>
    </dependency>


haifengl commented 3 years ago

This is unlikely to be about the data size. It looks like a numerical stability issue (e.g. the kernel matrix is not positive definite). Without the training data, it is hard to find the root cause.

zengxizhou commented 3 years ago

Dr. Li, thanks a lot for taking the time to address my question. I'm trying to use your SMILE package for a high-profile project. For your reference, I have attached my Java code and the .csv data file, which supplies the input points to KrigingInterpolationValidator.txt (the Java file). Setting static int targetArrayLength = 8000; leads to inexplicably huge values. If it's below 7000, it looks OK. FYI, I also tried the RBFInterpolation2D class with the same input data, but it produces incorrect interpolated values at even smaller targetArrayLength values. I really appreciate your help. Regards, Zengxi Zhou

(In reply to Haifeng Li's comment of Monday, November 9, 2020, 08:04:03 PM EST.)

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileReader;
    import java.io.FileWriter;
    import java.io.IOException;
    import java.text.ParseException;
    import java.text.SimpleDateFormat;
    import java.util.ArrayList;
    import java.util.Date;
    import java.util.Scanner;
    import java.util.stream.Stream;

    import smile.interpolation.KrigingInterpolation2D;
    import smile.interpolation.RBFInterpolation2D;
    import smile.math.rbf.GaussianRadialBasis;

/**

haifengl commented 3 years ago

I don't see the csv file. Anyway, my guess is that there are duplicated points in your data, or that some points are so close to each other that the kernel matrix is close to singular. Please check that first.
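The check suggested above can be done with a brute-force scan over all point pairs. The following is a minimal sketch in plain Java and is not part of SMILE: the class name `DuplicatePointCheck`, the method `findClosePairs`, and the tolerance `1e-9` are assumptions made for illustration. The three rows in `main` are copied from the 113-point sample posted later in this thread, where the first two rows share the same coordinates but carry different values.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a SMILE class): finds coincident or nearly coincident sample points. */
class DuplicatePointCheck {

    /**
     * Returns index pairs (i, j) with i < j whose Euclidean distance in the
     * (x, y) plane is at most tol. Brute force, O(n^2) pair comparisons.
     */
    static List<int[]> findClosePairs(double[] x, double[] y, double tol) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < x.length; i++) {
            for (int j = i + 1; j < x.length; j++) {
                if (Math.hypot(x[i] - x[j], y[i] - y[j]) <= tol) {
                    pairs.add(new int[] {i, j});
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Three rows taken from the 113-point sample in this thread.
        double[] lat = {25.797297, 25.797297, 25.797297};
        double[] lon = {-81.981979, -81.981979, -81.900902};
        double[] value = {3.833333, 10.5, 6.0};

        for (int[] p : findClosePairs(lon, lat, 1e-9)) {
            System.out.printf("points %d and %d coincide: values %s and %s%n",
                    p[0], p[1], value[p[0]], value[p[1]]);
        }
    }
}
```

With a tolerance of 0 only exact duplicates are reported; a larger tolerance also flags points that are merely very close. How to resolve a reported pair (dropping one point, averaging the values) depends on the data and is left open here.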

zengxizhou commented 3 years ago

Dr. Li, I think that something is missing in your code. I have a data file with only 113 lines of data, and it produces a huge, incorrect interpolated value. If I shuffle the data points, it can even lead to a singular matrix. The data are from almost regularly spaced grid points, ordered by latitude and longitude. I did not see any co-located or very close data points. I have appended these 113 lines of data at the end of this email. I'd really appreciate your help. Thanks and regards, Zengxi

    7.5, 25.725225, -82.013512
    6.0, 25.725225, -82.00901
    4.875, 25.725225, -82.0
    0.0, 25.725225, -81.968468
    1.0, 25.725225, -81.963966
    5.0, 25.725225, -81.954956
    6.5, 25.725225, -81.945946
    3.0, 25.725225, -81.941444
    1.5, 25.725225, -81.88739
    5.875, 25.729731, -82.013512
    8.0, 25.729731, -82.0
    2.5, 25.729731, -81.909912
    1.5, 25.729731, -81.810814
    6.0, 25.734234, -82.013512
    0.5, 25.734234, -81.909912
    2.5, 25.734234, -81.810814
    6.0, 25.734234, -81.770271
    4.75, 25.734234, -81.765762
    5.0, 25.738739, -82.004501
    6.25, 25.738739, -81.801804
    4.5, 25.743244, -82.004501
    2.5, 25.743244, -81.801804
    5.5, 25.743244, -81.729729
    3.25, 25.747747, -81.891891
    6.5, 25.747747, -81.779282
    5.0, 25.747747, -81.774773
    0.5, 25.752253, -81.846848
    0.5, 25.756756, -81.986488
    2.0, 25.756756, -81.963966
    2.0, 25.756756, -81.959457
    2.5, 25.756756, -81.846848
    9.0, 25.756756, -81.783783
    4.75, 25.756756, -81.779282
    2.5, 25.756756, -81.774773
    2.0, 25.761261, -81.986488
    5.0, 25.761261, -81.963966
    5.0, 25.761261, -81.959457
    10.5, 25.761261, -81.783783
    3.5, 25.761261, -81.729729
    3.5, 25.765766, -82.00901
    3.5, 25.765766, -81.873871
    9.5, 25.765766, -81.86937
    4.166667, 25.765766, -81.729729
    4.0, 25.770269, -81.882881
    8.5, 25.770269, -81.87838
    4.0, 25.770269, -81.86937
    6.5, 25.770269, -81.779282
    4.5, 25.770269, -81.774773
    5.0, 25.774775, -82.0
    7.5, 25.774775, -81.873871
    7.5, 25.774775, -81.86937
    2.0, 25.77928, -82.0
    2.75, 25.788288, -81.779282
    6.0, 25.792793, -81.900902
    6.75, 25.792793, -81.896393
    3.833333, 25.797297, -81.981979
    10.5, 25.797297, -81.981979
    6.0, 25.797297, -81.900902
    5.5, 25.806307, -81.950447
    4.833333, 25.806307, -81.945946
    2.0, 25.81081, -81.950447
    3.5, 25.81081, -81.842339
    8.0, 25.81081, -81.837837
    1.0, 25.815315, -81.792793
    1.0, 25.81982, -81.797295
    1.0, 25.81982, -81.792793
    3.5, 25.824324, -81.774773
    3.5, 25.824324, -81.770271
    3.5, 25.828829, -81.995499
    7.375, 25.828829, -81.99099
    4.5, 25.833334, -81.995499
    4.5, 25.833334, -81.828827
    1.0, 25.837837, -81.99099
    3.0, 25.837837, -81.747749
    4.0, 25.842342, -81.99099
    6.0, 25.842342, -81.828827
    7.0, 25.842342, -81.824326
    5.75, 25.842342, -81.797295
    1.25, 25.842342, -81.747749
    2.625, 25.851351, -81.945946
    3.5, 25.851351, -81.761261
    7.0, 25.855856, -81.945946
    4.0, 25.855856, -81.797295
    6.5, 25.855856, -81.761261
    1.75, 25.860361, -81.945946
    2.5, 25.860361, -81.797295
    3.0, 25.860361, -81.788292
    5.0, 25.860361, -81.783783
    4.5, 25.878378, -81.963966
    4.5, 25.882883, -81.810814
    9.5, 25.887388, -81.810814
    7.25, 25.887388, -81.788292
    5.0, 25.887388, -81.774773
    5.0, 25.887388, -81.747749
    2.5, 25.887388, -81.74324
    1.5, 25.891891, -81.788292
    7.0, 25.891891, -81.774773
    3.8, 25.896397, -81.74324
    1.5, 25.896397, -81.738739
    4.75, 25.914415, -81.788292
    7.5, 25.914415, -81.783783
    3.75, 25.914415, -81.770271
    3.5, 25.941441, -81.842339
    3.0, 25.950451, -81.950447
    5.5, 25.950451, -81.945946
    3.0, 25.954954, -81.86937
    5.0, 25.959459, -81.873871
    5.0, 25.959459, -81.86937
    2.75, 25.972973, -81.774773
    8.5, 25.981981, -81.779282
    4.0, 25.986486, -81.779282
    6.0, 26.004505, -81.819817
    5.0, 26.004505, -79.815315

(In reply to Haifeng Li's comment of Tuesday, November 10, 2020, 05:00:22 PM EST.)

haifengl commented 3 years ago

The linear system is singular on your data. I made some changes to handle it. You can try the master branch. You should also use a smaller beta (e.g. 1.1).
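For context, and assuming that beta here refers to the exponent of the power variogram that SMILE's documentation describes for Kriging interpolation, the model has the form below, where r is the distance between two points and α is a scale fitted from the data. The stated range for β and the usual default of 1.5 are assumptions based on that documentation and should be verified against the SMILE version in use.

```latex
v(r) = \alpha \, r^{\beta}, \qquad 1 \le \beta < 2
```

Under this reading, the suggested value of 1.1 sits near the lower end of the range, giving a variogram that grows almost linearly with distance.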

zengxizhou commented 3 years ago

Dr. Li, that's great. Thanks a lot for your help and guidance. Best regards, Zengxi

(In reply to Haifeng Li's comment of Friday, November 13, 2020, 06:07:52 PM EST.)