Closed zengxizhou closed 3 years ago
It is unlikely to be about the data size. It looks like a numerical stability issue (e.g. the kernel matrix is not positive definite). Without the training data, it is hard to find the root cause.
Dr. Li, thanks a lot for taking the time to address my question. I'm trying to use your SMILE package for a high-profile project. For your reference, I have attached my Java code (KrigingInterpolationValidator.txt) and the .csv data file that it reads as input points. Setting static int targetArrayLength = 8000; leads to inexplicably huge values. Below 7000, it looks OK. Also, FYI, I tried the RBFInterpolation2D class with the same input data, and it produces incorrect interpolated values even at a much smaller targetArrayLength. I really appreciate your help. Regards, Zengxi Zhou
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.Scanner;
import java.util.stream.Stream;

import smile.interpolation.KrigingInterpolation2D;
import smile.interpolation.RBFInterpolation2D;
import smile.math.rbf.GaussianRadialBasis;

/**
 * @author Haifeng Li
 */
public class KrigingInterpolationValidator {
    static SimpleDateFormat sweeptimeFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
    // in milliseconds
    static long startTime = -1;
    // SimpleDateFormat inputFormat = new
    // SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");

    private static int timeBucketSizeMills = 5 * 60 * 1000;

    // private static int indexSweeptime = 1;
    private static int indexValue = 0;
    private static int indexLat = 1;
    private static int indexLon = 2;

    private static int totalLine = 0;
    private static int totalFile = 0;

    // static int targetArrayLength = 5000;
    static int targetArrayLength = 8000;

    static int arrayIndex = 0;

    static double[] lon = new double[targetArrayLength];
    static double[] lat = new double[targetArrayLength];
    static double[] value = new double[targetArrayLength];
    private static ArrayList

    private double[] x1;
    private double[] x2;
    private double[] yvi;
    private double alpha;
    private double beta;

    /**
     * 1 ≤ β < 2. A good general
     */
    public static int processCsvFile(File file) {
        // Reader for the CSV file
        BufferedReader fileReader = null;
        String line = null;
        // Delimiter used in CSV file
        final String DELIMITER = ",";
        try {
            fileReader = new BufferedReader(new FileReader(file));
            // The first line of file
            // if ((line = fileReader.readLine()) != null) {
            // System.out.println(line);
            // }
            // Read the file line by line
            while ((line = fileReader.readLine()) != null) {
                // Get all tokens available in line
                String[] tokens = line.split(DELIMITER);
                // for (String token : tokens) {
                // Print all tokens
                // Date sweeptime = sweeptimeFormat.parse(tokens[indexSweeptime]);
                // long delta = sweeptime.getTime() - startTime;
                // inBucket = (delta >= 0) && (delta < timeBucketSizeMills);
                totalLine++;
                if (totalLine % 50000 == 0) {
                    // System.out.println("---- " + tokens[indexValue] + " "
                    // + tokens[indexLat] + " " + tokens[indexLon]);
                }
                // Columns: value, latitude, longitude
                value[arrayIndex] = Double.parseDouble(tokens[indexValue]);
                lat[arrayIndex] = Double.parseDouble(tokens[indexLat]);
                lon[arrayIndex] = Double.parseDouble(tokens[indexLon]);
                arrayIndex++;
                if (arrayIndex == targetArrayLength) {
                    System.out.println("******************************");
                    System.out.println("******************************");
                    return 1;
                }
            }
        } catch (Exception e) {
            System.out.println("Error in KrigingInterpolation.java:" + e.getMessage());
        } finally {
            // Close the reader on every exit path, including the early return above
            try {
                if (fileReader != null) {
                    fileReader.close();
                }
            } catch (Exception e) {
                // ignore a failure to close
            }
        }
        return 0;
    }
    public static void main(String[] args) throws ParseException {
        String validationDataFile = "d:/fromJavaInteresting.csv";
        File file = new File(validationDataFile);
        processCsvFile(file);
// String formattedDate = outputFormat.format(date);
System.out.println(startTime); // prints -1 (startTime is never reassigned)
System.out.println("totalLine in the time bucket=" + totalLine);
double valueMax = -9999;
double valueMin = 99999;
double lonMax = -9999;
double lonMin = 9999;
double latMax = -9999;
double latMin = 9999;
for (int i = 0; i < targetArrayLength; i++) {
if (value[i] > valueMax) {
valueMax = value[i];
}
if (value[i] < valueMin) {
valueMin = value[i];
}
if (lon[i] > lonMax) {
lonMax = lon[i];
}
if (lon[i] < lonMin) {
lonMin = lon[i];
}
if (lat[i] > latMax) {
latMax = lat[i];
}
if (lat[i] < latMin) {
latMin = lat[i];
}
// System.out.println(lon[i] + " " + lat[i] + " " + value[i]);
}
System.out.println(" min and max values: " + valueMin + "==>" + valueMax);
System.out.println(" min and max longitube: " + lonMin + "==>" + lonMax);
System.out.println(" min and max latitube: " + latMin + "==>" + latMax);
// KrigingInterpolationValidator krigingInterpolation = new
// KrigingInterpolationValidator(lon, lat, value);
KrigingInterpolation2D krigingInterpolation = new KrigingInterpolation2D(lon, lat, value);
// RBFInterpolation2D rbfInterpolation2D = new RBFInterpolation2D(lon, lat,
// value, new GaussianRadialBasis());
System.out.println("interpolated value= " + krigingInterpolation.interpolate(-85.327, 30.699));
    }
}
I don't see the CSV file. Anyway, my guess is that there are duplicate points in your data, or that some points are so close to each other that the kernel matrix is nearly singular. Please check that first.
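The check suggested above can be sketched in a few lines of plain Java. This is only an illustration: `DuplicatePointCheck` and its methods are hypothetical names, not part of SMILE. The coordinates in `main` are three rows taken from the 113-row sample posted later in this thread, where two rows share the location 25.797297, -81.981979 but carry different values (3.833333 and 10.5).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of SMILE): flags coincident or nearly coincident 2-D points. */
final class DuplicatePointCheck {

    /** Counts points whose (x1, x2) pair exactly repeats an earlier point. */
    static int countExactDuplicates(double[] x1, double[] x2) {
        Map<String, Integer> firstSeen = new HashMap<>();
        int duplicates = 0;
        for (int i = 0; i < x1.length; i++) {
            // Key on the exact bit patterns so only identical coordinates collide.
            String key = Double.doubleToLongBits(x1[i]) + ":" + Double.doubleToLongBits(x2[i]);
            if (firstSeen.putIfAbsent(key, i) != null) {
                duplicates++;
            }
        }
        return duplicates;
    }

    /** Smallest Euclidean distance between any two points; O(n^2), acceptable for a few thousand points. */
    static double minPairwiseDistance(double[] x1, double[] x2) {
        double min = Double.POSITIVE_INFINITY;
        for (int i = 0; i < x1.length; i++) {
            for (int j = i + 1; j < x1.length; j++) {
                min = Math.min(min, Math.hypot(x1[i] - x1[j], x2[i] - x2[j]));
            }
        }
        return min;
    }

    public static void main(String[] args) {
        double[] lat = {25.797297, 25.797297, 25.797297};
        double[] lon = {-81.981979, -81.981979, -81.900902};
        System.out.println("exact duplicates: " + countExactDuplicates(lon, lat));
        System.out.println("min pairwise distance: " + minPairwiseDistance(lon, lat));
    }
}
```

A minimum distance of zero, or one that is tiny relative to the grid spacing, points to the near-singular case described above.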
Dr. Li, I think that something is missing in your code. I have a data file with only 113 lines of data, and it produced a huge, incorrect interpolated value. If I shuffle the data points, it can even lead to a singular matrix. The data are from almost regularly spaced grid points, ordered by latitude and longitude. I did not see any co-located or very close data points. I have appended the 113 lines of data at the end of this email. I'd really appreciate your help. Thanks and regards, Zengxi

-----------------------------------------------------------------------------
7.5 , 25.725225 , -82.013512
6.0 , 25.725225 , -82.00901
4.875 , 25.725225 , -82.0
0.0 , 25.725225 , -81.968468
1.0 , 25.725225 , -81.963966
5.0 , 25.725225 , -81.954956
6.5 , 25.725225 , -81.945946
3.0 , 25.725225 , -81.941444
1.5 , 25.725225 , -81.88739
5.875 , 25.729731 , -82.013512
8.0 , 25.729731 , -82.0
2.5 , 25.729731 , -81.909912
1.5 , 25.729731 , -81.810814
6.0 , 25.734234 , -82.013512
0.5 , 25.734234 , -81.909912
2.5 , 25.734234 , -81.810814
6.0 , 25.734234 , -81.770271
4.75 , 25.734234 , -81.765762
5.0 , 25.738739 , -82.004501
6.25 , 25.738739 , -81.801804
4.5 , 25.743244 , -82.004501
2.5 , 25.743244 , -81.801804
5.5 , 25.743244 , -81.729729
3.25 , 25.747747 , -81.891891
6.5 , 25.747747 , -81.779282
5.0 , 25.747747 , -81.774773
0.5 , 25.752253 , -81.846848
0.5 , 25.756756 , -81.986488
2.0 , 25.756756 , -81.963966
2.0 , 25.756756 , -81.959457
2.5 , 25.756756 , -81.846848
9.0 , 25.756756 , -81.783783
4.75 , 25.756756 , -81.779282
2.5 , 25.756756 , -81.774773
2.0 , 25.761261 , -81.986488
5.0 , 25.761261 , -81.963966
5.0 , 25.761261 , -81.959457
10.5 , 25.761261 , -81.783783
3.5 , 25.761261 , -81.729729
3.5 , 25.765766 , -82.00901
3.5 , 25.765766 , -81.873871
9.5 , 25.765766 , -81.86937
4.166667 , 25.765766 , -81.729729
4.0 , 25.770269 , -81.882881
8.5 , 25.770269 , -81.87838
4.0 , 25.770269 , -81.86937
6.5 , 25.770269 , -81.779282
4.5 , 25.770269 , -81.774773
5.0 , 25.774775 , -82.0
7.5 , 25.774775 , -81.873871
7.5 , 25.774775 , -81.86937
2.0 , 25.77928 , -82.0
2.75 , 25.788288 , -81.779282
6.0 , 25.792793 , -81.900902
6.75 , 25.792793 , -81.896393
3.833333 , 25.797297 , -81.981979
10.5 , 25.797297 , -81.981979
6.0 , 25.797297 , -81.900902
5.5 , 25.806307 , -81.950447
4.833333 , 25.806307 , -81.945946
2.0 , 25.81081 , -81.950447
3.5 , 25.81081 , -81.842339
8.0 , 25.81081 , -81.837837
1.0 , 25.815315 , -81.792793
1.0 , 25.81982 , -81.797295
1.0 , 25.81982 , -81.792793
3.5 , 25.824324 , -81.774773
3.5 , 25.824324 , -81.770271
3.5 , 25.828829 , -81.995499
7.375 , 25.828829 , -81.99099
4.5 , 25.833334 , -81.995499
4.5 , 25.833334 , -81.828827
1.0 , 25.837837 , -81.99099
3.0 , 25.837837 , -81.747749
4.0 , 25.842342 , -81.99099
6.0 , 25.842342 , -81.828827
7.0 , 25.842342 , -81.824326
5.75 , 25.842342 , -81.797295
1.25 , 25.842342 , -81.747749
2.625 , 25.851351 , -81.945946
3.5 , 25.851351 , -81.761261
7.0 , 25.855856 , -81.945946
4.0 , 25.855856 , -81.797295
6.5 , 25.855856 , -81.761261
1.75 , 25.860361 , -81.945946
2.5 , 25.860361 , -81.797295
3.0 , 25.860361 , -81.788292
5.0 , 25.860361 , -81.783783
4.5 , 25.878378 , -81.963966
4.5 , 25.882883 , -81.810814
9.5 , 25.887388 , -81.810814
7.25 , 25.887388 , -81.788292
5.0 , 25.887388 , -81.774773
5.0 , 25.887388 , -81.747749
2.5 , 25.887388 , -81.74324
1.5 , 25.891891 , -81.788292
7.0 , 25.891891 , -81.774773
3.8 , 25.896397 , -81.74324
1.5 , 25.896397 , -81.738739
4.75 , 25.914415 , -81.788292
7.5 , 25.914415 , -81.783783
3.75 , 25.914415 , -81.770271
3.5 , 25.941441 , -81.842339
3.0 , 25.950451 , -81.950447
5.5 , 25.950451 , -81.945946
3.0 , 25.954954 , -81.86937
5.0 , 25.959459 , -81.873871
5.0 , 25.959459 , -81.86937
2.75 , 25.972973 , -81.774773
8.5 , 25.981981 , -81.779282
4.0 , 25.986486 , -81.779282
6.0 , 26.004505 , -81.819817
5.0 , 26.004505 , -79.815315
-----------------------------------------------------------------------------
The linear system is singular on your data. I made some changes to handle it. You can try the master branch. You should also use a smaller beta (e.g. 1.1).
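For readers who cannot move to the master branch, one possible workaround is to collapse co-located samples into a single averaged sample before fitting. This is an assumption made for illustration, not the change that was made in the library, and `CoincidentPointMerger` is a hypothetical name rather than a SMILE class. Whether averaging is appropriate depends on what the repeated measurements mean in your data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of SMILE): collapses co-located samples to one averaged sample. */
final class CoincidentPointMerger {

    /** Returns rows {x1, x2, mean of y}, one per distinct (x1, x2), in first-seen order. */
    static double[][] mergeByMean(double[] x1, double[] x2, double[] y) {
        // Accumulator per location: {x1, x2, sum of y, count}
        Map<String, double[]> acc = new LinkedHashMap<>();
        for (int i = 0; i < y.length; i++) {
            String key = Double.doubleToLongBits(x1[i]) + ":" + Double.doubleToLongBits(x2[i]);
            double[] a = acc.get(key);
            if (a == null) {
                a = new double[] {x1[i], x2[i], 0.0, 0.0};
                acc.put(key, a);
            }
            a[2] += y[i];
            a[3] += 1.0;
        }
        double[][] merged = new double[acc.size()][];
        int row = 0;
        for (double[] a : acc.values()) {
            merged[row++] = new double[] {a[0], a[1], a[2] / a[3]};
        }
        return merged;
    }

    public static void main(String[] args) {
        // Three rows from the 113-row sample in this thread; the first two share a location.
        double[] lon = {-81.981979, -81.981979, -81.900902};
        double[] lat = {25.797297, 25.797297, 25.797297};
        double[] value = {3.833333, 10.5, 6.0};
        for (double[] r : mergeByMean(lon, lat, value)) {
            System.out.println(r[0] + ", " + r[1] + ", " + r[2]);
        }
    }
}
```

The merged columns can then be unpacked into the three arrays that the interpolation constructor expects.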
Dr. Li, that's great. Thanks a lot for your help and guidance. Best regards, Zengxi
My email: zengxiz@yahoo.com Thank you very much for your help.
Case 1: good result, 26.391575268331053
min and max values: 3.0==>50.5
min and max longitube: -88.0383==>-82.6822
min and max latitube: 27.2616==>32.833
arrayLength=7000
value[targetArrayLength-1]=27.5
lon[targetArrayLength-1]=-87.381
lat[targetArrayLength-1]=29.5112
interpolated value= 26.391575268331053

Case 2: bad result, huge value 9.2026976532600294E17
min and max values: 3.0==>50.5
min and max longitube: -88.039==>-82.6822
min and max latitube: 27.2616==>32.833
arrayLength=8000
interpolated value= 9.2026976532600294E17

=====================
System.out.println(" min and max values: " + valueMin + "==>" + valueMax);
System.out.println(" min and max longitube: " + lonMin + "==>" + lonMax);
System.out.println(" min and max latitube: " + latMin + "==>" + latMax);
KrigingInterpolation2D krigingInterpolation = new KrigingInterpolation2D(lon, lat, value);
......
System.out.println("interpolated value= " + krigingInterpolation.interpolate(-85.327, 30.699));
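The comparison made by eye in the two cases above (interpolated value versus the min and max of the input values) could be automated with a small guard. This is a rough sketch under stated assumptions: `InterpolationSanityCheck` is a hypothetical name, and the 50% margin is an arbitrary choice, since an interpolant may legitimately overshoot the data range by some amount.

```java
/** Hypothetical helper (not part of SMILE): flags interpolated values that are implausible for the data. */
final class InterpolationSanityCheck {

    /** Returns true if v is finite and lies within [min - margin*range, max + margin*range]. */
    static boolean isPlausible(double v, double min, double max, double margin) {
        if (!Double.isFinite(v)) {
            return false;
        }
        double slack = margin * (max - min);
        return v >= min - slack && v <= max + slack;
    }

    public static void main(String[] args) {
        double min = 3.0, max = 50.5; // value range reported in both cases
        System.out.println(isPlausible(26.391575268331053, min, max, 0.5));    // Case 1
        System.out.println(isPlausible(9.2026976532600294E17, min, max, 0.5)); // Case 2
    }
}
```

With the numbers reported above, Case 1 passes the check and Case 2 fails it.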