tikigonzo closed this 2 months ago
Now, when testing kNN with the Iris dataset, I get the following error:
Test method MachineLearning.Test_kNN.Test_kNN_RegressionIrisDataset threw exception: System.ArgumentException: The y vector must be the same length as the x matrix.
My dataset is formatted like the housing data you used in the earlier test: 5 arrays, with 4 being the features and 1 being the target. I have a feeling that the kNN algorithm expects rows to be the datapoints and columns to be the features, but I am not sure given the previous dataset. If that is the case, should I rewrite the split algorithm so that its output runs in kNN without changing kNN's inputs or Argument statements?
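To make the suspected shape mismatch concrete, here is a minimal sketch. It assumes kNN compares the number of rows in x against the length of y, which I have not confirmed. The `ShapeSketch` class, the `ToRows` helper, and the three sample values are hypothetical and not part of the library.

```csharp
using System;

public static class ShapeSketch
{
    // Hypothetical helper: converts feature-major data (one array per feature)
    // into row-major data (one array per datapoint).
    public static double[][] ToRows(double[][] featureMajor)
    {
        int features = featureMajor.Length;
        int samples = featureMajor[0].Length;
        var rows = new double[samples][];
        for (int r = 0; r < samples; r++)
        {
            rows[r] = new double[features];
            for (int c = 0; c < features; c++)
            {
                rows[r][c] = featureMajor[c][r];
            }
        }
        return rows;
    }

    public static void Demo()
    {
        // Made-up values laid out like my Iris arrays: 4 feature arrays, 1 target array.
        double[][] features =
        {
            new[] { 5.1, 4.9, 6.3 },
            new[] { 3.5, 3.0, 3.3 },
            new[] { 1.4, 1.4, 6.0 },
            new[] { 0.2, 0.2, 2.5 },
        };
        double[] y = { 0, 0, 2 };

        Console.WriteLine(features.Length == y.Length); // False: 4 arrays vs 3 targets
        double[][] x = ToRows(features);
        Console.WriteLine(x.Length == y.Length);        // True: 3 rows vs 3 targets
    }
}
```

If the assumption holds, transposing once before calling kNN would leave both the split code and the kNN argument checks untouched.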
I updated the test cases to be apples-to-apples with R.
Working on the Train/Test split via extension methods and subsets, using the NextNRIntegers method you sent me. Here is my code for both methods:
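public static int[] NextNRIntegers(Random random, int min, int max, int length)
{
    // Candidate pool: every integer in [min, max).
    var integers = new List<int>();
    for (int i = min; i < max; i++)
    {
        integers.Add(i);
    }
    // Assumed completion (the post shows no return statement): shuffle the pool
    // and keep the first `length` values. Requires `using System.Linq;`.
    return integers.OrderBy(_ => random.Next()).Take(length).ToArray();
}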
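/// <summary>
/// Splits the data into a 70% training subset and a 30% testing subset.
/// </summary>
/// <param name="rng">Random number generator for the indices using NextNRIntegers()</param>
/// <param name="dataSize">Number of datapoints (150 for Iris)</param>
/// <param name="data">Dataset to split</param>
/// <param name="testing"></param>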
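public static double[][] TrainTestSplit(int[] rng, int dataSize, double[][] data, bool testing = false)
{
    // iterate through indices and then split
    //int dataSize = data[0].Length; // number of columns (150)
    int subSampleTraining = (int)Math.Ceiling(0.7 * dataSize); // 70% training split (105)
    int subSampleTesting = dataSize - subSampleTraining; // 45
    // Assumed completion (the post ends here): `testing` selects the last 45 shuffled
    // indices, otherwise the first 105. Requires `using System.Linq;`.
    int[] picked = testing
        ? rng.Skip(subSampleTraining).Take(subSampleTesting).ToArray()
        : rng.Take(subSampleTraining).ToArray();
    // Each array in `data` is one feature (or the target), so pick the same columns from each.
    return data.Select(feature => picked.Select(i => feature[i]).ToArray()).ToArray();
}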
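If kNN does want rows as datapoints, the split could operate on rows directly. Below is a minimal sketch under that assumption. `SplitSketch`, `SplitRows`, and the tuple return shape are hypothetical and not existing code from the library. `shuffled` is assumed to hold every row index exactly once, for example the output of NextNRIntegers over the full range.

```csharp
using System;
using System.Linq;

public static class SplitSketch
{
    // Hypothetical row-wise split: x has one array per datapoint, y one target per datapoint.
    public static (double[][] XTrain, double[] YTrain, double[][] XTest, double[] YTest)
        SplitRows(int[] shuffled, double[][] x, double[] y, double trainFraction = 0.7)
    {
        if (x.Length != y.Length || shuffled.Length != x.Length)
        {
            throw new ArgumentException("x, y and the index array must have the same length.");
        }

        // Same rounding as in TrainTestSplit: 150 rows give 105 training rows.
        int trainCount = (int)Math.Ceiling(trainFraction * x.Length);
        int[] trainIdx = shuffled.Take(trainCount).ToArray();
        int[] testIdx = shuffled.Skip(trainCount).ToArray();

        // Rows and targets are picked with the same indices, so they stay aligned.
        return (trainIdx.Select(i => x[i]).ToArray(),
                trainIdx.Select(i => y[i]).ToArray(),
                testIdx.Select(i => x[i]).ToArray(),
                testIdx.Select(i => y[i]).ToArray());
    }
}
```

Because x and y are indexed together, each returned pair has matching lengths, which is the condition the ArgumentException above complains about.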