3d EAF value produces slightly different output for Windows and Linux version

FergusRooney commented 1 year ago

After re-enabling eaf3d, I have added a log of the output for both versions in this folder.

The values are mostly the same, but differ by insignificant amounts at certain data points. Here I have saved a copy of the eaf3d output for the spherical dataset, recorded as the difference [windows - linux].

The points with discrepancy between platforms include:

103332-10353

17468-17482 18764-18792 19385-19397

edit: See this folder

R package calculated spherical EAF
Spherical dataset with filter_dominated
Windows calculated spherical dataset EAF output
Linux calculated spherical dataset EAF output
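A minimal sketch of how row ranges like the ones listed above could be found from two saved outputs. The function name `differing_row_ranges` and the toy rows are hypothetical, and the sketch assumes both outputs have the same number of rows in the same order; indices are 0-based.

```python
def differing_row_ranges(rows_a, rows_b, tol=0.0):
    """Return inclusive (start, end) index ranges where the two outputs differ.

    A row counts as differing when any coordinate differs by more than tol.
    """
    if len(rows_a) != len(rows_b):
        raise ValueError("outputs have different numbers of rows")
    ranges = []
    start = None
    for i, (a, b) in enumerate(zip(rows_a, rows_b)):
        differs = any(abs(x - y) > tol for x, y in zip(a, b))
        if differs and start is None:
            start = i                      # a new run of differing rows begins
        elif not differs and start is not None:
            ranges.append((start, i - 1))  # the run ended on the previous row
            start = None
    if start is not None:                  # close a run that reaches the last row
        ranges.append((start, len(rows_a) - 1))
    return ranges


# Toy data standing in for the Windows and Linux outputs (not real EAF values).
windows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [1.5, 2.5, 3.5]]
linux = [[1.0, 2.0, 3.0], [4.0, 5.5, 6.0], [7.0, 8.0, 9.5], [1.5, 2.5, 3.5]]
print(differing_row_ranges(windows, linux))  # [(1, 2)]
```

Passing a small positive `tol` would ignore differences that are only rounding noise.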

MLopez-Ibanez commented 1 year ago

Hi Fergus,

Could you tell me what code you have executed to get these points? It is strange that there is a difference. It could be due to rounding and equality tests giving different results, but I still find it surprising.
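As a general illustration of how rounding can change the result of an equality or dominance test (not a diagnosis of this issue), here is a small sketch. The helper `weakly_dominates` is hypothetical and assumes minimisation.

```python
import math


def weakly_dominates(p, q):
    """True if p is no worse than q in every objective (minimisation, exact comparison)."""
    return all(x <= y for x, y in zip(p, q))


# Two points that are "the same" on paper but differ in the last bit of one coordinate.
p = (0.1 + 0.2, 1.0)
q = (0.3, 1.0)

print(p[0] == q[0])              # False: 0.1 + 0.2 is 0.30000000000000004
print(weakly_dominates(p, q))    # False
print(weakly_dominates(q, p))    # True
print(math.isclose(p[0], q[0]))  # True: equal within the default relative tolerance
```

If two platforms round an intermediate value differently, an exact test like this can go either way, which is one way identical inputs could give slightly different outputs.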

FergusRooney commented 1 year ago

Hi Manuel,

I have added a new branch for this issue. Here is code that can be used to create the 3d eaf data for different OSes, and here is the code for comparing the 3d eaf data for different OSes.

The uniform dataset also had the same problem

I tested it using the current main commit build, f861d315a6dbd71e161d06087ed629211ba815e2

I remembered that I made a slight change to the macros in eaf.h a while ago, but I don't think that could be causing it.

MLopez-Ibanez commented 1 year ago

There is something wrong in your code but I'm not sure what it is yet.

The output cannot contain those 0.0 values. One easy test is to calculate the first level of the eaf. The first level of the eaf should produce the same output as doing:

# Assumption: the eafpy package is imported under the alias "eaf".
import eafpy as eaf
dataset = eaf.read_datasets("spherical-250-10-3d.txt")
eaf.filter_dominated(dataset[:,:3])
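The first level of the EAF is the set of points attained by at least one run, which is why it should match the non-dominated points of all runs pooled together. A minimal pure-Python sketch of that pooling and filtering follows; `dominates` and `nondominated` are hypothetical helpers (not the library's implementation), and the sketch assumes minimisation and drops duplicate points.

```python
def dominates(p, q):
    """True if p is no worse than q everywhere and strictly better somewhere (minimisation)."""
    return all(x <= y for x, y in zip(p, q)) and any(x < y for x, y in zip(p, q))


def nondominated(points):
    """Return the points that no other point dominates, keeping first-seen order."""
    unique = list(dict.fromkeys(map(tuple, points)))  # drop duplicates, keep order
    return [p for p in unique if not any(dominates(q, p) for q in unique)]


# Toy data: two runs with two 3-objective points each (not the spherical dataset).
runs = {
    1: [(1.0, 4.0, 2.0), (3.0, 1.0, 5.0)],
    2: [(2.0, 5.0, 3.0), (0.5, 6.0, 1.0)],
}
union = [p for pts in runs.values() for p in pts]
print(nondominated(union))
# [(1.0, 4.0, 2.0), (3.0, 1.0, 5.0), (0.5, 6.0, 1.0)]
```

Here (2.0, 5.0, 3.0) is removed because (1.0, 4.0, 2.0) is better in every objective. The quadratic loop is only meant for small checks.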

FergusRooney commented 1 year ago

Hi Manuel, my apologies for the confusion. Are you referring to the file "spherical_windows_minus_linux.txt"? The 0.0 values are there because this file contains the difference between the two datasets produced on different OSes, and there is no difference for most values. I have updated the structure to be more explicit.

I calculated the spherical dataset EAF using the R package. This was my method:

library(eaf)
sph_dataset <- read_datasets(file="spherical-250-10-3d.txt")
sph_eaf <- eafs(sph_dataset[,1:3], sph_dataset[,4])
write.table(sph_eaf, "r_sphere_eaf.txt", row.names=FALSE, col.names=FALSE)

and I got this result. The values are completely different from the Python-calculated version, and the number of rows differs (73285 for R vs 19418 for Python). I have tested the 2d dataset "input1.dat" and it produces the same result in R and Python.

> The first level of the eaf should produce the same output as doing:

Is the first level equal to the 10th percentile from the calculated EAF in this dataset? I calculated the non-dominated version of the dataset here but it doesn't seem to be equal to the 10th percentile for any of the results.

MLopez-Ibanez commented 1 year ago

> I calculated the spherical dataset EAF using the R package. This was my method:
>
> library(eaf)
> sph_dataset <- read_datasets(file="spherical-250-10-3d.txt")
> sph_eaf <- eafs(sph_dataset[,1:3], sph_dataset[,4])
> write.table(sph_eaf, "r_sphere_eaf.txt", row.names=FALSE, col.names=FALSE)
>
> and I got this result. The values are completely different from the Python-calculated version, and the number of rows differs (73285 for R vs 19418 for Python). I have tested the 2d dataset "input1.dat" and it produces the same result in R and Python.

Then there is something wrong in the Python version.

> > The first level of the eaf should produce the same output as doing:
>
> Is the first level equal to the 10th percentile from the calculated EAF in this dataset? I calculated the non-dominated version of the dataset here but it doesn't seem to be equal to the 10th percentile for any of the results.

library(eaf)
sph_dataset <- read_datasets(file="~/work/FergusRooney/eafpy/tests/test_data/spherical-250-10-3d.txt")
sph_eaf <- eafs(sph_dataset[,1:3], sph_dataset[,4])
# First level of the EAF: rows whose percentile (column 4) is at most 10.
sph_eaf0 <- sph_eaf[sph_eaf[,4] <= 10,][,1:3]
# Non-dominated points over all runs.
best <- filter_dominated(sph_dataset[,1:3])
# Sort both matrices the same way so that the rows line up.
sph_eaf0 <- sph_eaf0[order(sph_eaf0[,1],sph_eaf0[,2]),]
best <- best[order(best[,1],best[,2]),]
sum((sph_eaf0 - best)^2)

gives 0 because they are identical matrices.
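The same check can be sketched in plain Python. The helpers `first_level` and `squared_difference` are hypothetical, the rows are toy values rather than a real EAF, and the sketch assumes the last column of each EAF row holds the percentile, so that the first level of `n_runs` runs is the set of rows at `100 / n_runs` or below (10 for the 10 runs used above).

```python
def first_level(eaf_rows, n_runs):
    """Rows attained by at least one run, with the percentile column removed."""
    threshold = 100.0 / n_runs
    return [tuple(row[:-1]) for row in eaf_rows if row[-1] <= threshold]


def squared_difference(rows_a, rows_b):
    """Sum of squared coordinate differences after sorting both sets of rows."""
    if len(rows_a) != len(rows_b):
        raise ValueError("row counts differ: %d vs %d" % (len(rows_a), len(rows_b)))
    a = sorted(map(tuple, rows_a))  # lexicographic sort makes row order irrelevant
    b = sorted(map(tuple, rows_b))
    return sum((x - y) ** 2 for p, q in zip(a, b) for x, y in zip(p, q))


# Toy rows (x, y, z, percentile) for 2 runs, so the first level is percentile <= 50.
eaf_rows = [
    (3.0, 1.0, 5.0, 50.0),
    (1.0, 4.0, 2.0, 50.0),
    (2.0, 5.0, 3.0, 100.0),
]
best = [(1.0, 4.0, 2.0), (3.0, 1.0, 5.0)]  # non-dominated points, in another order

print(squared_difference(first_level(eaf_rows, 2), best))  # 0.0
```

A result of 0 means the two sets contain the same points; a row-count mismatch raises an error instead of being compared silently.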

FergusRooney commented 1 year ago

> Then there is something wrong in the Python version.

I have looked into it and found that the get_eaf C function is producing incorrect output: it changes the order of the points and produces fewer EAF points than it should. My debugging so far has traced part of the issue to somewhere before or during the call to the attsurf function. The expected outputs in the test set need to be regenerated using the R library, because the tests that rely on them are false positives at the moment. I am away for 1 week from today but will keep working on it when I get back.
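Since the symptoms are a changed row order and missing points, a test that compares against reference output could report missing and extra rows regardless of order. The following is only a sketch under that assumption: `compare_point_sets` is a hypothetical helper, the rows are toy values, and matching is exact (no tolerance).

```python
from collections import Counter


def compare_point_sets(actual, expected):
    """Return (missing, extra) rows, ignoring order but respecting duplicates."""
    a = Counter(map(tuple, actual))
    e = Counter(map(tuple, expected))
    missing = sorted((e - a).elements())  # in the reference but not in the output
    extra = sorted((a - e).elements())    # in the output but not in the reference
    return missing, extra


# Toy rows (x, y, z, percentile): the output is reordered and lacks one point.
expected = [(1.0, 2.0, 3.0, 10.0), (2.0, 1.0, 3.0, 10.0), (2.0, 2.0, 2.0, 20.0)]
actual = [(2.0, 1.0, 3.0, 10.0), (1.0, 2.0, 3.0, 10.0)]

missing, extra = compare_point_sets(actual, expected)
print(missing)  # [(2.0, 2.0, 2.0, 20.0)]
print(extra)    # []
```

Reordering alone yields two empty lists, so the check separates an ordering change from points that are actually lost.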

FergusRooney commented 1 year ago

This issue seems to be fixed now, after resolving the problem in the get_eaf C code.

auto-optimization / eafpy

3d EAF value produces slightly different output for Windows and Linux version #15