swharden / JLJP

Java application for calculating liquid junction potential (LJP)
https://swharden.com/software/LJPcalc
MIT License

calculated LJP is different every time it is calculated #6

Closed swharden closed 3 years ago

swharden commented 4 years ago

I calculated the LJP 10000 times using the values from the screenshot and was surprised to observe several microvolts of variance. I'm not sure yet where this comes from. The Python script and raw data are in /dev/variance

It should probably be documented somewhere that repeatedly running the same calculation yields non-identical results. This is also critically important to know when writing tests.
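
For tests, one option is to compare against an expected value within a tolerance instead of checking for exact equality. The sketch below is illustrative only: `ToleranceCheck`, `closeTo`, the expected value, and the 0.1 mV tolerance are all assumptions, not part of JLJP.

```java
final class ToleranceCheck {
    // Returns true when actual lies within +/- tolerance of expected.
    static boolean closeTo(double expected, double actual, double tolerance) {
        return Math.abs(actual - expected) <= tolerance;
    }

    public static void main(String[] args) {
        double expectedMv = -5.0;   // illustrative reference value, in mV
        double toleranceMv = 0.1;   // assumed acceptable spread, in mV
        double calculatedMv = -5.03; // stand-in for one run of the calculation

        if (!closeTo(expectedMv, calculatedMv, toleranceMv)) {
            throw new AssertionError("LJP outside tolerance: " + calculatedMv);
        }
        System.out.println("within tolerance");
    }
}
```

The tolerance would need to be chosen from the spread actually observed across repeated runs.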

image

swharden commented 4 years ago

This no longer seems to be an issue. It must have been related to equation parsing.

Best place to see this is in the latest annotated branch: https://github.com/swharden/JLJP/branches

swharden commented 4 years ago

Actually this is still an issue. Where is this variance coming from? What is a typical expected variance?

IonSet isss = new IonSet();
isss.add(new Ion("Na", 50, 0));
isss.add(new Ion("Cs", 0, 50));
isss.add(new Ion("Cl", 50, 50));
double LJP_mV = isss.calculate(null) * 1000;
-5.028959 mV
-4.993543 mV
-5.023472 mV
-5.003093 mV
-4.997218 mV
-4.981479 mV
-5.032667 mV
-5.000302 mV
-5.060976 mV
-5.026719 mV
-4.997025 mV
-4.975054 mV
-4.992667 mV
-4.990662 mV
-4.971077 mV
-5.014630 mV
-4.992542 mV
-5.009561 mV
-5.005144 mV
-4.997844 mV
-5.008323 mV
-4.991713 mV
-4.990557 mV
-4.937619 mV
-4.997312 mV
-5.009911 mV
-4.995466 mV
-4.991056 mV
-4.991040 mV
-5.003396 mV
-5.020330 mV
-4.956016 mV
-4.946139 mV
-5.026625 mV
-4.994901 mV
-5.005399 mV
-5.011265 mV
-5.025567 mV
-4.995444 mV
-5.009265 mV
-4.990847 mV
-4.995296 mV
-4.990424 mV
-5.059080 mV
-5.053474 mV
-4.958179 mV
-5.005813 mV
-4.992388 mV
-5.042123 mV
-5.039945 mV
-4.994957 mV
-4.991009 mV
-5.008346 mV
-4.995729 mV
-5.043840 mV
-4.963189 mV
-5.006962 mV
-5.039178 mV
-5.017840 mV
-4.996537 mV
-4.993699 mV
-5.001916 mV
-4.992125 mV
-4.994199 mV
-5.003445 mV
-4.992407 mV
-5.008180 mV
-5.012777 mV
-4.992021 mV
-4.994069 mV
-5.066512 mV
-5.061516 mV
-5.009421 mV
-4.991745 mV
-5.007157 mV
-4.990617 mV
-5.001655 mV
-4.990923 mV
-5.020150 mV
-4.992560 mV
-4.972192 mV
-4.996225 mV
-4.993915 mV
-5.024982 mV
-4.999678 mV
-5.031369 mV
-4.990769 mV
-4.992289 mV
-5.001920 mV
-4.995057 mV
-5.005641 mV
-4.993059 mV
-5.005197 mV
-4.936821 mV
-5.008438 mV
-5.024742 mV
-4.941860 mV
-5.013553 mV
-5.015968 mV
-5.039013 mV

swharden commented 4 years ago

After looking into it further, I found that the variance arises because random number generation is used in Solver.suggest():

https://github.com/swharden/JLJP/blob/09d3e73d799bcd7540f3157a872d1d891b7716ca/src/Solver.java#L92

https://github.com/swharden/JLJP/blob/09d3e73d799bcd7540f3157a872d1d891b7716ca/src/Solver.java#L96

https://github.com/swharden/JLJP/blob/09d3e73d799bcd7540f3157a872d1d891b7716ca/src/Solver.java#L121

The solution for totally repeatable LJPs is to use a fixed seed for Java's random number generator. This was done in LJPcalc, but it probably doesn't matter either way because its effect on the calculated LJP is so small. The histogram is interesting to ponder though.
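
As a minimal sketch of the fixed-seed idea: two `java.util.Random` instances created with the same seed produce the same sequence, so any calculation driven only by that sequence repeats exactly. `SeededSolverDemo` and `noisyEstimate` are hypothetical stand-ins and do not reproduce JLJP's `Solver`.

```java
import java.util.Random;

final class SeededSolverDemo {
    // Hypothetical stand-in for a solver that perturbs an estimate randomly.
    // It only illustrates how the seed controls repeatability.
    static double noisyEstimate(Random rng) {
        double estimate = -5.0; // illustrative starting value, in mV
        for (int i = 0; i < 100; i++) {
            estimate += (rng.nextDouble() - 0.5) * 1e-3;
        }
        return estimate;
    }

    public static void main(String[] args) {
        // Same seed: both runs draw the same random sequence.
        double first = noisyEstimate(new Random(0L));
        double second = noisyEstimate(new Random(0L));
        System.out.println("fixed seed repeatable: " + (first == second));

        // No seed argument: Random picks its own seed, so runs generally differ.
        double unseeded = noisyEstimate(new Random());
        System.out.println("unseeded result: " + unseeded);
    }
}
```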

Now that the source of the variability is known it's not really an issue anymore.

dbrogioli commented 4 years ago

@swharden: you correctly identified the source of variation. I deliberately used a solver partially based on random calculations: it is easier to write, and it is more likely to give at least a rough approximation of the result.

Moreover, I deliberately decided to use a random seed: the distribution of the results gives an idea of the precision of the calculation. You can notice, particularly in some pathological cases, that the error depends on the choice of the "Last" and "x" ions. This is the reason why the user can select them.

So, we should explain this point in detail in the manual. If we want to improve the program, I do not suggest using a more deterministic solver. Rather, if you really want to do something better, I would automatically repeat the calculation (with different seeds of course) and give the average and the RMS representing the evaluated error of the calculation. Let us think about this!
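
The repeat-and-summarize proposal could be sketched roughly as follows. This assumes "RMS" means the root-mean-square deviation from the mean, which is one reading of the suggestion. `calculateOnce` is a hypothetical stand-in for a seeded LJP calculation, and its noise level is invented for illustration.

```java
import java.util.Random;

final class RepeatedRunStats {
    // Hypothetical stand-in for one seeded LJP calculation (result in mV).
    static double calculateOnce(long seed) {
        Random rng = new Random(seed);
        return -5.0 + rng.nextGaussian() * 0.02; // invented noise level
    }

    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Root-mean-square deviation from the mean (population form, divides by n).
    static double rmsDeviation(double[] values) {
        double m = mean(values);
        double sumSquares = 0.0;
        for (double v : values) {
            sumSquares += (v - m) * (v - m);
        }
        return Math.sqrt(sumSquares / values.length);
    }

    public static void main(String[] args) {
        int runs = 20; // assumed repeat count
        double[] results = new double[runs];
        for (int i = 0; i < runs; i++) {
            results[i] = calculateOnce(i); // a different seed for every run
        }
        System.out.printf("LJP = %.6f mV, RMS deviation = %.6f mV (n = %d)%n",
                mean(results), rmsDeviation(results), runs);
    }
}
```

Using the loop index as the seed keeps the whole batch repeatable while still sampling different random sequences.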

swharden commented 4 years ago

> the distribution of the results gives an idea of the precision

I question the relevance of this level of precision... small variations in temperature/mobility will vary LJP meaningfully, whereas the standard deviation here due to the random number generator is less than a microvolt.

> Rather, if you really want to do something better, I would automatically repeat the calculation (with different seeds of course) and give the average and the RMS representing the evaluated error of the calculation.

RMS may not be ideal since the results don't seem to be normally distributed, but I see your point here. However, the LJP calculation takes a non-trivial amount of time on my machine, so running hundreds of calculations would make the application feel pretty laggy.
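
If the results are not normally distributed, a rank-based summary is one possible alternative to RMS. The sketch below computes the median and the median absolute deviation (MAD). This is a swapped-in technique, not something proposed in this thread, and `RobustSpread` is a hypothetical helper. The sample values are the first ten results listed earlier in this issue.

```java
import java.util.Arrays;

final class RobustSpread {
    // Median of the values; the input array is left unmodified.
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return (n % 2 == 1)
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    // Median of the absolute deviations from the median.
    static double medianAbsoluteDeviation(double[] values) {
        double m = median(values);
        double[] deviations = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            deviations[i] = Math.abs(values[i] - m);
        }
        return median(deviations);
    }

    public static void main(String[] args) {
        // First ten LJP results (mV) reported earlier in this issue.
        double[] ljpMv = {
            -5.028959, -4.993543, -5.023472, -5.003093, -4.997218,
            -4.981479, -5.032667, -5.000302, -5.060976, -5.026719
        };
        System.out.printf("median = %.6f mV, MAD = %.6f mV%n",
                median(ljpMv), medianAbsoluteDeviation(ljpMv));
    }
}
```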

Perhaps this topic doesn't merit a GUI change, but it could be further researched to make an excellent figure/discussion in the manuscript.