xia-lab / OptiLCMS

R package for optimized LC-MS spectra processing

Issues with GaussianSI (unstable) & CV (extremely stable) #3

Open tew42 opened 3 years ago

tew42 commented 3 years ago

Hi @Zhiqiang-PANG,

two more optimization algorithm issues/questions that I've come across...

1) The Gaussian peak percentage analysis appears quite unstable. When I look at the optimization output for identical parameters, I get tables like the following. As you can see, all metrics are identical across runs (as expected), EXCEPT GaussianSI, which ranges from about 0.522 to 0.549.

exp num_peaks notLLOQP num_C13              PPS                  CV              RCS     GS        GaussianSI
1        3333     1455     472 7.04774211937663 0.00402016726116383 186.364053915599 939.25 0.548484848484848
2        3333     1455     472 7.04774211937663 0.00402016726116383 186.364053915599 939.25 0.536363636363636
3        3333     1455     472 7.04774211937663 0.00402016726116383 186.364053915599 939.25 0.527878787878788
4        3333     1455     472 7.04774211937663 0.00402016726116383 186.364053915599 939.25 0.545454545454545
5        3333     1455     472 7.04774211937663 0.00402016726116383 186.364053915599 939.25 0.541212121212121
6        3333     1455     472 7.04774211937663 0.00402016726116383 186.364053915599 939.25 0.549090909090909
7        3333     1455     472 7.04774211937663 0.00402016726116383 186.364053915599 939.25 0.527878787878788
8        3333     1455     472 7.04774211937663 0.00402016726116383 186.364053915599 939.25 0.521818181818182
9        3333     1455     472 7.04774211937663 0.00402016726116383 186.364053915599 939.25 0.532121212121212

2) On the other hand, over a rather large parameter space (and for two different analyses), the largest variation in CV I have seen is less than 0.00025%. Is the interval supposed to be this narrow? It doesn't seem very meaningful to normalize such a small range and give it an impact roughly similar to that of RCS or GS (where variation is easily > 100%).
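To illustrate the concern, here is a minimal Python sketch. It assumes a plain min-max normalization, which may not be what OptiLCMS actually applies, and the function name `min_max` and all numbers are made up for illustration. It shows that a metric varying by less than 0.00025% and one varying by more than 100% both end up spanning the full 0 to 1 range after rescaling.

```python
def min_max(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant metric carries no ranking information.
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical metric values for three parameter settings (not real output).
cv_values = [0.00402016, 0.00402017, 0.004020165]
rcs_values = [90.0, 186.0, 240.0]

# Relative spread of each metric before normalization.
cv_spread = (max(cv_values) - min(cv_values)) / min(cv_values)
rcs_spread = (max(rcs_values) - min(rcs_values)) / min(rcs_values)

cv_scaled = min_max(cv_values)
rcs_scaled = min_max(rcs_values)

print(f"CV  relative spread: {cv_spread:.7%}")
print(f"RCS relative spread: {rcs_spread:.1%}")
print("CV  scaled range:", min(cv_scaled), "to", max(cv_scaled))
print("RCS scaled range:", min(rcs_scaled), "to", max(rcs_scaled))
```

Under that assumption, any down-weighting of CV has to happen after normalization, because the rescaling itself removes the information about how small the original range was.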

Zhiqiang-PANG commented 3 years ago

Thanks for your comment. The CV really depends on the specific data. That's why CV is given less weight in QcoE.

tew42 commented 3 years ago

Any idea, though, why GaussianSI would be so unstable when running multiple times with the same data and parameters?

Zhiqiang-PANG commented 3 years ago

Yes, this is caused by the random check of the peaks' Gaussian fitting (it does not check all peaks). This is designed to accelerate the optimization process without causing significant variation. I have found an approach that checks all peaks and is even faster than the current strategy. It is under testing and will be published once it is stable, in the coming months.
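The effect of checking only a random subset can be sketched in Python as follows. This is not the OptiLCMS implementation: the function `gaussian_fraction`, the subset size of 500, and the count of 1800 Gaussian-like peaks are all hypothetical, and only the total of 3333 peaks is taken from the table in this thread. The sketch shows that different random subsets give slightly different fractions, while reusing a seed or checking every peak gives a repeatable value.

```python
import random


def gaussian_fraction(flags, sample_size, rng):
    """Estimate the fraction of Gaussian-like peaks from a random subset."""
    sample = rng.sample(flags, sample_size)
    return sum(sample) / sample_size


# Hypothetical data: 3333 peaks, 1800 of which pass a Gaussian-fit check.
flags = [True] * 1800 + [False] * 1533

# Nine runs, each drawing a different random subset of 500 peaks.
estimates = [
    gaussian_fraction(flags, 500, random.Random(seed)) for seed in range(9)
]

# Reusing the same seed draws the same subset, so the estimate repeats.
first = gaussian_fraction(flags, 500, random.Random(42))
second = gaussian_fraction(flags, 500, random.Random(42))

# Checking every peak removes the sampling noise entirely.
full = sum(flags) / len(flags)

print("subset estimates:", [round(e, 3) for e in estimates])
print("same seed repeats:", first == second)
print("all peaks:", round(full, 3))
```

If the sampling in the package works along these lines, fixing the random seed before each run would be one way to get repeatable GaussianSI values until the all-peaks check is available.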