Open smk78 opened 4 years ago
A user has sent me a project that immediately crashes SasView 5.0.2 on my PC with an Nvidia Quadro. I have extracted a smaller part of it here (rename .txt to .json). I am having big issues trying to switch OpenCL on and off; could some others try this project, please?
I suspect that there is a numerical issue in the sticky hard sphere model coupled to polydispersity, as other tests I did with the same models and my own data worked with OpenCL.
The user's data set starts at I(Q~0.01) ~ 3e19; this may have something to do with the problem, as fitting is struggling.
Loaded and displayed fine on my laptop. With LM and No OpenCL selected it runs a fit cycle, though it returns a rubbish radius (<1). So I reset the radius to ~75 and only fitted the scale. That generated a reasonable fit. Then I turned the radius back on and clicked Fit, and it went straight back to the rubbish value (and lost the scale too). Reset the radius again and left it unchecked. Used GPU Options to change to my GPU (Intel UHD 630). Ran a fit cycle. Same behaviour but definitely no crashes. So the fitting problem might be ill-conditioned, but SasView was working.
User has confirmed that rescaling the data so it is numerically much smaller gets fits running with OpenCL on her machine.
I don't think that what Steve is seeing is "ill conditioned"; I think making the data smaller in value will sort out the issues. But perhaps we ought to be able to spot this?
So do we think this is due to using GPUs with a large scale factor? Would this be a numerical issue because of the large number? In principle, I would think the math should be able to handle any scale factor.
Scaling is not happening on the GPU, so this issue does not make any sense. I'm not able to reproduce it on a Mac with either Intel or AMD GPUs.
I've been doing some playing around (so far only in 5.0.2). Attached are two datasets. In one I have multiplied all the intensity values by 1e17 (to make them comparable to Martina's values of ~1e19). I then tried fitting the peak. My starting scale was 1 (the model default). Not until I manually entered a scale of 10000 (i.e., within about 1e15 of the actual data) did SasView decide to get out of bed! This was with LM. As I didn't have my GPU turned on, this is something more fundamental. scale_test.txt scale_test_x1e17.txt
Just tried 4.2.2. Same story.
You beat me to this; I had already seen the same with the values for scale and flat background, using just a simple fit to sphere (Win10). Though "scale" may not be involved on the GPU, the L-M fit routine won't know that "scale" is special in our case, so I think that there are either rounding errors, numbers nearly out of range, or numbers "too close to zero" in the fitting algorithm.
Why a particular GPU causes more issues than others is another matter!
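The rounding-error idea can be checked in isolation. The sketch below is not SasView or fitting-engine code; it assumes a data intensity of 3e19 (the order quoted earlier in this thread) and a hypothetical model intensity of 1 at scale = 1, and asks whether float64 can even register a change in scale in the residual.

```python
import numpy as np

data = np.float64(3e19)   # intensity of the order reported in this thread
model = np.float64(1.0)   # hypothetical model intensity at scale = 1 (assumption)

# Spacing between adjacent float64 values near the data intensity
ulp = np.spacing(data)

def residual(scale):
    """Plain data-minus-model residual; no weighting, no background."""
    return data - scale * model

# Doubling a scale of order 1 changes the model by far less than one ulp
unchanged = residual(1.0) == residual(2.0)
# Doubling a scale of order 1e4 changes the model by more than one ulp
changed = residual(1e4) != residual(2e4)

print(ulp, unchanged, changed)
```

Under these assumptions the residual is bit-for-bit identical for scale = 1 and scale = 2, so a numerically computed derivative with respect to scale would be zero there, while for scale near 1e4 the change becomes visible. That is at least consistent with the fit only moving once scale was set to 10000, though it does not show that this is what happens inside the fit routine.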
The derivative is being computed numerically. It should be pretty well behaved if you are only fitting scale, but if you are fitting anything else besides, then it will likely wander and get a little bit lost. A small change in radius can have just as big an effect as a small change in scale when you are far from the minimum, so the space is pretty flat. You may even end up with singular matrices, depending on the numerical details.
I notice this with the latex_smeared example, with its radius of 2250ish. When I start with a radius of 50 it goes nowhere with L-M. DREAM is pretty reliable, unless I start with L-M first, in which case it takes a lot longer to escape from the local minimum.
Even if you are only fitting scale there may be problems due to various tolerances built into the algorithm. If the suggested step is small, or if the change in χ² is small after the step, then the algorithm will terminate.
It would be easy enough to "guess" a scale value by resetting it to scale * I(x[i]) / y[i] before the fit, either using i at the midpoint or doing some sort of geometric average over i, but I would rather not build too much knowledge of the model space into the GUI or the fitting engine.
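For illustration, the geometric-average variant of that guess might look like the sketch below. It is not existing SasView code: `guess_scale` and its arguments are hypothetical names, the model intensity is assumed to have been evaluated at the current scale with zero background, and the ratio is taken as data over model so that the scale grows when the data sit above the model (how that maps onto I(x[i]) and y[i] in the comment above is an assumption).

```python
import numpy as np

def guess_scale(scale, data_intensity, model_intensity):
    """Return a starting scale that puts the model on the same level as the data.

    model_intensity is assumed to be the model evaluated at the current
    `scale` with zero background (an assumption of this sketch).
    """
    data_intensity = np.asarray(data_intensity, dtype=float)
    model_intensity = np.asarray(model_intensity, dtype=float)
    # Only points where both values are positive have a usable log ratio
    ok = (data_intensity > 0) & (model_intensity > 0)
    if not np.any(ok):
        return scale  # nothing to go on, so leave the scale alone
    # Mean of log ratios = log of the geometric mean of data/model
    log_ratio = np.log(data_intensity[ok]) - np.log(model_intensity[ok])
    return scale * np.exp(np.mean(log_ratio))

# Hypothetical example: data are the model multiplied by 1e17
model = np.array([1.0, 2.0, 4.0])
data = 1e17 * model
new_scale = guess_scale(1.0, data, model)
print(new_scale)
```

Working in logs keeps the average from being dominated by the highest-intensity points, which is the reason for preferring a geometric mean over an arithmetic one here.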
User MartinaO, using Windows 10 on a machine with an Nvidia Quadro K1200, reported that on migrating from 4.2.2 to 5.0.2, fits of sphere model + stickyhardsphere with Schulz polydispersity for the radius were regularly crashing, especially when using DREAM.
4.2.2 had performed reliably.
On looking at the SasView log file it was apparent that SasView was using the GPU, so I suggested she select No OpenCL in GPU Options. This apparently improved stability, but did not wholly cure the crashing.
@RichardHeenan then noted that his troublesome system featured two Quadros, and that GPU Options in the GUI was not yet a reliable way to turn off the GPU. The user then verified the GPU was indeed off and said that stability had returned.
A Google search shows that the forums do contain a lot of traffic regarding OpenCL and Nvidia GPUs, including Quadros.
Q: Do 4.2.2 and 5.0.2 differ in their default approach to using a GPU if present?