openworm / sibernetic

This is a C++/OpenCL implementation of the PCISPH algorithm supplemented with a set of biomechanics related features applied to C. elegans locomotion

From time to time appearing strange behaviour #60

Closed skhayrulin closed 9 years ago

skhayrulin commented 9 years ago

Sometimes Sibernetic shows strange behaviour at startup: explosions, fluctuating springs, etc. It looks like a race condition somewhere.

ignotur commented 9 years ago

What do the blue dots that are still inside the box in the wrong simulation represent?

skhayrulin commented 9 years ago

Blue dots are water particles, black dots are elastic particles, red lines are elastic connections, and blue triangles are membranes.

skhayrulin commented 9 years ago

@ignotur the branch for this bug is here.

ignotur commented 9 years ago

I have started the program more than 30 times and did not manage to catch this bug. Every time, the cube in the centre formed a sphere and then started to fall.

skhayrulin commented 9 years ago

What is your OpenCL device, and which version of the OpenCL driver do you use? I have seen this problem on more than just my machine: as I remember, @pgleeson and @a-palyanov had the same problem on theirs.

ignotur commented 9 years ago

CL_PLATFORM_VERSION [0]: OpenCL 1.2 (Dec 14 2014 22:29:47)
CL_CONTEXT_PLATFORM [0]: CL_DEVICE_NAME [0]: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz

pgleeson commented 9 years ago

My info:
CL_PLATFORM_VERSION [0]: OpenCL 1.2 AMD-APP (1214.3)
CL_CONTEXT_PLATFORM [0]: CL_DEVICE_NAME [0]: Intel(R) Core(TM) i7 CPU M 620 @ 2.67GHz

I sometimes get:

[Screenshots: selection_633, selection_634]

ignotur commented 9 years ago

Could you please be more specific? Does "sometimes" mean one run out of 10, or less often?

pgleeson commented 9 years ago

It worked fine the first time I ran Sibernetic today, but now it happens every time I run it. Generally a restart resolves the problem. I haven't looked into this in a while...

skhayrulin commented 9 years ago

On my machine it happens from time to time; I can't predict when, or estimate how often. After some research, I think the problem is probably in the elastic force calculation kernel here, especially in this line. Somehow, in the first iteration, when the kernel calculates the contribution of the elastic force to the acceleration, very large numbers appear, the acceleration becomes too large, and this finally leads to an explosion. So we need to find where and why these numbers appear. I'll try to publish more about my research tomorrow.

ignotur commented 9 years ago

OK, at first glance it seems unavoidable to me that the condition if(r_ij!=0.f) will sometimes not be enough. It is probably better to write if (abs(r_ij) > some_small_number). What do you think?
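The difference between the exact-zero test and a tolerance test can be sketched in plain host-side C++. This is only an illustration: the function names and the threshold value are hypothetical and are not taken from the Sibernetic sources, and a sensible threshold would depend on the length scale of the simulation. Note that for floating-point values the absolute value is taken with `std::fabs` in C++ (and `fabs` in OpenCL C).

```cpp
#include <cassert>
#include <cmath>

// Exact comparison, equivalent to the existing check r_ij != 0.f.
bool is_nonzero(float r) {
    return r != 0.f;
}

// Tolerance comparison: accept r as a divisor only if it is clearly away from zero.
// eps is an assumed threshold and must be positive.
bool is_safe_divisor(float r, float eps) {
    assert(eps > 0.f);
    return std::fabs(r) > eps;
}

// Hypothetical helper: divide only when the divisor passes the tolerance test,
// otherwise return 0 so that no huge value is produced.
float guarded_divide(float numerator, float r, float eps) {
    return is_safe_divisor(r, eps) ? numerator / r : 0.f;
}
```

With a distance such as `1e-30f`, `is_nonzero` returns true and an unguarded `1.0f / 1e-30f` yields a value on the order of `1e30`, whereas `guarded_divide(1.0f, 1e-30f, 1e-6f)` returns 0.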

skhayrulin commented 9 years ago

I don't think that explains why the explosion happens; I didn't see any very small values of r_ij when I debugged the code. Also, you can see that r_ij is needed only to normalize the vector vect_r_ij. Sorry, maybe it's not obvious: we use a specific OpenCL vector type (float4) here, so vect_r_ij/r_ij means dividing each component of the vector by the scalar r_ij. If you want, I can help you with an introduction to OpenCL.
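For readers who have not used OpenCL vector types, the component-wise division described above can be mimicked in plain C++. This is a rough analogue only: the `Float4` struct and the helper functions are hypothetical stand-ins for OpenCL's built-in `float4`, and computing r_ij as the length of the x, y, z components is an assumption rather than something confirmed by the kernel source.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for OpenCL's built-in float4 type.
struct Float4 {
    float x, y, z, w;
};

// Mirrors the OpenCL expression "vector / scalar":
// every component is divided by the same scalar.
Float4 operator/(const Float4& v, float s) {
    assert(s != 0.f);  // the caller is expected to have checked the divisor
    return {v.x / s, v.y / s, v.z / s, v.w / s};
}

// Euclidean length of the spatial part (x, y, z); w is ignored here by assumption.
float length3(const Float4& v) {
    return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}

// Normalization in the style of vect_r_ij / r_ij.
Float4 unit_direction(const Float4& vect_r_ij) {
    const float r_ij = length3(vect_r_ij);
    return vect_r_ij / r_ij;
}
```

For example, `unit_direction({0.f, 3.f, 4.f, 0.f})` gives a vector whose spatial part has length 1 and points along the original direction.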

skhayrulin commented 9 years ago

I ran two experiments. In the first, I took two runs, one correct (without an explosion) and one incorrect; the log info can be seen here. In the second, I commented out the line and again took two runs. I saw an explosion in the second experiment as well, from which I conclude that the problem is not in the elastic force calculation but in something else. The logs below compare the correct run with its incorrect counterpart: in the first picture the line is enabled, in the second it is commented out.

[Image: right_vs_wrong_acc_changing]

[Image: correct_vs_no_acc_nochanging]

So you can see that after the first iteration of the do-while loop everything is all right, but on the next step a strange value appears in acceleration[id]. The latest code is in the repo. I'm still working on it.
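One way to narrow down the step at which the strange value first appears is to scan the acceleration data on the host after each iteration. The sketch below assumes the buffer has already been read back into a flat array of floats; the function name and the magnitude limit are hypothetical and would need tuning for the actual simulation.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the index of the first element that is NaN, infinite, or larger in
// magnitude than `limit`, or -1 if every element looks plausible.
// `limit` is an assumed bound on a reasonable acceleration component.
long first_suspicious_index(const std::vector<float>& acceleration, float limit) {
    for (std::size_t i = 0; i < acceleration.size(); ++i) {
        const float a = acceleration[i];
        if (!std::isfinite(a) || std::fabs(a) > limit) {
            return static_cast<long>(i);
        }
    }
    return -1;
}
```

Calling this after every pass of the loop and logging the first iteration where the result is not -1 would show which kernel invocation precedes the bad value, without having to compare full logs by eye.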