Closed thinxer closed 5 years ago
@thinxer I can't reproduce the failure on Mac anymore. It's weird. I ran it about 50 times manually and every run passed. Have you updated from upstream?
Have you merged the pagerank test with your multithreaded code to test it?
@thinxer Yes. Weird.
@thinxer Did you try testing on the multithread branch too?
I just merged your thread_pool branch with upstream/master, and the pagerank_test wouldn't pass.
I reproduced this error today.
Weird.
Could you try my branch directly? Just check out my thread-pool branch and run the test.
Wei Chen ipondering.me
Well, you don't have pagerank_test on your branch...
Oops.
My master branch is up to date, non-threaded, and has pr_test.
I ran: repeat 100 ./pagerank_test. No errors.
It's correct because it's single-threaded.
Only with multi-threading can the problem be found, since the execution order of vertex programs is not stable.
Yes, I'll fix that.
I suspect the cause is the implementation of thread_pool's join. All applies are executed after the gathers and before the scatters, so even a change to one vertex's data should be isolated from the other vertices.
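To make the point about join concrete, here is a minimal Python sketch of the phase ordering described above. It is not saedb's C++ code: `run_phase`, `superstep`, `gather_vertex`, and `apply_vertex` are hypothetical names, and the "gather" is just a sum over in-neighbors. The only thing it illustrates is that every gather task must have finished before any apply task starts.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def run_phase(pool, task, vertices):
    """Submit one task per vertex, then block until all of them finish (the join)."""
    futures = [pool.submit(task, v) for v in vertices]
    wait(futures)
    for f in futures:
        f.result()  # re-raise any exception thrown inside a worker


def superstep(in_neighbors, data, workers=4):
    """One gather/apply round; in_neighbors[v] lists the vertices pointing at v."""
    n = len(data)
    acc = [0] * n  # one accumulator slot per vertex, written only by that vertex's task

    def gather_vertex(v):
        # Reads neighbor data only; never writes to `data`.
        acc[v] = sum(data[u] for u in in_neighbors[v])

    def apply_vertex(v):
        # Writes only this vertex's own slot.
        data[v] = acc[v]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        run_phase(pool, gather_vertex, range(n))  # join: all gathers are done here
        run_phase(pool, apply_vertex, range(n))   # applies start only afterwards
    return data
```

If `run_phase` returned before its tasks had finished, an apply could overwrite `data[u]` while another vertex's gather was still reading it, and the result would depend on thread scheduling. That is the kind of intermittent failure a broken join would produce.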
This branch is trying to fix it: https://github.com/pondering/saedb/tree/fix-syn-engine .
I've now found that exeGather may have a problem. But I suppose it's a problem with the OS's memory-mapped files.
It's working on Linux.
what do you mean by "working"?
I commented out all the parallel code except executeInits so I can inspect the suspicious parts one by one. I still can't find the bug in the executeInits function, yet when I run ./pagerank_test 5000 times, there is still 1 failure.
Do you have any idea about this issue?
You mean that even the single-threaded engine has a hard-to-reproduce bug? (1 in 5000)
I'm not sure.
A single-threaded program should not have this problem, and I haven't seen a failure there, at least so far.
But when I make just executeInits multithreaded, there are failures.
Results from the synchronous engine should be the same whether it runs single-threaded or multithreaded, since it reads node data only from the last iteration. However, this is not the case for the pagerank test. This indicates that the synchronous engine is not correct. Please investigate this problem.
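As a reference for what "reads node data only from the last iteration" implies, here is a minimal Python sketch of a double-buffered synchronous PageRank. It is not the saedb engine: `pagerank_sync`, the damping value, and the adjacency-list layout are assumptions chosen for illustration, and dangling vertices (no out-edges) are deliberately not handled.

```python
from concurrent.futures import ThreadPoolExecutor


def pagerank_sync(out_edges, iterations=20, damping=0.85, workers=1):
    """Synchronous PageRank; out_edges[u] lists the targets of u's out-edges.

    Assumes every vertex has at least one out-edge (no dangling vertices).
    """
    n = len(out_edges)
    in_edges = [[] for _ in range(n)]
    for u, targets in enumerate(out_edges):
        for v in targets:
            in_edges[v].append(u)

    prev = [1.0 / n] * n  # ranks from the last completed iteration (read-only)
    for _ in range(iterations):

        def update(v, prev=prev):
            # Reads only `prev`, so the result cannot depend on which other
            # vertices have already been updated in this iteration.
            s = sum(prev[u] / len(out_edges[u]) for u in in_edges[v])
            return (1.0 - damping) / n + damping * s

        with ThreadPoolExecutor(max_workers=workers) as pool:
            nxt = list(pool.map(update, range(n)))  # results come back in vertex order
        prev = nxt  # swap buffers only after every vertex has finished
    return prev
```

Because each update reads only the previous buffer and sums its in-neighbors in a fixed order, `pagerank_sync(graph, workers=1)` and `pagerank_sync(graph, workers=8)` should return identical lists. A pagerank test that differs between the two modes suggests some task is reading or writing current-iteration data, or that a phase is not fully joined before the next one starts.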