maidsafe-archive / MaidSafe

This is the super-project in which each MaidSafe library resides. Some information is common to all libraries, and is detailed here. Library-specific information can be found in each library's wiki.

Timeout on storing too many keys (400) on a small local network (size 12). #91

Closed chandraprakash closed 11 years ago

chandraprakash commented 11 years ago

1. Set up 12 Vaults (manually or using vaults.py).
2. Run pd_keys_helper -c -n 400
3. Run pd_keys_helper -ls --peer system_ip:5483

Several RPC timeouts will be seen in pd_keys_helper's info-level log for PD.

This looks like a PD issue. Routing is delivering each message to the destination nodes in less than 1 second, but PD is not calling the reply functor with the response, so the RPC times out after 10 seconds.
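To make the failure mode concrete, here is a minimal sketch of it, not the actual PD or routing code. It assumes a caller that waits on a reply functor with a deadline. The names `send_rpc`, `healthy_handler` and `stuck_handler` are hypothetical, and the timeout is shortened from 10 seconds to 0.5 seconds so the example runs quickly.

```python
import threading


def send_rpc(handler, request, timeout_s):
    """Deliver `request` to `handler` and wait up to `timeout_s` for its reply.

    `handler` stands in for the receiving component; it is given a reply
    functor that it is expected to call exactly once with the response.
    """
    done = threading.Event()
    result = {}

    def reply_functor(response):
        result["response"] = response
        done.set()

    # Delivery runs on another thread, standing in for the routing layer.
    threading.Thread(
        target=handler, args=(request, reply_functor), daemon=True
    ).start()

    # If the functor is never invoked, the wait expires and the RPC times out.
    if not done.wait(timeout_s):
        return ("timeout", None)
    return ("ok", result["response"])


def healthy_handler(request, reply):
    # The message arrives and the reply functor is called with a response.
    reply("stored:" + request)


def stuck_handler(request, reply):
    # The message arrives, but the reply functor is never called.
    pass


healthy = send_rpc(healthy_handler, "key-1", timeout_s=0.5)
stuck = send_rpc(stuck_handler, "key-2", timeout_s=0.5)
print(healthy, stuck)
```

In the second call the message is delivered promptly, yet the caller still reports a timeout because nothing ever invokes the functor. That matches the symptom described above.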

dirvine commented 11 years ago

This is very likely the RCS (remote chunk store) again. I think we need to strip the complexity out of this component very quickly. Fraser is working on a potential replacement for it, in line with some work I am doing on the nfs project. We hope this will be well advanced by Wednesday (the design at least) and implemented from there at great pace. It's all about dramatically simplifying the existing vault code and introducing type safety. As it stands, there are far too many mutex-locked queues, and too much parallelism code that is likely not actually parallel.

If possible, we should get somebody to strip out all buffering and queueing from the remote chunk store immediately. I am very confident routing and rudp can handle significantly more traffic than they are getting just now, as the vaults effectively throttle it. If we just have the RCS pass data straight through to routing, I feel sure it will be a massively faster system. This means removing the "ops" structures that exist there (all of them).
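As a rough illustration of the structural difference being proposed, the sketch below contrasts a store that buffers operations in a locked queue with one that hands data straight to a send function. It is a sketch under stated assumptions, not the RCS code: `BufferedStore`, `PassThroughStore` and the `send` callable are hypothetical stand-ins, and `send` plays the role of routing.

```python
import queue
import threading


class BufferedStore:
    """Hypothetical buffered design: operations wait in a locked queue
    until a single worker thread forwards them to `send`."""

    def __init__(self, send):
        self._send = send
        self._ops = queue.Queue()  # internally mutex-protected
        threading.Thread(target=self._drain, daemon=True).start()

    def store(self, key, value):
        self._ops.put((key, value))

    def _drain(self):
        while True:
            key, value = self._ops.get()
            self._send(key, value)
            self._ops.task_done()

    def flush(self):
        # Block until every queued operation has been forwarded.
        self._ops.join()


class PassThroughStore:
    """Hypothetical pass-through design: no queue, no worker, no ops
    bookkeeping; each call goes directly to `send`."""

    def __init__(self, send):
        self._send = send

    def store(self, key, value):
        self._send(key, value)


sent_buffered, sent_direct = [], []
buffered = BufferedStore(lambda key, value: sent_buffered.append(key))
direct = PassThroughStore(lambda key, value: sent_direct.append(key))

for i in range(400):
    buffered.store(f"key-{i}", b"")
    direct.store(f"key-{i}", b"")
buffered.flush()

print(len(sent_buffered), len(sent_direct))
```

Both variants forward the same 400 keys. The pass-through version simply has fewer moving parts between the caller and the layer that sends the data, which is the simplification suggested above. The sketch makes no claim about relative speed.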