Open BrendanBenshoof opened 8 years ago
some more links for libp2p: https://github.com/ipfs/js-ipfs#roadmap https://github.com/ipfs/specs/tree/master/protocol/network
i strongly recommend following the module structure we have in go-ipfs and we'll have in js-ipfs. it will make it way easier for people to move between codebases, to translate swaths of functionality, and to identify problems and to fix them even if not native in that language.
(that said if you want to experiment with something new, go for it. you should still follow all the specs though, including the fs-repo spec)
that jerk @jbenet needs to write more specs.
@jbenet That's pretty much the conclusion we came to. We basically decided to build a python-libp2p first and then build each service on top of it in a modular way. We'll probably be asking a lot of questions about the libp2p spec...
perfect!
On Sun, Nov 1, 2015 at 10:17 PM, Andrew Stocker notifications@github.com wrote:
@jbenet https://github.com/jbenet That's pretty much the conclusion we came to. We basically decided to build a python-libp2p first and then build each service on top of it in a modular way.
I am going to build a "sprint" todo (more of a jog than a sprint...) then close this issue.
Right now our big issue (as it always is in python) is the GIL. But @whyrusleeping pointed out a python 3.5 library that can help with that.
I am OK to completely drop Python 2 compatibility, but using async/await means dropping 3.1, 3.2, 3.3, and 3.4 compatibility. I like async and await a lot, but isn't being compatible only with Python 3.5 a bit restrictive?
Also, I'm not sure the GIL will be a problem for us: we're not trying to use 100% of every CPU core. Instead, we'll probably spend a lot of time waiting for the network.
Why curio instead of asyncio?
@JulienPalard This is something we argued about for a while. It is my opinion that we should just stick with a single process with coroutines rather than do any multiprocessing. It looks like curio spawns a subprocess that it uses to handle all I/O asynchronously with callbacks, which looks nice, but I'm not sure we really need that. @BrendanBenshoof argued that we couldn't download multiple large files at once, but I'm not yet convinced that this is actually true, because if they are chunked downloads then we should easily be able to handle them asynchronously. However, we are still open to any suggestions and none of this is finalized yet, so keep the questions coming.
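To illustrate the single-process argument, here is a minimal sketch of two chunked downloads sharing one event loop. It assumes Python 3.7+ (for `asyncio.run`); `fetch_in_chunks` and `download_all` are hypothetical names, and `asyncio.sleep(0)` only stands in for awaiting a real network read.

```python
import asyncio


async def fetch_in_chunks(name, n_chunks, log):
    """Simulate a chunked download: one awaited 'read' per chunk."""
    for i in range(n_chunks):
        # Stand-in for awaiting a socket read (an assumption of this sketch).
        # Awaiting yields to the event loop so other downloads can progress.
        await asyncio.sleep(0)
        log.append((name, i))


async def download_all():
    """Run two simulated downloads concurrently in a single process."""
    log = []
    await asyncio.gather(
        fetch_in_chunks("big-file-a", 3, log),
        fetch_in_chunks("big-file-b", 3, log),
    )
    return log


if __name__ == "__main__":
    for name, chunk_index in asyncio.run(download_all()):
        print(name, chunk_index)
```

Because each coroutine gives control back at every chunk, neither download has to finish before the other one makes progress.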
Downloading a large file doesn't mean a single, huge, blocking read() syscall. read() can (and has to) be broken into multiple calls, and the underlying socket can be set to non-blocking.
On the other hand, using another thread to fetch the file won't directly solve the "this file is huge, the process is blocked" problem. The main process will not be blocked, but what about the second one being "stuck downloading the huge file" and suspending every network request from the main thread?
Finally, according to the asyncio documentation it already works on non-blocking file descriptors, so we don't have a problem here, as far as I can see.
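As a rough sketch of that point, the following reads a payload in bounded chunks through asyncio streams, which put the socket in non-blocking mode. It assumes Python 3.7+; `read_in_chunks` and `demo` are hypothetical names, and a local `socket.socketpair()` stands in for a real peer.

```python
import asyncio
import socket


async def read_in_chunks(reader, chunk_size):
    """Read until EOF, never asking for more than chunk_size bytes at once."""
    chunks = []
    while True:
        chunk = await reader.read(chunk_size)  # suspends instead of blocking
        if not chunk:  # empty bytes means the peer closed the connection
            break
        chunks.append(chunk)
    return chunks


async def demo(payload=b"x" * 10000, chunk_size=4096):
    # A connected socket pair plays the role of "us" and "the remote peer".
    ours, peer = socket.socketpair()
    reader, writer = await asyncio.open_connection(sock=ours)

    # Send the payload from the peer side without blocking the event loop.
    peer.setblocking(False)
    loop = asyncio.get_running_loop()
    await loop.sock_sendall(peer, payload)
    peer.close()  # signals EOF to the reader

    chunks = await read_in_chunks(reader, chunk_size)
    writer.close()
    await writer.wait_closed()
    return chunks


if __name__ == "__main__":
    received = asyncio.run(demo())
    print(len(received), "chunks,", sum(map(len, received)), "bytes")
```

How many chunks arrive depends on the platform's socket buffering, so the sketch only relies on the total being complete.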
I'm sorry I missed the dev meeting, here's my take:
python 3.x where x < 5
I also missed the meeting, but can someone lay out the pros and cons of curio? I clearly don't get it.
@JulienPalard curio uses py3.5 coroutines and opens a process pool to handle concurrent I/O. I'm not convinced we need it, but it would offload some of the I/O when using large files.
@amstocker Got it that it uses a separate thread / process, but I'm not convinced we need it either: how does inter-process communication cost less than no inter-process communication? As long as we're doing asynchronous reads / writes on the network (which is what asyncio does), we're OK.
On the other hand, asyncio is backported down to Python 3.3 as a module, and is part of the std lib in 3.4, read: more developers will master it.
So I'm +1 for asyncio, and -0 for curio, my -0 will become a -1 if I don't read any really valuable argument for curio.
I'm going to -1 for curio as it's not compatible with asyncio, so if we need to access anything like HTTP, SQL, a message queue, whatever, and if this thing has an asynchronous implementation compatible with asyncio (more and more probable), we won't be able to use it until a curio implementation exists (less probable than an asyncio implementation, as asyncio is in the std lib).
@JulienPalard Agreed. I also think that if we really need to, we can implement similar functionality fairly easily using asyncio.
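One way this might look, as a sketch only: asyncio's `loop.run_in_executor` can hand blocking work to a pool while the event loop keeps serving coroutines. The example assumes Python 3.7+; `blocking_digest` and `digest_many` are hypothetical names, and hashing stands in for whatever blocking file work would need offloading. A thread pool is used here for simplicity; `concurrent.futures.ProcessPoolExecutor` accepts the same calls if separate processes are wanted.

```python
import asyncio
import hashlib
from concurrent.futures import ThreadPoolExecutor


def blocking_digest(data):
    """Ordinary blocking function (stand-in for heavy file work)."""
    return hashlib.sha256(data).hexdigest()


async def digest_many(blobs):
    """Offload each blocking call to a pool and await all results."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [
            loop.run_in_executor(pool, blocking_digest, blob) for blob in blobs
        ]
        # gather returns results in the same order as the inputs.
        return await asyncio.gather(*futures)


if __name__ == "__main__":
    print(asyncio.run(digest_many([b"chunk one", b"chunk two"])))
```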
Compatibility Target 0.4 go-ipfs
Start with implementing libp2p: https://github.com/ipfs/specs/blob/master/protocol/network/6-interfaces.md#63-swarm