ipfs-shipyard / py-ipfs

python implementation of ipfs
http://ipfs.github.io/py-ipfs/
MIT License

Minutes Development Meeting #0 1/11/2015 #20

Open BrendanBenshoof opened 8 years ago

BrendanBenshoof commented 8 years ago

Compatibility Target 0.4 go-ipfs

Start with implementing libp2p: https://github.com/ipfs/specs/blob/master/protocol/network/6-interfaces.md#63-swarm

BrendanBenshoof commented 8 years ago

[UML diagram attachment]

amstocker commented 8 years ago

some more links for libp2p: https://github.com/ipfs/js-ipfs#roadmap https://github.com/ipfs/specs/tree/master/protocol/network

jbenet commented 8 years ago

i strongly recommend following the module structure we have in go-ipfs and will have in js-ipfs. it will make it way easier for people to move between codebases, to translate swaths of functionality, and to identify and fix problems even if they're not fluent in that language.

jbenet commented 8 years ago

(that said if you want to experiment with something new, go for it. you should still follow all the specs though, including the fs-repo spec)

jbenet commented 8 years ago

that jerk @jbenet needs to write more specs.

amstocker commented 8 years ago

@jbenet That's pretty much the conclusion we came to. We basically decided to build a python-libp2p first and then build each service on top of it in a modular way. We'll probably be asking a lot of questions about the libp2p spec...

jbenet commented 8 years ago

perfect!


BrendanBenshoof commented 8 years ago

I am going to build a "sprint" todo (more of a jog than a sprint...) and then close this issue.

Right now our big issue (as it always is in Python) is the GIL. But @whyrusleeping pointed out a Python 3.5 library that can help with that.
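For context, the Python 3.5 feature in question is the `async`/`await` coroutine syntax. A minimal sketch (the peer ID value is hypothetical, purely for illustration):

```python
import asyncio

# `async def` declares a coroutine; `await` suspends it until the awaited
# operation completes, letting the event loop run other tasks meanwhile.
async def get_peer_id() -> str:
    await asyncio.sleep(0)   # stand-in for an asynchronous network call
    return "QmExamplePeer"   # hypothetical value, for illustration only

loop = asyncio.new_event_loop()
peer = loop.run_until_complete(get_peer_id())
loop.close()
print(peer)  # QmExamplePeer
```

For I/O-bound work like ours, coroutines sidestep the GIL issue entirely: the interpreter is idle (not holding a core) whenever a coroutine is suspended waiting on the network.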

JulienPalard commented 8 years ago

I am OK to completely drop Python 2 compatibility, but using async/await means dropping 3.1, 3.2, 3.3, and 3.4 compatibility as well. I like async and await a lot, but isn't requiring Python 3.5 a bit restrictive?

JulienPalard commented 8 years ago

Also, I'm not sure the GIL will be a problem for us; we're not trying to eat 100% of every CPU core. Instead, we'll probably spend a lot of time waiting on the network.
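To illustrate the point that waiting dominates: with an event loop, many concurrent "requests" cost roughly as much wall time as the slowest one, not the sum, and the GIL never comes into play because the single thread is almost always idle. A rough sketch (using `asyncio.sleep` as a stand-in for a socket read):

```python
import asyncio
import time

async def network_wait(delay):
    # Stand-in for a network round-trip; releases the event loop while waiting.
    await asyncio.sleep(delay)

async def main():
    start = time.monotonic()
    # Ten concurrent 0.1 s "requests" finish in roughly 0.1 s total,
    # because the single thread only ever waits, never computes.
    await asyncio.gather(*(network_wait(0.1) for _ in range(10)))
    return time.monotonic() - start

loop = asyncio.new_event_loop()
elapsed = loop.run_until_complete(main())
loop.close()
print("elapsed: %.2f s" % elapsed)
```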

JulienPalard commented 8 years ago

Why curio instead of asyncio ?

amstocker commented 8 years ago

@JulienPalard This is something we argued about for a while. It is my opinion that we should stick with a single process running coroutines rather than do any multiprocessing. It looks like curio spawns a subprocess that it uses to handle all I/O asynchronously with callbacks, which looks nice, but I'm not sure we really need that. @BrendanBenshoof argued that we couldn't download multiple large files at once, but I'm not yet convinced that this is actually true: if the downloads are chunked, we should easily be able to handle them asynchronously. However, we are still open to suggestions and none of this is finalized, so keep the questions coming.
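A sketch of the chunked-download argument, assuming a hypothetical `fetch_chunk` coroutine standing in for requesting one block from a peer (the 256 KiB chunk size is illustrative, in the spirit of IPFS block chunking):

```python
import asyncio

CHUNK = 256 * 1024  # illustrative 256 KiB block size

async def fetch_chunk(file_id, index):
    # Hypothetical stand-in for requesting one block from a peer.
    await asyncio.sleep(0.01)
    return b"x" * CHUNK

async def download(file_id, n_chunks):
    # Fetch a file block by block; the event loop switches to other
    # downloads whenever this one is waiting on the network.
    chunks = []
    for i in range(n_chunks):
        chunks.append(await fetch_chunk(file_id, i))
    return b"".join(chunks)

async def main():
    # Two "large" files downloading at once, all in a single thread.
    return await asyncio.gather(download("fileA", 4), download("fileB", 4))

loop = asyncio.new_event_loop()
a, b = loop.run_until_complete(main())
loop.close()
print(len(a), len(b))  # each is 4 * 256 KiB
```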

JulienPalard commented 8 years ago

Downloading a large file doesn't mean a single, huge, blocking read() syscall. read() can (and has to) be broken into multiple calls, and the underlying socket can be set to non-blocking.

On the other hand, using another thread to fetch the file won't directly solve the "this file is huge, the process is blocked" problem: the main thread will not be blocked, but the second one will be stuck downloading the huge file, suspending every network request delegated to it.

Finally, according to the asyncio documentation it already works with non-blocking file descriptors, so as far as I can see we don't have a problem here.
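To make the "read() broken into multiple calls" point concrete, here is a minimal sketch using asyncio streams against a local echo-style server (the payload and chunk sizes are arbitrary): the client consumes a "large" transfer in bounded, non-blocking reads, yielding to the event loop between chunks.

```python
import asyncio

async def serve(reader, writer):
    # Send a "large" payload; the client still reads it in bounded chunks.
    writer.write(b"y" * 100000)
    await writer.drain()
    writer.close()

async def main():
    server = await asyncio.start_server(serve, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    total = 0
    while True:
        chunk = await reader.read(4096)  # one bounded, non-blocking read
        if not chunk:                    # empty bytes means EOF
            break
        total += len(chunk)
    writer.close()
    server.close()
    return total

loop = asyncio.new_event_loop()
received = loop.run_until_complete(main())
loop.close()
print(received)  # 100000
```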

candeira commented 8 years ago

I'm sorry I missed the dev meeting, here's my take:

JulienPalard commented 8 years ago

I also missed the meeting, but can someone lay out the pros and cons of curio? I clearly don't get it.

amstocker commented 8 years ago

@JulienPalard curio uses py3.5 coroutines and opens a process pool to handle concurrent I/O. I'm not convinced we need it, but it would offload some of the I/O when handling large files.

JulienPalard commented 8 years ago

@amstocker Got it that it uses a separate thread/process, but I'm not convinced we need it either: how can inter-process communication cost less than no inter-process communication? As long as we're doing asynchronous reads/writes on the network (which is what asyncio does), we're OK.

On the other hand, asyncio is backported down to Python 3.3 as a module and is part of the stdlib in 3.4; read: more developers will master it.

So I'm +1 for asyncio and -0 for curio, and my -0 will become a -1 if I don't read a really valuable argument for curio.

JulienPalard commented 8 years ago

I'm going to -1 for curio, as it's not compatible with asyncio: if we need to access anything like HTTP, SQL, or a message queue, and that thing has an asynchronous implementation compatible with asyncio (more and more probable), we won't be able to use it until a curio implementation exists (less probable than an asyncio implementation, since asyncio is in the stdlib).

amstocker commented 8 years ago

@JulienPalard Agreed. I also think that if we really need to, we can implement similar functionality fairly easily using asyncio.