ipfs-shipyard / py-ipfs

python implementation of ipfs
http://ipfs.github.io/py-ipfs/
MIT License

Proposal: Separate packages for each system #4

Open NoraCodes opened 8 years ago

NoraCodes commented 8 years ago

I suggest that we separate the ipfs package into packages for each IPFS subsystem:

* ipfs-api - Used by applications to connect to Py-IPFS. This is the layer most users should touch. From https://github.com/ipfs/python-ipfs-api
* ipfs-naming or ipfs-ipns - Provides access to the IPNS public key infrastructure and naming system. Handles HRNs and mutable pointers.
* ipfs-merkledag - Provides access to Merkledag for resolving paths.
* ipfs-block - Provides access to the block exchange system for transferring data.
* ipfs-routing - Provides access to the routing system, probably mostly DHT and mDNS.
* ipfs-network - Provides access to the lowest-level IPFS network transport functions, like NAT traversal/hole punching, encryption/signing, and multi/broadcasting.

Client applications would initialize the stack from the bottom up: choose a transport and encryption with ipfs-network, connect to the DHT with ipfs-routing, then use ipfs-naming (ipfs-ipns) to find the location of desired content and ipfs-merkledag and ipfs-block to recall that data. On the other hand, if an instance of IPFS is already running on the system (which is preferred), they could simply use ipfs-api to interact with it.
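As a rough illustration of that bottom-up initialization, here is a minimal Python sketch. None of this exists yet: `Network`, `Routing`, `Naming`, their fields, and `build_stack` are all hypothetical stand-ins for what ipfs-network, ipfs-routing, and ipfs-naming might expose, and the DHT and name lookups are stubbed with in-memory dicts.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """Hypothetical stand-in for ipfs-network: transport and encryption choice."""
    transport: str = "tcp"
    encryption: str = "example-cipher"  # placeholder value, not a real option


@dataclass
class Routing:
    """Hypothetical stand-in for ipfs-routing, layered on a Network."""
    network: Network
    # Stub for the DHT: maps a content hash to the peers that hold it.
    providers: dict = field(default_factory=dict)

    def find_providers(self, content_hash):
        return self.providers.get(content_hash, [])


@dataclass
class Naming:
    """Hypothetical stand-in for ipfs-naming, layered on Routing."""
    routing: Routing
    # Stub for IPNS records: maps a mutable name to a content hash.
    records: dict = field(default_factory=dict)

    def resolve(self, name):
        return self.records[name]


def build_stack():
    """Initialize the layers from the bottom up and return the top one."""
    network = Network()
    routing = Routing(network)
    return Naming(routing)


naming = build_stack()
naming.records["example-name"] = "QmExampleHash"
naming.routing.providers["QmExampleHash"] = ["peer-a"]

content_hash = naming.resolve("example-name")
peers = naming.routing.find_providers(content_hash)
```

Each layer only holds a reference to the one directly beneath it, which is what would let a layer be replaced without touching the ones above.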

BrendanBenshoof commented 8 years ago

I am a fan of using multiple packages, but I suggest doing so internally and not polluting the global namespace or a package manager with them. This is a standalone application, not a package.

ipfs.api and naming already exists in the form of https://github.com/ipfs/python-ipfs-api and since they are universal (can be used with any ipfs implementation) it does not make sense to put it in the same package/namespace as this project.

jbenet commented 8 years ago

I think the more standalone packages that can be used by other things, the better. (Yes published to package managers). A convention could be used like in node ipfs: ipfs-<module>

In particular, all the network stuff, definitely make it separate packages. Name the main one libp2p -- and check with @diasdavid and I on it -- we're actively modularizing it and making the pieces even more useful


daviddias commented 8 years ago

I believe it would also be good to align organisation conventions across implementations (e.g. node-ipfs-bitswap, go-ipfs-bitswap, py-ipfs-bitswap and others all do the same thing and offer the same interface). On this, I'm happy to bring more context on libp2p, the IPFS network stack.

NoraCodes commented 8 years ago

@diasdavid Would we then pull those packages into this repository as submodules? My intent was to allow user-side or project-side swapping of each subpackage, either for alternate Python implementations or for C, Go, or other implementations (for speed), without breaking the API.
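To make the swapping idea concrete, here is a small Python sketch. It assumes each subpackage agrees on an interface; `BlockStore`, `PurePythonBlockStore`, and `store_and_fetch` are hypothetical names invented for this example, not an existing API.

```python
from typing import Protocol


class BlockStore(Protocol):
    """Hypothetical interface every block-exchange backend would satisfy."""

    def get(self, key: str) -> bytes: ...

    def put(self, key: str, data: bytes) -> None: ...


class PurePythonBlockStore:
    """Hypothetical pure-Python backend.

    A binding to a C or Go implementation would expose the same two methods.
    """

    def __init__(self) -> None:
        self._blocks = {}

    def get(self, key: str) -> bytes:
        return self._blocks[key]

    def put(self, key: str, data: bytes) -> None:
        self._blocks[key] = data


def store_and_fetch(store: BlockStore, key: str, data: bytes) -> bytes:
    """Caller code depends only on the interface, not on the backend."""
    store.put(key, data)
    return store.get(key)


result = store_and_fetch(PurePythonBlockStore(), "example-key", b"data")
```

Because `typing.Protocol` uses structural typing, a faster backend would not need to inherit from anything here; it would only need matching method signatures.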

daviddias commented 8 years ago

@SilverWingedSeraph the Node.js implementation is breaking everything into modules per repo https://github.com/ipfs/node-ipfs#roadmap. modularity and composability ftw :)

NoraCodes commented 8 years ago

OK, I updated my comment to reflect the naming convention, and the proper use of ipfs-api. I like the idea of using modules per repo.

BrendanBenshoof commented 8 years ago

We have solved a lot of the DHT-related problems over at UrDHT. The latest Kademlia DHT logic has not left the simulator and been ported into UrDHT yet, but we can use it soon. I should test whether my Kademlia peer selection criteria play well with other nodes using the k-buckets approach.

NoraCodes commented 8 years ago

@diasdavid To clarify, do you mean we should create additional repos and use git submodules, or that we should make Python modules for each section in this repository? Sorry, the word "module" is a bit unclear here.

daviddias commented 8 years ago

@SilverWingedSeraph I was pointing out that, along the same train of thought as your proposal ("I suggest that we separate the ipfs package into packages for each IPFS subsystem"), we have started to modularize IPFS in the Node.js implementation and we intend to go through a similar process in the Go implementation.

It would be good for the ecosystem to have the same level of modularity across implementations, so that a user of 'libp2p' in Node.js can have the same expectations when using its 'go', 'python', 'rust', etc implementations.

As for whether the best option is several repos, git submodules, or Python modules, I believe that depends on what is considered best practice in the Python ecosystem (I honestly don't know what the convention is for Python).

NoraCodes commented 8 years ago

I think it basically depends on whether or not we want to allow people to use our functionality piecemeal. If we want, say, ipfs-naming to work standalone, we should create separate repos and use submodules to bring them here. Otherwise, we can just use a structure like

ipfs/
    __init__.py
    ipfs-naming/
        __init__.py
    ipfs-routing/
        __init__.py
    ...

sirMackk commented 8 years ago

Correct me if I'm wrong @diasdavid, but I believe the closest translation of what is happening in the Node.js implementation to what we could do here would be:

We can use this repository as the mothership and have it contain git submodules/subtrees.

I'm still reading through the go-ipfs code so my knowledge of the project is somewhat shoddy, but to me it looks like it'd be cool to split the functionality into multiple repositories and have each piece work in a stand-alone fashion for ease of use in other projects.

mvanveen commented 8 years ago

I'm worried about how this would affect imports, e.g.:

>>> from test.ipf-test import foo
  File "<stdin>", line 1
    from test.ipf-test import foo
SyntaxError: invalid syntax

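The issue is that a hyphen (-) is not valid in a Python identifier, so a package directory named like ipfs-naming cannot be imported with a normal import statement. If we want to preserve uniformity with the aesthetic of other repos, I think this is a deal breaker.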
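For what it's worth, a common workaround in Python projects is to keep the hyphen in the repository or distribution name and use an underscore in the import name. The sketch below checks both spellings with `str.isidentifier()`; `import_name` is a hypothetical helper written for this example, and the hyphen-to-underscore mapping is an assumption about what we might adopt, not a decision.

```python
def import_name(dist_name):
    """Map a hyphenated distribution name to a valid import name.

    Hypothetical helper: swapping "-" for "_" is a common convention,
    not something this project has decided on.
    """
    candidate = dist_name.replace("-", "_")
    if not candidate.isidentifier():
        raise ValueError(f"{dist_name!r} cannot be turned into an identifier")
    return candidate


# A hyphenated name is rejected as an identifier; the underscore form is fine.
hyphenated_ok = "ipfs-naming".isidentifier()
underscored_ok = "ipfs_naming".isidentifier()
mapped = import_name("ipfs-naming")
```

Under that assumption the repos could still be called ipfs-naming, ipfs-routing, and so on, while the importable packages would be ipfs_naming and ipfs_routing.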

My read on the general sentiment in the greater Python community is that modularizing sub-packages into separate repositories is fairly rare, possibly due to the perceived overhead of git submodules. Python also favors a lot of imports between sub-packages (as long as they aren't circular references :smirk:), so this structure offers fewer advantages in practice. I believe that sort of encapsulation offers more benefits in JS and golang.

The python-ipfs-api project seems to favor this pattern as well. Everything is currently namespaced into a master ipfsApi package.

I'm not completely against a separate repo, but I'm personally with @BrendanBenshoof and think we should keep everything in one repo.

Edit: re-reading, I'm not sure if @BrendanBenshoof is advocating for or against git submodules. I favor sub-package (i.e. Python package) encapsulation, but believe git submodules are probably unnecessary/more trouble than they're worth in this case.