open-source-ideas / ideas

💡 Looking for inspiration for your next open source project? Or perhaps you've got a brilliant idea you can't wait to share with others? Open Source Ideas is a community built specifically for this! 👋

The One 3 tier extendable package manager to rule them all #50

Open KOLANICH opened 6 years ago

KOLANICH commented 6 years ago

There are lots of package managers, lots of package formats and lots of repositories. They contain duplicated packages which can conflict with each other and break things, and some of them may have security issues. It's a mess and Bedlam. We need to get rid of this, but without producing the situation from xkcd #927 "Standards".

The root of the problem is that there are different environments, and package managers have to be adapted to them. Usually OS developers create their own package managers. But there may be a better way.

Project description

Package managers and their packages are similar to each other; they differ in some details: interface and package format.

So the solution is a package manager of 3 tiers:
1. frontend: command line interface, language bindings, reading package formats and invoking operations
2. middleware
3. backend: a plugin actually doing the operations in the way adapted to the OS

The middleware is a package manager itself. Frontend and backend are plugins.

It shouldn't have its own package format. Instead it should have frontend plugins reading the package formats of all other package managers, so vendors and users won't have to bother converting their packages. Having a compatible interface should make it a drop-in replacement: just install One PM instead of the native one and you have migrated.

It shouldn't have its own interface; instead it should have frontend plugins translating the interfaces of other package managers.

It shouldn't be tied rigidly to the environment. Instead the environment developers should only create a backend plugin doing things the way they meant them to be done.

npm bower cargo pecl pear composer conda conan nix guix pip(sdist, egg, wheel, etc) portage rpm apt [io]pkg chocolatey NuGet OneGet c[ptr]an haxelib apk homebrew
                                 middleware
ubuntu gentoo Arch debian Android slackware FreeBSD openwrt windows GoboLinux MacOS guix

The frontend tells the package manager what to do. The backend knows exactly how this should be done. The middleware keeps track of the dependencies.

There are different packages of the same program in different repos. The middleware should be able to detect packages that serve the same purpose. For example, this can be implemented as a mapping (repo1, id1) -> the global id of the program <- (repo2, id2) stored in a database which is updated manually (the similarities are detected automatically, but a human makes the decision).
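The mapping described above could be sketched roughly as follows. This is only an illustration under assumptions: `PackageIdentityMap`, `RepoKey`, `link`, `globalIdOf` and `sameProgram` are hypothetical names that do not appear in the draft, the repo names in the usage note are made up, and the automatic similarity detection is left out. Only the human-confirmed mapping is stored.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical sketch: (repo, package id in that repo) -> global program id.
using GlobalId = std::uint64_t;
using RepoKey = std::pair<std::string, std::string>; // {repo name, id inside the repo}

class PackageIdentityMap {
public:
    // Records a human-confirmed decision that `key` denotes program `global`.
    void link(const RepoKey &key, GlobalId global) {
        assert(!key.first.empty() && !key.second.empty()); // both parts are required
        toGlobal_[key] = global;
    }

    // Returns the global id for a repo-specific package, if one was recorded.
    std::optional<GlobalId> globalIdOf(const RepoKey &key) const {
        auto it = toGlobal_.find(key);
        if (it == toGlobal_.end()) return std::nullopt;
        return it->second;
    }

    // True if both repo-specific packages were mapped to the same program.
    bool sameProgram(const RepoKey &a, const RepoKey &b) const {
        auto ga = globalIdOf(a), gb = globalIdOf(b);
        return ga && gb && *ga == *gb;
    }

private:
    std::map<RepoKey, GlobalId> toGlobal_;
};
```

With this, linking `{"pypi", "numpy"}` and `{"ubuntu-universe", "python3-numpy"}` to the same id would let the middleware treat them as one program, while an unknown package simply has no global id yet.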

Here is a draft of the architecture:

using IdT = uint64_t;
struct IIdentified{ //< our restricted hand-crafted RTTI with hardcoded identifiers
    virtual IdT getTypeId() = 0;
};
template <IdT typeId, typename T> struct Identified: public IIdentified, public T{
    IdT getTypeId() override { return typeId; }
};
struct IActionProvider: public IIdentified{ //< Actually does action
    virtual IdT getActionType() = 0;
    virtual bool perform(IAction &act) = 0; // cannot be named `do`: reserved word in C++
};
template<IdT actionTypeId> struct ActionProviderStub: public IActionProvider{
    IdT getActionType() override { return actionTypeId; }
};
struct IManager: public IIdentified{ //< discovers, downloads and manages packages.
    virtual void getSourceTypes(collection<IdT> &res) = 0;
    virtual void installed(collection<IPackage> &res) = 0;
    virtual void discover(ISource &src, collection<IPackage> &res) = 0;
    virtual void download(ISource &src, collection<IPackage> &pkgs, collection<IDownloadedPackage> &res, bool fullness) = 0; // if fullness == 0 the 
};
struct ISource: public IIdentified{ //< stores the info about a source
    std::string name; // doesn't contain source type because source type is identified by IdT
...
};
struct Metadata{
    std::span<UnresolvedPackage> dependencies; // nullptr if not available without actually downloading the package
    ....
};
struct UnresolvedPackage{
    IManager *manager; // can be nullptr. Used to indicate the preferred manager to resolve the package.
    std::string name;
};
struct IPackage: public IIdentified{
    UnresolvedPackage *id;
    ISource *source;
    virtual Metadata getMetadata() = 0;
};
struct IDownloadedPackage: public IIdentified{
    IPackage *identified;
    virtual void getActions(collection<IAction> &res) = 0;
};
struct IAction: public IIdentified{};
struct IUnpackAction: public IAction{ //< used to get files objects
    virtual void getTrees(collection<IFile> &res) = 0;
};
struct IFile{
    enum class Kind: uint8_t{
        file = 0,
        dir = 1,
        link = 2
    };
    union{...} payload;
};

struct IDependencyAction: public IAction{//< used for discovery (in the case they are not known before downloading) and installation of dependencies
    virtual void getDependencies(collection<UnresolvedPackage> &res) = 0;
};
struct IRegisterAction: public IAction{ //< used for registration of the package inside a specific package manager
    virtual void registerPackage(IDownloadedPackage &pkg) = 0; // cannot be named `register`: reserved word in C++
};
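A reduced, compilable version of the first few declarations may help show how the hand-crafted RTTI could be used. It is a sketch under assumptions, not part of the draft: `CopyFilePayload`, `CopyFile`, `kCopyFileId`, `asType` and `requireType` are hypothetical names, the id value is arbitrary, and the `const` qualifier and virtual destructor are additions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

using IdT = std::uint64_t;

// Minimal interface from the draft: every object reports a hardcoded type id.
struct IIdentified {
    virtual ~IIdentified() = default;
    virtual IdT getTypeId() const = 0;
};

// Mixes the id into an arbitrary payload type T, as in the draft.
template <IdT typeId, typename T>
struct Identified : public IIdentified, public T {
    static constexpr IdT id = typeId;
    IdT getTypeId() const override { return typeId; }
};

// Hypothetical payload and id, for illustration only.
struct CopyFilePayload { std::string from, to; };
constexpr IdT kCopyFileId = 0x1001;
using CopyFile = Identified<kCopyFileId, CopyFilePayload>;

// Checked downcast based on the hardcoded id instead of dynamic_cast.
template <typename Target>
Target *asType(IIdentified &obj) {
    return obj.getTypeId() == Target::id ? static_cast<Target *>(&obj) : nullptr;
}

// Same as asType, but treats a mismatch as a programming error.
template <typename Target>
Target &requireType(IIdentified &obj) {
    Target *p = asType<Target>(obj);
    assert(p && "type id mismatch");
    return *p;
}
```

An action provider could use `asType` to check that an incoming object is the kind of action it handles before touching its payload.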

Relevant Technology

Who is this for

Lennart @poettering

ghost commented 6 years ago

This is truly the case, but it can be fixed. I am willing to work on it.

FredrikAugust commented 6 years ago

@muhammad-haroon are you working on this?

ghost commented 6 years ago

Hi @FredrikAugust, to be honest this is just in the planning phase. I wanted to discuss my approach with someone so that I can get critique and, in turn, start developing it. The approach I want to take is that of the BusyBox project: start from the most minimal, basic, small package manager that focuses on stability, security and easy upgradability. If you guys like, we can start with a basic design and start prototyping parts of it.

ghost commented 6 years ago

The issue I see with having to work with the existing package managers is that their development libraries are not easily usable and are not well documented at all. I speak from the perspective of a Debian user looking at libdpkg-dev: I wanted to explore the basic functions of the library, and it turned out that the most useful functions are hidden and not exposed publicly. If anyone knows how to use these, I would like to learn from their experience and start working on whatever you guys suggest. The real motive is to fix this issue.

sorki commented 6 years ago

The answer (at least for me) is Nix, which is a language-agnostic package manager and is already used to wrap a number of language-specific ecosystems (Haskell, Python, Lua, Perl, PHP...; check the lists here https://github.com/NixOS/nixpkgs/tree/master/pkgs/top-level).

It is also possible to install Nix on most distributions and I strongly recommend doing so as it solves a lot of issues with traditional package managers.

Conan-Kudo commented 5 years ago

Isn't this basically what PackageKit does?

KOLANICH commented 5 years ago

@Conan-Kudo, thanks for the info. PackageKit definitely does a part of the job: it provides a unified user interface. But it seems it doesn't act as a middleware.

I mean the following. In order to install a package correctly into a distro we cannot use package managers as black boxes doing the installation. Quite the opposite, installation should be done by something that knows how to do it; in our terminology, a backend. A backend gets a stream of operations ("this file is docs, install it", "this file is a lib, install it", "this file is a header, install it", "you have just installed the package %packagename% of version %ver% from %source%, consider it installed", "we need a dependency %dependencyname% of a version satisfying %constraints%") and performs them in a distro-specific way.
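A rough sketch of such an operation stream, assuming a `std::variant` of three operation kinds and a backend that only records what it would do. All names here (`FileRole`, `InstallFile`, `MarkInstalled`, `NeedDependency`, `ExampleBackend`, `runOperations`) are hypothetical, and the target directories are just conventional examples, not a statement about what a real distro backend does.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

enum class FileRole { Docs, Lib, Header };

// The kinds of operations a frontend could emit.
struct InstallFile { FileRole role; std::string name; };
struct MarkInstalled { std::string package, version, source; };
struct NeedDependency { std::string name, constraints; };
using Operation = std::variant<InstallFile, MarkInstalled, NeedDependency>;

// Hypothetical backend: it decides where things go. Here it only records a
// plan in `log` instead of touching the file system.
struct ExampleBackend {
    std::vector<std::string> log;

    void operator()(const InstallFile &op) {
        assert(!op.name.empty()); // a file operation needs a file name
        const char *dir = op.role == FileRole::Docs ? "/usr/share/doc/"
                        : op.role == FileRole::Lib  ? "/usr/lib/"
                                                    : "/usr/include/";
        log.push_back("copy " + op.name + " -> " + dir + op.name);
    }
    void operator()(const MarkInstalled &op) {
        log.push_back("registered " + op.package + " " + op.version + " from " + op.source);
    }
    void operator()(const NeedDependency &op) {
        log.push_back("need " + op.name + " " + op.constraints);
    }
};

// Feeds a stream of operations to the backend, one at a time.
inline void runOperations(ExampleBackend &backend, const std::vector<Operation> &ops) {
    for (const auto &op : ops) std::visit(backend, op);
}
```

Swapping the distro would then mean supplying a different visitor with the same set of `operator()` overloads, while the frontends keep emitting the same operations.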

For example, let's assume we are on Ubuntu and want to install https://pypi.org/project/xgboost/ . It depends on numpy, but we don't want to install numpy from pypi. We prefer packages from the Ubuntu repos because we trust Ubuntu maintainers more than the stuff from pypi, which is not even signed. There should be a config file somewhere mapping python3-(?P<name>\w+) in Ubuntu Universe to (?P<name>\w+) in pypi and stating that packages from Ubuntu Universe are preferred over ones from pypi even if it means not using the latest version.

Since in pip packages the dependencies are known only after the package is downloaded, dependency resolution before download fails with a code saying so. The pip frontend downloads the package and, when installing, emits the action "install the numpy package from pip". The middleware processes the action. It checks that there is a rule mapping pip packages to Ubuntu ones and, since the Ubuntu ones are preferred, converts the name to python3-numpy and asks the apt frontend to resolve the dependencies. For apt, the dependencies can be resolved and conflicts can be detected before the package is downloaded. The dependencies are passed to the middleware. Fortunately there were no conflicts.

The frontend downloads the package. It emits operations to copy the needed files and install manpages. The middleware splits them into 2 ops: one installs a file, the other runs post-installation actions. They are passed to the Debian backend, which puts the files into the appropriate directories and registers them in the integrity (such as debsums), search (such as the one used by dpkg -L) and alternatives (update-alternatives) databases, if needed. The frontend also emits the operation to consider python3-numpy installed. It is passed to the middleware, and the middleware marks it in its own db as installed. Then it installs the rest of the dependencies in a similar way. Packages in any repo can depend on packages in any other repo without making the dependency resolution system unhappy.
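The name-mapping step in this example (a pip name rewritten to the preferred Ubuntu name) might look something like the sketch below. `NameRule`, `Resolved` and `translate` are hypothetical names, and the rule is hardcoded by the caller rather than read from a config file. Note that `std::regex` has no named groups, so the sketch uses a plain capture group and `$1` where the text above writes `(?P<name>\w+)`.

```cpp
#include <cassert>
#include <optional>
#include <regex>
#include <string>

// Hypothetical rule: how a package name in one repo maps to a preferred repo.
struct NameRule {
    std::string fromRepo;    // e.g. "pypi"
    std::regex fromPattern;  // must match the whole name; group 1 is the stem
    std::string toRepo;      // e.g. "ubuntu-universe"
    std::string toFormat;    // format string, "$1" stands for group 1
};

struct Resolved { std::string repo, name; };

// Applies the rule if it matches; returns std::nullopt when the rule does not
// apply, so the caller can keep the original request.
inline std::optional<Resolved> translate(const NameRule &rule,
                                         const std::string &repo,
                                         const std::string &name) {
    assert(rule.fromPattern.mark_count() >= 1); // the format refers to group 1
    if (repo != rule.fromRepo) return std::nullopt;
    std::smatch m;
    if (!std::regex_match(name, m, rule.fromPattern)) return std::nullopt;
    return Resolved{rule.toRepo, m.format(rule.toFormat)};
}
```

With a rule of `{"pypi", std::regex(R"((\w+))"), "ubuntu-universe", "python3-$1"}`, a request for numpy from pypi would be redirected to python3-numpy in Ubuntu Universe, which the middleware could then hand to the apt frontend.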

Target distros can be swapped by changing a backend or its settings. Support for a distro's packages can be added by adding a frontend or configuring an existing one. For example, for d1stri the backend can create an image and do actions within it.