Raku / ecosystem

Raku ecosystem – modules and more
https://modules.raku.org/

RFC Native Dependency Specs for META6.json #334

Closed samcv closed 4 years ago

samcv commented 7 years ago

This is an RFC for adding Native dependency fields to META6.json. This proposal is meant to hopefully be extensible, and be similar in character to how the metadata is already stored and laid out.

It is meant to be mostly distro neutral.

bin # meant to be findable in the path and be the name of the binary
lib # meant to be something such as a .so object
name # human readable name

A package manager could confirm that something is installed in a distro neutral way by checking the binary and lib paths.
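A minimal sketch of that distro-neutral check, written in Python purely for illustration (the thread does not prescribe an implementation). The `check_native_dependency` helper and its return shape are hypothetical. It looks up each `bin` entry on the search path with `shutil.which` and each `lib` entry with `ctypes.util.find_library`; the leading `lib` prefix is stripped on the assumption that entries are written like `libcurl`, because `find_library` expects the bare name.

```python
import ctypes.util
import shutil


def check_native_dependency(dep, path=None):
    """Return the 'bin' and 'lib' entries of `dep` that cannot be found.

    `dep` is one entry of the proposed "native-dependencies" array.
    `path` optionally overrides the executable search path.
    """
    missing = {"bin": [], "lib": []}
    for binary in dep.get("bin", []):
        if shutil.which(binary, path=path) is None:
            missing["bin"].append(binary)
    for library in dep.get("lib", []):
        # Assumption: names carry a "lib" prefix; find_library wants "curl".
        bare = library[3:] if library.startswith("lib") else library
        if ctypes.util.find_library(bare) is None:
            missing["lib"].append(library)
    return missing


if __name__ == "__main__":
    curl = {"name": "curl", "bin": ["curl"], "lib": ["libcurl"]}
    missing = check_native_dependency(curl)
    if missing["bin"] or missing["lib"]:
        print(f"{curl['name']} may not be installed; missing: {missing}")
    else:
        print(f"{curl['name']} looks available")
```

Nothing here consults a package manager: the check only asks whether the named files are reachable, which is the part of the proposal that stays the same across distributions.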

Optionally, you could also include package-manager-specific package names. For Windows, since the files may not be found in a binary or library path, the actual path of the installed program/lib could be given. The 'default' key is meant to be a default that should be attempted if the distro's package manager is not listed. We may want a different name for this, and more input would help, because it's ambiguous what it is the default of. It was intended to be the default to check on a Linux distribution, so the 'default' option may not be good to include in the first specification for a native-dependencies standard.

Any comments on this proposal would be great. Me and @the-eater discussed it on IRC and hopefully we can get even more input on this from the community at large.

For non-native packages, for example Python packages, the package manager could be listed as 'pip' or something akin to that, which would let the Perl 6 package manager even ask to install these dependencies. This would ultimately create a much better user experience and, in the shorter term, let the package manager tell the user that a dependency may not be installed and give them its name and other useful information.

  "native-dependencies": [
    {
      "name": "curl",
      "bin": [
        "curl"
      ],
      "lib": [
        "libcurl"
      ],
      "support": {
        "site": "https://curl.haxx.se/"
      },
      "version": "4.0+",
      "package-manager": {
        "default": {
          "curl": "*"
        },
        "apt": {
          "curl": "4.0+"
        }
      },
      "distribution": {
        "windows": {
          "location": {
            "bin": {
              "curl": [
                "C://Program Files/curl/curl.exe"
              ]
            },
            "lib": {
              "libcurl": [
                "C://Program Files/curl/library/curl/curl32.dll"
              ]
            }

See the full sample JSON file here: https://gist.github.com/samcv/ed59c2bf070c3c544be25fe199f63881

the-eater commented 7 years ago

One of the reasons for this proposal is #333

JJ commented 7 years ago

:+1:

FCO commented 7 years ago

:+1:

zoffixznet commented 7 years ago

I don't get what the stuff in "windows" represents. What are those paths?

the-eater commented 7 years ago

@zoffixznet

Windows doesn't have a unified location for libraries, so this is purely a hint to the user and/or the package manager about where the DLL may be located. In other distribution items (say Ubuntu or Solaris) the package-manager may also be overridden.

zoffixznet commented 7 years ago

so this is pure hinting for the user

I guess "%PROGRAMFILES%/curl/library/curl/curl32.dll" is better.

I know @shadowcat-mst was (planning to be) working on a similar thing for Perl 5; may be worth asking him for opinions on this.

the-eater commented 7 years ago

Hmm, indeed a better option :)

I'm looking forward to what @shadowcat-mst has to say about this :)

timo commented 7 years ago

I'd suggest "fallback" instead of "default" for "we don't know the name for a given package manager".

the-eater commented 7 years ago

@timo

Although I understand what you mean, that's not exactly what "default" means at this point. Currently it means "this is the default name of this package, and most package managers call it that". As in the example given, curl is called curl in almost every distribution.

niner commented 7 years ago

Closely related: https://github.com/perl6/toolchain-bikeshed/blob/master/build.md

niner commented 7 years ago

"For packages for example python packages, that could be listed as the package manager being 'pip' or something"

pip or apt are mechanisms for getting packages, not requirements of your dist. Your dist needs a certain Python package, so the meta data should say so. It should not specify how to get that package as then every single author who needs a Python or whatever package would need to know how to install those. And when the recommended way changes, we'd have to update all those META files or even worse, start cheating, by interpreting a "'pip': 'curl'" like the "'python': 'curl'" it should have been in the first place and use a different package manager anyway.

timo commented 7 years ago

@the-eater okay, so maybe "common-name" instead of "default"?

the-eater commented 7 years ago

@niner so you're suggesting instead of defining package managers, defining distros?

the-eater commented 7 years ago

@timo sounds good to me :)

jonathanstowe commented 7 years ago

Totally in favour of the principle of this. I guess about a third of my modules have some external dependencies. There are a couple that this won't help (not actually packaged dependencies most places,) but I have an alternative plan for those.

But yes, I am with @niner: it should be as unspecific as possible about how something should be installed, leaving the choice to the installer based on the type of thing (native package, Python or Perl 5 module, etc.) and the platform. Not only does it mean that this only needs to be changed in one place if for some reason the mechanism changes (and they do), it also doesn't tie the hands of the implementor of the installer to supporting a particular mechanism at any time (there are platforms where there are several and it is largely a matter of taste and fashion). An additional side benefit of this would be to make "plugins" for the installer easier. I may be the only author currently who has a dependency on an OCaml application which is rarely packaged at the right version. I might well be inspired to write a plugin for that (maybe using opam, maybe not), whereas the author of the installer probably wouldn't be bothered (and the other users don't really need the cruft unless they want my module).

JJ commented 7 years ago

Alien::Base https://github.com/Perl5-Alien/Alien-Base by @perl5-alien is the closest thing in the Perl5 realm, I guess.

ugexe commented 7 years ago

I think this may be too complex as it is, and will naturally get much more complex very quickly. For instance: what does 4.0+ mean to each package manager, and how do you represent this logic to each of them such that they can understand it?

Now consider https://github.com/perl6/toolchain-bikeshed/blob/master/build.md which is deceptively simple so far. There is little spec that is defined, so it's easy to expand: you can check for whatever file you want, not just a lib or bin. And it might not be far-fetched for a simple implementation to be integrated into rakudo's Distribution.

niner commented 7 years ago

@the-eater: instead of how to install something, the META data should list the dependency. If it needs a Python package, it should say so and let the toolchain care about the rest (the absolute minimum being "Please ensure that the "foobar" Python package is installed"). The same is true for other parts. If you need a curl binary, say so: 'bin': 'curl'. If you need libcurl.so on Linux or curl.dll on Windows: 'lib': 'curl'.
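As a rough illustration of leaving the platform details to the toolchain, the sketch below maps a bare 'lib' name to a conventional file name per operating system. It is written in Python for illustration only; the `library_filename` helper is hypothetical, the OS keys (`win32`, `osx`, `linux`) are borrowed from examples elsewhere in this thread, and the `.dylib` case is an addition based on the usual macOS convention.

```python
def library_filename(name, os_name):
    """Map a bare library name such as "curl" to a conventional file name."""
    if os_name == "win32":
        return f"{name}.dll"            # curl -> curl.dll
    if os_name in ("osx", "darwin"):
        return f"lib{name}.dylib"       # curl -> libcurl.dylib
    return f"lib{name}.so"              # curl -> libcurl.so


def missing_message(kind, name):
    """The 'absolute minimum' behaviour: tell the user what to install."""
    return f'Please ensure that the "{name}" {kind} is installed'


if __name__ == "__main__":
    for os_name in ("linux", "win32", "osx"):
        print(os_name, library_filename("curl", os_name))
    print(missing_message("Python package", "foobar"))
```

The META data stays at `'lib': 'curl'`; only the consumer knows which file name that implies on its platform.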

A modern Linux package manager like zypper can already search for capabilities in addition to just plain package names. You can zypper in firefox to get the "FireFox" package installed. Or you can zypper in 'perl(Petal::Utils)' to get that Perl 5 module. Or zypper in libcurl.so.4 for the libcurl4 package.

Why tell the toolchain to install a Python package via pip, when there's an rpm package available for the distro which is easier and faster to install? Or if the user is buying into the container stuff, there's probably some other way some container manager can ensure for you that a certain library will be available.

jonathanstowe commented 7 years ago

@JJ I've had push-back in the past when I've suggested an Alien-like mechanism for Perl 6, but I have at least one thing where it is almost a necessity and will probably implement it at some point. I think it would probably be a pain for all external dependencies though, and probably best just reserved for ones that have special requirements.

jonathanstowe commented 7 years ago

@ugexe absolutely, simplest definition that can work for the common case. Overthinking it at this point is likely to stop or impede it happening.

JJ commented 7 years ago

@jonathanstowe at least we could get some inspiration. I don't know of anything similar in the Perl ecosystem.

toolforger commented 7 years ago

On 02.05.2017 at 16:46, niner wrote:

Why tell the toolchain to install a Python package via pip, when there's an rpm package available for the distro which is easier and faster to install? Or if the user is buying into the container stuff, there's probably some other way some container manager can ensure for you that a certain library will be available.

I think you hit an important aspect here. Packages usually don't know or care in what environment they are being installed: container, compile-to-binary, package manager, some funky experimental interpreter environment that happens to support Perl6, whatever. All that the package cares about is that certain other software is callable. This is highly environment-dependent, so the environment needs to take care of providing the dependencies.

niner commented 7 years ago

This is highly environment-dependent, so the environment needs to take care of providing the dependencies.

Exactly! One of these environments is the openSUSE Linux distribution which I'm interested in. I'd like to automatically create rpm packages for all Perl 6 modules and publish them on the Open Build Service (so they may also be available for other distributions). What I need for that is a way to automatically translate the dependencies from the META6.json to the rpm .spec file.

the-eater commented 7 years ago

@ugexe

While I get your point, that's exactly why I chose Perl 6's versioning notation: every distribution's notation is different. It's set up this way so that it is easily human-readable but also allows for automation if needed. Changing the format into something the package manager understands is the job of whoever implements the automation for that package manager, or of the human reader who installs it by hand.
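To make the "translate per package manager" idea concrete, here is a small Python sketch that turns the neutral notation into a Debian-style dependency relationship and an rpm `Requires:` line. The function names are hypothetical, and only the two forms that appear in this thread (`*` and `N+`) plus an exact version are handled; anything else is rejected rather than guessed at.

```python
import re


def parse_requirement(version):
    """Split "4.0+" into (">=", "4.0"); "*" means any version (None)."""
    if version == "*":
        return None
    match = re.fullmatch(r"(\d+(?:\.\d+)*)(\+?)", version)
    if match is None:
        raise ValueError(f"unsupported version requirement: {version!r}")
    number, plus = match.groups()
    return (">=" if plus else "=", number)


def to_debian(package, version):
    """Render a Debian control-file style relationship."""
    req = parse_requirement(version)
    return package if req is None else f"{package} ({req[0]} {req[1]})"


def to_rpm(package, version):
    """Render an rpm .spec Requires: line."""
    req = parse_requirement(version)
    suffix = "" if req is None else f" {req[0]} {req[1]}"
    return f"Requires: {package}{suffix}"


if __name__ == "__main__":
    print(to_debian("curl", "4.0+"))
    print(to_rpm("curl", "4.0+"))
```

Whether `4.0` means the same thing to the distribution's package as it does to the META author is exactly the open question raised earlier in the thread; this sketch only converts syntax.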

While https://github.com/perl6/toolchain-bikeshed allows this checking in the buildphase, it doesn't add any standardization for such checks, which may result in many different implementations that do the same thing.

@niner

It's not required to show how to install a package, but the spec provides the option to show which package is required, so that installation may be automated. That would be nice for situations like #333, where I could then auto-install the dependencies required for a certain package.

While that may be true, not all package managers are "modern" in that respect: pkgng can't do any of those things, apt maybe. As I said, it's purely hinting, so that installation can be automated, or the package easily found, with non-"modern" package managers.

jonathanstowe commented 7 years ago

I'd expect platform-specific native installers to be provided as plugins which the Perl 6 package manager knows how to obtain and install. It's unlikely that any one author is going to have access to or expertise in every possible native installer for every platform, and most users aren't really going to need the majority of the possible installers. But this is probably a point for a separate discussion.

the-eater commented 7 years ago

@jonathanstowe so you're proposing keeping the lib-to-package translation table in a separate repo? If so, I'm all for that; e.g. I can easily generate the whole translation table for VoidLinux. That means the native-dependencies array can be reduced to

[
  { 
    "name": "curl",
    "bin": [ "curl" ],
    "lib": [ "libcurl" ],
    "version": "4.0+"
  }
]

And the translation tables may be used to retrieve additional info.
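A sketch of how such a separately maintained translation table might be consulted, in Python for illustration. The table layout, the `packages_for` helper, and the table contents are all hypothetical placeholders (only the openSUSE `libcurl4` name is taken from this thread); a real table would be generated per distribution as described above.

```python
# Hypothetical translation tables: distro -> kind -> neutral name -> package.
# The entries are illustrative placeholders, not authoritative package names.
TRANSLATION_TABLES = {
    "opensuse": {"bin": {"curl": "curl"}, "lib": {"libcurl": "libcurl4"}},
    "voidlinux": {"bin": {"curl": "curl"}, "lib": {"libcurl": "libcurl"}},
}


def packages_for(dep, distro, tables=TRANSLATION_TABLES):
    """Return (sorted package names, untranslatable entries) for one dependency."""
    table = tables.get(distro, {})
    found, unknown = set(), []
    for kind in ("bin", "lib"):
        for name in dep.get(kind, []):
            package = table.get(kind, {}).get(name)
            if package is None:
                unknown.append(f"{kind}:{name}")
            else:
                found.add(package)
    return sorted(found), unknown


if __name__ == "__main__":
    curl = {"name": "curl", "bin": ["curl"], "lib": ["libcurl"], "version": "4.0+"}
    print(packages_for(curl, "opensuse"))
    print(packages_for(curl, "some-unknown-distro"))
```

Entries that cannot be translated are reported rather than dropped, so an installer can still fall back to telling the user what is missing.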

Skarsnik commented 7 years ago

I already made a ticket on RT https://rt.perl.org/Ticket/Display.html?id=126744 (for tracking purposes) about this issue.

It's really something needed for modules that either bind a C lib, use an external tool, or use a module from another language. It can be purely informative for the basic perl6 tools (like zef/panda) but can serve more purposes for distributions.

samcv commented 7 years ago

@ugexe it was not intended that the package manager be checked to see if a package was installed; that was not the goal. As I think @the-eater somewhat stated, it was my intention for the package manager to check for 'lib' and 'bin' and, if they aren't there, perform some action, even one as simple as notifying the user.

The main info was the lib and bin. (and human readable "name" field)

This spec does not seek to implement the build tools or the installation process, but merely to help people and computers identify if the libraries and binary files are in the Path. If there is much bikeshedding about the tables other than the top level and support keys in the request (similar to what eater posted above me), then we can leave that part of the spec for later. That is fine with me. The other fields for distro and package manager etc were part of the RFC to allow for exactly this type of discussion which is good. It is always good to try and look forward a few more stages in the design process to try and find potential issues with the core ideas of a proposal/plan. Thank you everybody for weighing in so far.

jonathanstowe commented 7 years ago

@the-eater yeah, that looks like the kind of thing, as long as there is enough information there for something to be able to synthesize the platform-specific details (which should be left to the consumer/installer IMO).

On another point, if we're going to go somewhere with this, I'd rather change the top level attribute to something other than native-dependencies as that implies something about the dependency that may not strictly be native at all (such as a Python or Perl 5 module for instance.) Maybe external-dependencies or some such.

ugexe commented 7 years ago

This is better.

[
  { 
    "name": "curl",
    "bin": [ "curl" ],
    "lib": [ "libcurl" ],
    "version": "4.0+"
  }
]

For the purposes of creating a build graph though what I'd really like to see is it simplified to how we declare our perl6 dependencies. So (just spitballing) something like:

"some-spec-extension" : [
    "curl:ver<4.0+>"
]

and leave everything else to whatever code/plugin knows to look for the non-standard-spec some-spec-extension in the first place.

But how do we handle this/that? Windows/Linux? First, let's pretend the following (mentioned in S22 but not implemented anywhere) worked:

"depends" : [ ["Foo:ver<1>", "Bar:ver<3>"], "Baz" ]

We should then consider using this for our AND/OR selection with this extension. That doesn't solve windows/linux, but that isn't solved for regular depends either. So I would then suggest that we might need a way of declaring such basic conditionals, and again I'd hope to keep this as perl6 as possible. Let's consider basic conditionals by mapping stuff to $*VM.config, specifically $*VM.config<os>

"depends" : [ ["Foo:ver<1>:os<win32>", "Bar:ver<3>"], "Baz" ]

or, if we must

"depends" : [ ["Foo:ver<1>:os(/^win*/)", "Bar:ver<3>"], "Baz" ]

so coming back to the actual issue we might have:

"some-spec-extension" : [
    "perl:ver<5.20>:theoretical-unsafe-hint-extension<perl -Mv5.20 -e0>",
    [ "curl:ver<4.0+>:os<osx>", "wget:ver<*>:os<linux>", "powershell:ver<3+>:os<win32>" ]
]

and some plugin would understand this as dependencies on perl 5.20 and one of curl, wget, or powershell.
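How such a plugin might read the list can be sketched as follows, in Python for illustration only. `parse_spec` and `resolve` are hypothetical names, and the sketch assumes that dependency names contain no colon and that values contain no `>`. A plain string is treated as required and a nested list as "one of", with `:os<...>` filtering the candidates.

```python
import re

# Matches one :key<value> pair, e.g. ":ver<4.0+>".
_ADVERB = re.compile(r":([\w-]+)<([^>]*)>")


def parse_spec(spec):
    """Turn "curl:ver<4.0+>:os<osx>" into {"name": "curl", "ver": ..., "os": ...}."""
    fields = {"name": spec.split(":", 1)[0]}
    fields.update(_ADVERB.findall(spec))
    return fields


def resolve(entries, os_name):
    """Pick the dependencies that apply on `os_name`."""
    chosen = []
    for entry in entries:
        alternatives = [entry] if isinstance(entry, str) else entry
        usable = [
            fields for fields in map(parse_spec, alternatives)
            if fields.get("os", os_name) == os_name   # no :os<> means "any OS"
        ]
        if usable:
            chosen.append(usable[0])                  # first applicable alternative
        elif not isinstance(entry, str):
            raise LookupError(f"no alternative of {entry!r} applies to {os_name}")
    return chosen


if __name__ == "__main__":
    deps = [
        "perl:ver<5.20>:theoretical-unsafe-hint-extension<perl -Mv5.20 -e0>",
        ["curl:ver<4.0+>:os<osx>", "wget:ver<*>:os<linux>", "powershell:ver<3+>:os<win32>"],
    ]
    for dep in resolve(deps, "linux"):
        print(dep["name"], dep.get("ver"))
```

The hint value is carried along as plain data; nothing in the sketch executes it.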

My opinion on the primary purpose of META6.json is that it should provide all the necessary information to Get The Job Done (create a build graph), but it should be formatted with human read/write-ability in mind before machine (while still being JSON): we want these things to be easy to write and maintain for authors first and foremost. Expanding the META6.json format already has prior art in the core Distribution::* implementations (which add the files field, e.g. bin/ and resources/ stuff), and CompUnit::Repository::Installation (which adds the new/sha1 filename of a module in provides, so CURI knows what to uninstall, or tools can introspect stuff). Let something else turn it into some format that $whatever understands.

To summarize: keep it as simple as possible, use existing solutions to model any new solutions, and model any new solutions to also work for any current perl6 dependency shortcomings.

jonathanstowe commented 7 years ago

@ugexe I'm all for The Simplest Thing That Works :) I think there would need to be some additional stuff in there to distinguish between library and executable dependencies (probably extensible in some way), but consistency with the way normal dependencies work is a good thing.

samcv commented 7 years ago

I don't necessarily like this section that much:

"some-spec-extension" : [
    "perl:ver<5.20>:theoretical-unsafe-hint-extension<perl -Mv5.20 -e0>",
    [ "curl:ver<4.0+>:os<osx>", "wget:ver<*>:os<linux>", "powershell:ver<3+>:os<win32>" ]
]

So that might need more refinement. But these non Perl 6 dependencies are not Perl 6 dependencies. It seems a bit odd to declare them as such.

And this is assuming that everything is going to have some name you can give it, and without a mapping to lib or bin it is going to be useless when all that you want to do is check for lib and bin, and ensure that it can run. It seems a little bit overburdened putting all that information in one place.

In addition it makes it much harder to add additional keys in the future if we need them, since they would all be basically eradicated by simplifying them to some Perl 6ish string.

While personally I'm not in love with it, if you want to combine the "name" fields and "version" fields to create "curl:ver<4.0+>" that is fine. I intended the name field to be some arbitrary field (semi arbitrary), and giving it this information you want would be fine with me, as long as we keep the "bin" and "lib" fields. I'm not sure I like how the OS is shown there, but those are little details. I'm not against the "name" field being used as an identifier like that, which could be mapped to things outside the project. But it is my conviction that there does need to be fields for lib and bin which can be verified without any outside special knowledge of mappings or of packages.

[
  {
    "name": "curl:ver<4.0+>:os<win32>",
    "bin": [ "curl" ],
    "lib": [ "libcurl" ]
  }
]

ugexe commented 7 years ago

:<> is a key/value. The above can be written as curl:ver<4.0+>:os<win32>:bin<curl>:lib<libcurl> . There is no special knowledge needed other than :key<value>. Maybe those names don't fit nicely in that format, but they fit the format none-the-less. It round-trips.

Why treat them like perl6? 1) They are being used to create alternative dependency graphs for perl6 module installs. Other stuff might do that too, but hey they can expand those out if they want. They'll have to for the depends field strings anyway 2) Perl6's naming design considered how to pass in conditionals and other logic in a namespace request, giving a way to pass them in on a CLI 3) And obviously having a consistent way of declaring what a dependency is.
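The round-trip claim can be checked with a short Python sketch. `to_spec` and `from_spec` are hypothetical helpers, and every value is read back as a list so that repeated keys survive. The sketch only round-trips when names contain no colon and values contain no `>`, since no escaping rule has been agreed on in this thread.

```python
import re


def to_spec(record):
    """Serialise {"name": ..., key: value-or-list} as name:key<value>..."""
    parts = [record["name"]]
    for key, value in record.items():
        if key == "name":
            continue
        values = value if isinstance(value, list) else [value]
        parts.extend(f":{key}<{v}>" for v in values)
    return "".join(parts)


def from_spec(spec):
    """Parse name:key<value>... back into a record with list values."""
    record = {"name": spec.split(":", 1)[0]}
    for key, value in re.findall(r":([\w-]+)<([^>]*)>", spec):
        record.setdefault(key, []).append(value)
    return record


if __name__ == "__main__":
    structured = {"name": "curl", "ver": "4.0+", "os": "win32",
                  "bin": ["curl"], "lib": ["libcurl"]}
    flat = to_spec(structured)
    print(flat)
    print(from_spec(flat))
```

The same limitation is what the later question about colons in names is pointing at: the format round-trips only for values that avoid its own delimiters.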

jonathanstowe commented 7 years ago

Looking at it as an extension to the existing dependency specification, as above, does have other potential advantages. Firstly, there may actually be no need for a separate top-level attribute in the meta data if the revised specification can indicate that it is an external resource of some kind. Secondly, the specification might conceivably be used in places where a Perl 6 dependency specification can be used at run time.

niner commented 7 years ago

Are we sure that all possible external dependencies restrict themselves to what Perl 6 considers ok for a name? Is there no language out there that uses a single colon as a separator? Are we sure there never will be such a language?

While I certainly see the appeal of sticking with Perl 6 syntax for great consistency, I also wonder why we would want to introduce the need for parsing those value strings, when the whole META file is already in a very structured format that can be read easily by any language out there.

ugexe commented 7 years ago

We already went the path of simplicity over terse configuration with RESOURCES, wherein we use paths instead of the tree structure mentioned in s22. Additionally for things like paths we force forward slash format regardless of os.

The name doesn't matter, so don't put a colon in it. All that really matters is the :key<value>s, although the name can be used as the reference name later on (such as with resources/libraries/foo, or require Foo:file<foo/bar.pm6>). Plus why worry about how the colon affects the dependency reference name when you already have to use this syntax to load modules from other languages (which may normally allow colons in their name): use FooBar:from<Python>

And again, if the whole META file is already parsable then the strings in question must then be parsable already too. This is how you declare perl6 dependencies on specific versions.

shadowcat-mst commented 7 years ago

This is an awful idea, because we have no idea what we're doing.

Please look at the CPAN Sysdeps plugin first.

Then 'specify' it as an extension that nothing is expected to actually follow.

Then we try it for 2-3 years and see if it even works at all.

Then we can consider trying to standardise it.

(note: I have been thinking about how to do this for most of 10 years ... and I don't trust my ideas to be good enough and intend to put mine through the above 2-3 year process before I try and standardise them ... so I'm not being mean to you specifically, I don't trust 10 years of my own work to get this right either and am totally including me in "no idea what we're doing")

sjn commented 7 years ago

Juan Julián Merelo Guervós said:

Alien::Base https://github.com/Perl5-Alien/Alien-Base by @perl5-alien is the closest thing in the Perl5 realm, I guess.

Please, please, please, for the love of $deity, do NOT use Alien::* as an example for handling native dependencies! That strategy is a horrible way to do this, and only useful in deeply dysfunctional environments (e.g. companies where packagers or sysadmins are unwilling or unable to help resolve any native package deps).

shudder

sjn commented 7 years ago

Apologies for the late reply. :)

I think @shadowcat-mst is too pessimistic here. I think we can still have a meaningful discussion, and even figure out something useful. I've also thought quite a bit about this, so I'll offer my €0.02 on the matter. :)

  1. Firstly, let's make sure we're completely clear about what we need to solve. This is a problem related to information that has to be communicated between the author and the packager. This means we have one producer and one or more consumers of this information. (For simplicity, I treat the "packager" and "installer" roles as the same. The difference is only in the number of steps that have to be done in order to get some software correctly installed.)
  2. Since we have a producer -> data -> consumer flow, it's meaningful to treat this as a protocol issue. Let's give this protocol a name! I'd like to suggest "the Deploy Anything Protocol (DAP)".
  3. A good protocol needs a few basic features in order to evolve into something useful:
     a. A versioning scheme for the protocol itself.
     b. A clearly defined policy for introducing, renaming and removing protocol features and keywords.
     c. Clearly defined expectations about how to handle unknown fields in the protocol (e.g. ignore), deprecated fields (e.g. warn), required fields (e.g. warn+bail out), etc.
     d. A clear way to classify different field types in the protocol (e.g. "library(openssl)" or "executable(ps)").
     e. Rules about which characters are legal in the protocol, and how to handle escaping of special characters (e.g. good: "library(name-with-parenthesis))"; bad: "library(no-slash-in-name/)").
     f. A way to specify important related metadata for each dependency (e.g. required release-version, authority, protocol-version, etc.).
     g. A clear naming policy for field data. E.g. "Only use the lowercased packaging name of the upstream project, without any filename-related prefixes or suffixes" (e.g. good: "library(openssl)"; bad: "library(libopenssl.so)"). Leave it to the consumer to figure out what the supplied name maps to in the native system. Leave it to the consumer to ask for decision help from the packager if there is confusion.
     h. A clear way to specify if a native dependency is optional or not (e.g. for testing, package build, etc.).
     i. NO CLEVERNESS. Don't allow a complex versioning DSL or any other clever logic into the protocol. No code, nothing executable. Keep the protocol simple, and put the cleverness at the edges (in the producers or consumers).
  4. Clearly state what is expected from the producer:
     a. Offer as much as necessary for the consumer to be able to figure out what the native dependencies are, but nothing more.
     b. The goal is to make it easy to get working software installed, and to make it easy to have ALL dependency requirements met.
  5. Clearly state what is expected from the consumer:
     a. Do whatever is necessary to figure out what's needed using only the data that is supplied.
     b. Specify what behaviour is expected if a field or some related metadata is missing.
     c. If a native dependency is successfully identified, do what's necessary to install it.
     d. If a native dependency is not available, then tell the packager about this.
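A consumer following points 3c to 3e might look roughly like the Python sketch below. Everything concrete in it is an assumption made for illustration: the document shape (`protocol-version`, `requires`), the deprecated `binary` type, and the exact set of legal characters are not defined anywhere in this thread. Unknown field types are ignored, deprecated ones warn, a missing required field aborts, and nothing in the data is ever executed.

```python
import re
import warnings

KNOWN_TYPES = {"library", "executable"}   # the two types used as examples above
DEPRECATED_TYPES = {"binary"}             # hypothetical deprecated alias
# type(name): lowercase type, lowercase name without slashes or parentheses.
_ENTRY = re.compile(r"([a-z][a-z-]*)\(([a-z0-9][a-z0-9._+-]*)\)")


class MissingRequiredField(Exception):
    """Raised when the document lacks a field the protocol requires."""


def read_requirements(document):
    """Return the (type, name) pairs a consumer should act on."""
    if "protocol-version" not in document:        # required field: bail out
        raise MissingRequiredField("protocol-version")
    accepted = []
    for entry in document.get("requires", []):
        match = _ENTRY.fullmatch(entry)
        if match is None:                         # illegal characters or shape
            raise ValueError(f"malformed entry: {entry!r}")
        kind, name = match.groups()
        if kind in DEPRECATED_TYPES:              # deprecated: warn, still accept
            warnings.warn(f"field type {kind!r} is deprecated", DeprecationWarning)
        elif kind not in KNOWN_TYPES:             # unknown: ignore
            continue
        accepted.append((kind, name))
    return accepted


if __name__ == "__main__":
    doc = {"protocol-version": 1,
           "requires": ["library(openssl)", "executable(ps)", "gadget(foo)"]}
    print(read_requirements(doc))
```

Keeping the entry grammar to a single regular expression is one way to honour the "no cleverness" rule: version logic and name mapping stay in the consumer.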

HTH :)

JJ commented 6 years ago

After the bitrot squashathon, where most of the problems were caused by not having a standard way of specifying dependencies, can we please come back here and decide on something? Or decide that we are not going to decide?

AlexDaniel commented 6 years ago

https://deathbyperl6.com/perl-toolchain-summit-2018-meta6-dependency-hello/

JJ commented 6 years ago

It might be that we don't really need this. zef build calls Build.pm and that can do all kinds of stuff.

JJ commented 6 years ago

So this is fixed?

Skarsnik commented 6 years ago

Build.pm is just a workaround for a common need: needing an external library (modules for Python...) to run the tests of a module. There is a part of the spec for it, but it's never used because it's probably not even implemented in zef anyway. It does not need to do clever stuff in zef; it could just be informative at first.

niner commented 6 years ago

http://design.perl6.org/S22.html#depends is an up-to-date spec of how we want to handle native dependencies. Important parts of it are already implemented in zef and there's already a real life example making use of this functionality in https://github.com/niner/Inline-Python/blob/master/META6.json The missing bit is support for dependency hints, i.e. code in zef that will actually download those DLL files specified by hints like:

{"name": "archive:from<native>", "hints": {
    "url": "http://www.p6c.org/~jnthn/libarchive/libarchive.dll",
    "checksum": {"sha-256": "E6836E32802555593AEDAFE1CC00752CBDA"},
    "target": "resources/libraries/"
}}

Note that this code would already be passed the collapsed meta data, where differences between operating systems, etc. are already handled. It should be really straight forward.
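The checksum part of that missing bit could look something like the Python sketch below. It is an illustration only, not zef's code: `verify_hint` is a hypothetical helper, the download step is left out, and algorithm names such as "sha-256" are assumed to map onto `hashlib` names by dropping the hyphen. The checksum shown in the example above is shorter than a full SHA-256 digest, so it is treated here as a placeholder.

```python
import hashlib


def verify_hint(payload, hints):
    """Check downloaded bytes against the "checksum" table of a hints entry."""
    checksums = hints.get("checksum", {})
    if not checksums:
        return False                      # refuse to trust unverifiable downloads
    for algorithm, expected in checksums.items():
        # Assumption: "sha-256" in the META data corresponds to hashlib's "sha256".
        digest = hashlib.new(algorithm.replace("-", ""), payload).hexdigest()
        if digest != expected.lower():
            return False
    return True


if __name__ == "__main__":
    payload = b"pretend these are the bytes of libarchive.dll"
    hints = {
        "url": "http://www.p6c.org/~jnthn/libarchive/libarchive.dll",
        "checksum": {"sha-256": hashlib.sha256(payload).hexdigest().upper()},
        "target": "resources/libraries/",
    }
    print(verify_hint(payload, hints))
```

Only after the check passes would a tool copy the file into the `target` directory named by the hint.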

niner commented 4 years ago

I'm closing this issue since, as I mentioned 2 years ago, we actually included the required functionality in the spec (including dependency hints), have an implementation in zef, and even have the working translator from META6.json to rpm .spec files I wanted for packaging on the OBS. The most important next step is getting module authors to actually state their native dependencies in META6.json files, and we're gathering some steam on that right now :)