Raku / problem-solving

🦋 Problem Solving, a repo for handling problems that require review, deliberation and possibly debate
Artistic License 2.0

Test Raku modules against different OS, Rakudo versions #144

Closed melezhik closed 4 years ago

melezhik commented 4 years ago

Run Raku module installations/tests in Docker containers to check module health against different rakudos/OSes/environments.

Things that module authors might want to check. Things that Rakudo release engineers might want to check.

melezhik commented 4 years ago

The issue could be solved through https://github.com/melezhik/RakuDist - a tool that launches Raku module tests in docker containers. RakuDist consumes Sparrow6 DSL scenarios, allowing module authors to describe required environments and testing scenarios in a high-level Raku DSL. Authors can reuse a lot of existing functions and plugins.

Altai-man commented 4 years ago

First and foremost, how does RakuDist relate to Blin project?

AlexDaniel commented 4 years ago

@Altai-man it's a good point. Blin is designed to test and bisect modules on different rakudo versions, whereas RakuDist is designed to test each module on a variety of platforms/configurations. There is definitely some overlap in functionality, but I'm not sure what it means. Should RakuDist be extended to allow bisecting, or should Blin be extended to be able to run things in docker, or should both tools continue to live their own lives?

melezhik commented 4 years ago

After a brief reading of https://github.com/perl6/Blin I could say:

( I might be wrong, I would ask @AlexDaniel as he might know better )

RakuDist:

AlexDaniel commented 4 years ago

Blin does not allow choosing different docker images, so it does not allow testing against different OSes

Currently, yes, that's correct.

Blin does not allow module authors to create testing scenarios

Technically you can use Blin with arbitrary scripts, so if you really have to do something like this, it's possible. But Blin is more like a local version of Bisectable: you use it if you want to see whether there's a regression, or to find when a regression happened. At the same time, I see RakuDist as more like a CI service.

melezhik commented 4 years ago

The main difference is flexibility: module authors can create their own scenarios. For example, if you need to test Cro from the git dev branch on both Alpine and Debian:

(https://github.com/melezhik/RakuDist/blob/master/modules/cro/)

cat sparrowfile:

my $user = config()<user>;

my $directory = "/data/test/{$user}";

my $scm = config()<scm>;

user $user;

directory $directory, %( 
  owner => $user,
  group => $user
);

git-scm $scm, %( 
  to => $directory, 
  user => $user,
);

bash "cd {$directory} && git log --name-status HEAD^..HEAD", %(
  description => "last commit"
);

bash "cd {$directory} && ls -l";

zef "Test::META";

if os() eq 'alpine' {

  # this is needed for alpine rakudo installation
  unless "/bin/zef".IO ~~ :e {
    copy "/opt/rakudo-pkg/share/perl6/core/bin/zef", "/bin/zef"
  }

  # this is needed for alpine rakudo installation
  unless "/bin/perl6".IO ~~ :e {
    copy "/opt/rakudo-pkg/bin/perl6", "/bin/perl6"
  }

}

package-install %(
  alpine => "make gcc libc-dev openssl-dev",
  debian => "libssl-dev"
);

zef $directory, %( 
  force => False,
  depsonly => True, 
  notest => True, 
  user => $user 
);

bash "cd {$directory} && zef test . --verbose", %(
  description => "zef test",
  user => $user
);

cat config.pl6:

%(
  user => "cro",
  scm =>  "https://github.com/croservices/cro.git"
)
melezhik commented 4 years ago

Further take on possible design/scaling issues:

Thoughts on RakuDist scaling ( copied from email thread )

@melezhik :

Hi Alex!

So far we have everything to run Raku module installations/tests in docker containers to check module health against different rakudos/OSes. The issue is scaling.

I am torn between 2 choices.

  1. Run tests on a centralized cluster, which is expensive. A rough estimation: if we have around 1000 modules, 3 rakudo versions and 3 OS distros, and if an average test for one module takes roughly 3 minutes and we run tests sequentially, then we can run 20 tests per hour on one node, roughly 500 tests per 24 hours on one node. So we end up requiring roughly a 20-node cluster to cover 9000 tests per 24 hours. I don't have a spare 20-node cluster, do you? ((-; Also, tests could be network-expensive, assuming we might want "from scratch" tests pulling docker images and other dependencies (external libraries, Raku modules) over the internet every time. All this adds to the costs of whatever cloud provider we might choose.
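For reference, the back-of-the-envelope estimate above can be reproduced with a few lines of Raku. The input numbers are the thread's assumptions (1000 modules, 3 rakudos, 3 distros, 3 minutes per sequential test), not measurements:

```raku
# Capacity estimate using the numbers assumed in this thread
my $modules          = 1000;  # modules in the ecosystem
my $rakudo-versions  = 3;
my $os-distros       = 3;
my $minutes-per-test = 3;     # average sequential test time

my $total-tests   = $modules * $rakudo-versions * $os-distros;  # 9000 combinations
my $tests-per-day = (60 div $minutes-per-test) * 24;            # 480 tests/day per node
my $nodes-needed  = ceiling $total-tests / $tests-per-day;      # ~19 nodes

say "$total-tests tests per day would need about $nodes-needed nodes";
```

This matches the "roughly 20 nodes" figure; the later replies argue the real load is far smaller, since unchanged modules don't need retesting.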

  2. That being said, running tests on users' machines is an alternative. Users might run tests on their own machines (e.g. run docker locally, then run sparrowdo in a docker container to install rakudo, install a module, and run the module's tests - we could even create a CLI tool to encapsulate all this logic so that users could use it instead of running low-level commands directly) and report results back to a centralized API. The load would be spread out. Users could also choose which modules/rakudos/OSes they want to run tests for. Presumably module authors would want to run tests for their own modules first. The challenge with this approach, however, is that there is no guarantee that users will be persistent enough to run tests often enough to cover existing modules and keep up with new rakudo releases.

What are your thoughts on that?

Best

Alexey


@AlexDaniel :

It's a good summary, thanks! I think we should probably start a problem-solving issue on this, where you should first define the problem (that for module authors it's hard to know if their modules work on different operating systems and rakudo versions), and then describe your created solution as something that can resolve the problem. There I think @rba and others will be able provide required feedback.

As for number 1, please don't overestimate the amount of computing power required, it is actually extremely small. I expect it to continue being so for the years to come. I used to run Blin on a google compute instance with 24 cores and it used to take just 3-4 hours (that was a few $ per month, it wasn't too horrible). Blin tests every single module at least once (more than once if it fails any tests). From what I last heard from @rba and sena_kun, they now have a machine dedicated for that, and they're able to run Blin twice per day. That's not extremely fast but it's not bad either. I think the biggest miscalculation in your estimation here is that most modules are not being actively developed, meaning that once you tested them you will no longer have to test them again. I know it may sound a bit surprising and counter-intuitive at first, but once we optimize/cache everything the amount of cpu time required will be ridiculously small (just like Bisectable has over 9000 rakudo builds stored in just 31 GiB, and is able to bisect any code in just a matter of seconds). Anyway, I think number 1 is the way to go.

Ah, also, I think number 2 should be possible too. I have made a few of centralized tools (Blin, bisectable, etc.) in the past and people always complain that it's hard to run them locally. :(

ugexe commented 4 years ago

A couple of things:

1) Docker is less than ideal. Ideally tests happen with the same syscalls that would occur when running normally on the host OS, not just translating them.

2) Covering a significant number of build combinations will be nigh-impossible. There are so many variations in just native dependencies that providing a wide range of environments would be a difficult task.

3) We used to have a distributed rough draft of this in http://testers.perl6.org -- zef even reported here if prompted (and indeed used to work with cpantesters although I dunno if its UI showed those results). This avoids having to do the work of setting up environments from an architectural perspective and lets the users provide the relevant data of whats actually being used.

4) The information from the p6c ecosystem (i.e. against modules pointing at HEAD) would be mostly useless beyond providing some facade of reliability testing to the outside world.

melezhik commented 4 years ago

@AlexDaniel I am attracted by idea number 2 of running stuff on users' machines, so we don't need to bother with infrastructure much; we only need to create a simple reporting API that accepts test statuses and reports from all the users' machines and exposes the results in a nice, searchable interface.

But it demands more of users and the community. Will people run those tests proactively and often enough that the whole idea makes sense?

melezhik commented 4 years ago

Covering a significant number of build combinations will be nigh-impossible. There are so many variations in just native dependencies that providing a wide range of environments would be a difficult task.

We used to have a distributed rough draft of this in http://testers.perl6.org -- zef even reported here if prompted (and indeed used to work with cpantesters although I dunno if its UI showed those results). This avoids having to do the work of setting up environments from an architectural perspective and lets the users provide the relevant data of whats actually being used.

fair point @ugexe

We still have the choice to run sparrowdo on bare machines over ssh or localhost, and then report back to a centralized API.

zef even reported here if prompted

That's true, but my thought is that I don't want to limit people to just unit tests. Also, a zef report probably won't show all the infrastructure/environment/OS information (or at least such information would be terse - please correct me here, I don't know), while a Sparrow6 scenario has infrastructure as code that specifies the desired state - like the names/versions of various libraries. Sparrowdo reports will also contain plenty of such logs: OS name, library versions, and so on.

Again, the idea is to create testing scenarios and run them wherever you want; the targets can differ. Some people might want to fix bugs / anticipate regressions for OSes/rakudos they don't have at hand...

Maybe, as @AlexDaniel said, it's "more like a CI service."

ugexe commented 4 years ago

zef can report any information you want -- https://github.com/ugexe/zef/blob/master/lib/Zef/Service/P6CReporter.pm6#L13-L60 -- just code whatever logic into whatever reporting plugin.

If you want a CI service then just use one of the many backed by large funding and workers, or create a tool to make it easier for people to do this. All we are really interested in are the results. There really isn't much of a reason to have some specialized language testing service in 2020.

melezhik commented 4 years ago

zef can report any information you want --

I meant more than that - things like external library versions and so on - but thank you for that.

If you want a CI service then just use one of the many backed by large funding and workers, or create a tool to make it easier for people to do this.

It's not just about a CI server as a tool to run jobs; it's more about having an easy way to write some integration tests, run them, and get results. IMHO the Sparrow6 DSL could help with that.

ugexe commented 4 years ago

it's more about having an easy way to write some integration tests, run them, and get results

Sure, but existing CI tools aren't complex because they simply didn't want to be easy -- they are the way they are because it's necessary for the complexity of the feature-set being requested. If sparrow dsl can help write those tests easier for raku users on existing CI platforms then that would be the most pragmatic solution.

melezhik commented 4 years ago

If sparrow dsl can help write those tests easier for raku users on existing CI platforms then that would be the most pragmatic solution.

Yeah, it boils down to two questions:

On the second question, I would still start with something simple like docker/sparrowdo, see if it gets practical results, and then later we could switch to existing CI tools if we see the necessity.

As I see it, the main goal here is to have useful results - people see bugs on specific OSes, rakudos, and environments, and they are happy with how easily and simply they can get such results.

niner commented 4 years ago

What exactly would sparrowdo's job be in this? I thought this is just about testing modules which presumably bring their own tests already?

melezhik commented 4 years ago

Sparrow6 brings infrastructure as code; it's not enough to just run unit tests, as a lot of preparation work sometimes needs to be done first:

Thus Sparrow6 gives you a DSL and building blocks to automate testing-environment preparation. See the examples I've already mentioned.

melezhik commented 4 years ago

It's also worth mentioning that it's dead easy to extend Sparrow6 (create Sparrow6 plugins or Sparrow6 tasks) to let users meet their specific needs. Some modules might require complex setup.

niner commented 4 years ago
  • install external libraries ( libssl for Cro, libsqlite for Red, and so on )

This should be part of the module installation. If we finally automate installation of external dependencies, we really should not restrict this to some testing service.

melezhik commented 4 years ago

This should be part of the module installation.

It's a part of configuration management. IMHO I doubt it should be done directly during module install. If we talk about zef, it just needs to notify us that some dependencies/libraries are missing, and indeed it does. It works the same way for all the other module managers I am aware of (pip, cpan, rubygems).

Long story short - configuration management is a challenging task and should be done by a dedicated tool, not by module managers.

we really should not restrict this to some testing service.

Sparrow6 is a tool, not a testing service. It provides the capability to write "deploy/configuration/test" scenarios as code, and you are free to use these scenarios for various purposes - testing, CI, whatever.

niner commented 4 years ago

On Freitag, 3. Jänner 2020 22:27:27 CET Alexey Melezhik wrote:

if we talk about zef, it just needs to notify us that some dependencies/libraries are missing, and indeed it does.

How? Do people actually put their external dependencies into META6.json files? That would be a huge win already.

It works that way for all other module managers (pip, cpan, rubygems)

The whole point of Raku is to do better than other languages. Otherwise, we wouldn't have had to start the whole business.

melezhik commented 4 years ago

How? Do people actually put their external dependencies into META6.json files? That would be a huge win already.

I meant indirectly. AFAIK zef does not handle external library dependencies, at least not yet (cc @ugexe), but right now if one, say, has a dependency on libssl and it's not installed, they are going to get an error produced by make or whatever during the installation phase.

The whole point of Raku is to do better than other languages. Otherwise, we wouldn't have had to start the whole business.

I don't think it's a good idea to manage all the complexity of configurations/environments inside module managers; IMHO it should be part of a separate tool (a CM tool, actually), and Sparrow6 is one of them.

ugexe commented 4 years ago

I meant indirectly. AFAIK zef does not handle external library dependencies

You can declare native and executable dependencies, ala curl:from&lt;bin&gt; and curl:from&lt;native&gt;. The former does a naive search on PATH, whereas the latter does the same name-mangling as %?RESOURCES to detect whether a native dependency is installed (i.e. it'll check for libcurl.so or curl.dll, the same as $*VM.platform-library-name).

However -- currently this is only used by zef to give better errors to users when they are missing a dependency, and to avoid downloading / building a bunch of modules if we know beforehand it won't work. It does not install native dependencies, although that is entirely doable via a plugin (I wrote a naive fork/demo that installs :from&lt;Perl5&gt; dependencies via cpanm, for instance).
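For illustration, declaring such dependencies in a distribution's META6.json might look like the sketch below. The module name and version are hypothetical; the `curl:from<bin>` / `curl:from<native>` entries use the syntax described above:

```json
{
    "name"    : "My::Curl::Wrapper",
    "version" : "0.0.1",
    "depends" : [
        "curl:from<bin>",
        "curl:from<native>"
    ]
}
```

With this in place, zef can warn up front when the `curl` binary or libcurl native library is missing, instead of failing mid-build.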

melezhik commented 4 years ago

Hi @ugexe . cool.

Irrespective of zef's capabilities, the point I am trying to make is that we need an integration layer. Tests are not only "zef test"; that's just part of the story. Sparrow6 and zef could be good partners. (((-: The advantage of Sparrow6 I am trying to advertise here is that it has a batteries-included, high-level DSL for such tasks. Again, the code examples mentioned here are quite self-employed.

melezhik commented 4 years ago

Sorry for the typo in the last comment. I meant self-explanatory ((((-:

niner commented 4 years ago

I think our best bet is to resist the temptation to build something ourselves and instead use something created and maintained by others. The Open Build Service has already proven to be an invaluable tool - not just for building OpenSUSE packages of MoarVM, rakudo and many modules, but also for pointing out issues and bugs on e.g. 32-bit machines.

Whenever one uploads a new version of a package, the Open Build Service rebuilds all dependent packages. The rebuild can include running tests, so any incompatibilities will show quickly. It even points out if there are any differences in the built files on a rebuild. Every build runs in a completely fresh VM with all dependencies installed. In contrast to e.g. Travis it doesn't clone git repositories all the time, using only local resources instead, so builds are reliable.

Despite its origins, the Open Build Service supports many distributions, from Arch, to Fedora, to Raspbian, to Ubuntu on i586, x86_64, ARM, PowerPC and even RISC-V. The installation on build.opensuse.org which is free to use currently runs over 1200 build hosts. Everything's free software and several large projects or organizations run their own installations.

There's a web interface like https://build.opensuse.org/repositories/devel:languages:perl6 but also a command line interface and API, so automatically pushing new ecosystem uploads to the build service is easily automatable.

As a side effect, we'd even get easily installable packages for all the distributions we test on, plus AppImages.

ugexe commented 4 years ago

I don't disagree, but that's a bit apples and oranges unless OBS provides Windows, BSD, etc. platforms.

melezhik commented 4 years ago

OBS is for building native packages. The issue being solved here is testing pure installation through zef, not building native packages.

melezhik commented 4 years ago

More thoughts on OBS:

Also, you end up being a build engineer who has to understand all the underlying formats for the various platforms (debian, centos, rpm, deb, whatever), while with Sparrow6 you just create a high-level scenario - much easier!

package-install %(
  alpine => "make gcc libc-dev openssl-dev",
  debian => "libssl-dev"
);
niner commented 4 years ago

The OBS doesn't support Windows yet, but at least there's a draft on how to get there: https://en.opensuse.org/openSUSE:Build_Service_Concept_Windows_Support

Btw. the backend is written in Perl: https://github.com/openSUSE/open-build-service/tree/master/src/backend

Compared to building all of this ourselves, helping to get platform support up sounds much easier.

Yes, to build on the OBS, we'd have to provide native dependency (and build) information in platform-dependent formats like RPM or deb. That, however, is easy to do once we have the information in the module dist's META6.json files. All we have to do then is write a set of scripts to convert this information, like I did for RPM in https://github.com/niner/meta2rpm

All the spec files for the already existing module packages in https://build.opensuse.org/project/show/devel:languages:perl6 were created using those scripts.

The hard part is always going to be to get the information into those META6.json files in the first place, regardless of how we process them later. And that step is part of every solution we come up with. But this is time well spent, since once the information is there, we can use it for testing, packaging and installing.

The advantage is that module authors do not have to become build engineers and will have to understand neither formats for any platforms, nor Sparrow6 or any other configuration management system. They just have to understand how to record native dependencies in their META6.json files and all we need there is information they already need for writing their code in the first place.

In your Sparrow6 example, as a module author I'd have to know that there's an openssl-dev package on Alpine and that it's called libssl-dev on Debian. That's precisely the kind of platform-dependent detail that we don't want our module authors to need. Every attempt to do so in Perl has failed. Do we really expect authors to provide this just for tests, when it will not even help users with installation?

The OBS' platform support is indeed still limited to just 56 distributions and versions on 6 different processor platforms. Our own solution supports 0. How long do you think it will take us to even catch up? How much effort will it require? And how many more platforms would the OBS support if that effort were spent there? They have 17 developers working just on that service. I would love to have 17 developers working regularly on the whole of rakudo. Our whole infrastructure team is.... Roman Bauer.

Will the OBS provide everything we can dream of out of the box? No, of course not. But it will give us a hell of a head start.

AlexDaniel commented 4 years ago

@niner would we need to run it ourselves or is there any other option?

The advantage is that module authors do not have to become build engineers and will have to understand neither formats for any platforms, nor Sparrow6 or any other configuration management system

Yep. That sounds good.

niner commented 4 years ago

@niner would we need to run it ourselves or is there any other option?

Depends on what you mean by "it". The OBS? No - I'd propose to at least start with the public instance on https://build.opensuse.org to make use of the maintenance and the > 1200 servers. Running our own could be an option somewhere far down the road.

We'd have to run the scripts that generate the package definitions ourselves. That would basically be a cron job with negligible resource usage.

melezhik commented 4 years ago

Hi @niner, thank you for your comments.

In your Sparrow6 example, as a module author I'd have to know that there's an openssl-dev package on Alpine and that it's called libssl-dev on Debian.

You'd have to do the same in your supporting scripts.

IMHO the advantage of the Sparrow approach is that it's visible and flexible. Raku module authors can change sparrowfiles at any time, and can also choose to support only the platforms they want (some distros don't have proper packages, and so on). When you encapsulate this implementation somewhere else, people become much more reliant on OBS maintainers or whoever supports such a service.

You say module authors don't have to be build engineers if they just declare their dependencies in META6.json. Yes, you're right, but someone else still has to be a build engineer, and that will be the people supporting the "Raku modules and OBS integration". You just move the complexity from one place to another.

With all respect, the main challenge with your approach remains the same for me.

I don't want to maintain distribution builds for a variety of platforms just for the sake of testing those modules. It's overkill and unnecessary effort.

Also, it seems to me you overestimate the learning curve of Sparrow. It's really simple, and it works right now - https://github.com/titsuki/raku-Chart-Gnuplot/issues/40 https://github.com/bbkr/HomoGlypher/issues/2#event-2926843296 https://github.com/FCO/Red/issues/421#issuecomment-572219687

On average it takes me 1 to 3 minutes to write a testing scenario for every Raku module I've seen so far. It won't be a big problem for Raku developers to do the same; please read this simple post with examples.

It's simple, it's resource-efficient, it does not require an external service, it does not require changing the existing ecosystem (tweaking META6.json files or whatever), and it works now.

FCO commented 4 years ago

Would any of these solutions make it possible, for example, to test Red by running several different databases to test each driver?

melezhik commented 4 years ago

Good question, @FCO. It's very easy with Sparrow, as it is designed for such tasks. I may drop a few examples into RakuDist/modules/red

nxadm commented 4 years ago

About Blin: There is no technical limitation to only run it in a Debian container (that was requested).

AlexDaniel commented 4 years ago

About Blin: There is no technical limitation to only run it in a Debian container (that was requested).

Theoretically that's correct, but currently it's not possible. It's using whateverable builds (to avoid spending time building rakudo when bisecting), and those are only available for a single platform. It's fixable and I've been planning to do it for a long time, but it's not there yet.

melezhik commented 4 years ago

@FCO I have created an example of running Red tests with Pg. Right now the tests fail and I'm not sure why; my guess is that Red does not recreate the database for every test, but we can discuss that in the appropriate topic. I am choosing this one just because you asked, and it's a good example of what Sparrow can do...

melezhik commented 4 years ago

We could describe non-Raku dependencies in this format, pack them into a Raku module distribution, and extract them using the zef-fetch sparrow plugin. So a Raku distribution could have a .rakudist/sparrowfile describing all the extra installation logic (external packages, services, and so on).

For example this approach works for both CPAN/GitHub based distributions:

Raku module

cat .rakudist/sparrowfile

package-install ('make', 'sqlite-dev');

Installation scenario

my $module = 'Kind';
my %state = task-run "zef fetch {$module}", 'zef-fetch', %(
  identity => $module
);

if "{%state<directory>}/.rakudist/sparrowfile".IO ~~ :f {
  # custom installation logic, for example external libraries
  EVALFILE "{%state<directory>}/.rakudist/sparrowfile";
}

# install/test Raku module
zef %state<directory>;

This is more or less how RakuDist works now.

melezhik commented 4 years ago

As RakuDist has now been launched on community infra, I'd suggest closing the issue.

lizmat commented 4 years ago

Thank you for making all of this possible. Closing now.