icfpcontest2020 / dockerfiles

Is single source constraint hard? #11

Open autotaker opened 4 years ago

autotaker commented 4 years ago

I looked at the Dockerfiles in this repository and found that only a single source file is copied into the container image.

The current submission system favors languages that can pack multiple sources into a single source file (for example, JavaScript).

Is the single source constraint hard? If so, is there any restriction on the size of the main source?

autotaker commented 4 years ago

If there is no size restriction, players may compile their sources (written in any language) to WebAssembly and include the result in main.js.

beevee commented 4 years ago

No, the single source constraint is not hard.

Due to my lack of experience with the majority of requested languages, I was able to write only the most basic Hello world example. You are free to send a PR that lifts the single file constraint for your favorite language.

beevee commented 4 years ago

There is no hard size restriction on the repository right now.

As expected, if a team's submission becomes too much of a challenge for our build system (hours of build time, gigabytes of binaries, etc.) we reserve the right to enforce some sensible size restriction before or during the contest.

autotaker commented 4 years ago

You are free to send a PR that lifts the single file constraint for your favorite language.

This is very good news. Thank you.

Still, I'm afraid there will be a "war" over the choice of build tools and libraries. There are a number of build toolchains for each language, and which tools and libraries to use is a matter of each team's preference.

Instead of submitting PRs, how about each team publishing its preferred Docker image (for building its source code) to a public repository (for example, Docker Hub)?

beevee commented 4 years ago

If there is any disagreement between teams in terms of build tools, we can always make a "fork" (e.g. two "languages" called haskell-stack and haskell-cabal).

We considered letting teams use any Docker images they like, but decided against it mostly because of submission system stability, performance and security. We need to keep a limited number of small base images to benefit from caching. We need to have control over these images to prevent frequent changes that break things during the contest.

autotaker commented 4 years ago

If there is any disagreement between teams in terms of build tools, we can always make a "fork" (e.g. two "languages" called haskell-stack and haskell-cabal).

How about libraries? Each team will use a different set of libraries, and including them all will cause dependency hell.

I expect that eventually each team will create its own fork of the build base image. Then your requirement:

We need to keep a limited number of small base images to benefit from caching

is infeasible.

beevee commented 4 years ago

Each team will use a different set of libraries

Knowing the nature of the task, I really don't expect that.

Adding libraries to base image is reserved for languages with a very limited standard library. Any specific dependency you should just embed into your source code.

autotaker commented 4 years ago

Knowing the nature of the task, I really don't expect that.

Okay, I believe you.

Adding libraries to base image is reserved for languages with a very limited standard library. Any specific dependency you should just embed into your source code.

What do you mean by "embed into your source code"? Is it okay to embed third-party library artifacts (such as jar file or tar file) in our repository?

beevee commented 4 years ago

Is it okay to embed third-party library artifacts (such as jar file or tar file) in our repository?

Yes, it is okay to put anything you like in your repository.

As long as you don't break any licenses and your submission build time is under 10 minutes, including shallow repository cloning time. So, please don't put Blu-ray disk images in your repo, but otherwise you can submit anything you want.

You can probably even take the base Python image and write a simple script that launches an included precompiled binary.
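A minimal sketch of such a launcher script, assuming the precompiled binary is committed somewhere inside the repository. The function name and the exit code used for a missing binary are illustrative choices, not anything the submission system prescribes:

```python
import os
import subprocess
import sys


def launch_binary(binary_path, args):
    """Run a precompiled binary, forwarding args, and return its exit code."""
    if not os.path.isfile(binary_path):
        print(f"binary not found: {binary_path}", file=sys.stderr)
        return 127  # assumption: mimic the shell's "command not found" code
    if not os.access(binary_path, os.X_OK):
        # The executable bit can be lost on some checkouts; restore it.
        os.chmod(binary_path, 0o755)
    # The child inherits stdin/stdout/stderr, so its output passes through.
    return subprocess.call([binary_path, *args])
```

A main.py built on this would pass the path of the committed binary (for instance a hypothetical bin/solver next to the script) together with `sys.argv[1:]`, and hand the returned code to `sys.exit` so that the wrapper's exit status mirrors the binary's.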

autotaker commented 4 years ago

You can probably even take the base Python image and write a simple script that launches an included precompiled binary.

That is great news! My concern has now disappeared. I'm really looking forward to the contest! Thank you for the information.

last-g commented 4 years ago

This literally means that nothing except Python/bash is usable and the build step is rather a fiction. It's not possible to use the C++ platform because:

  1. you are limited to header-only libraries
  2. compiler flags are necessary for any compilation and there is no universal set of flags that will fit every team

I really appreciate that you tried to make it easy for beginners, and it is easy in the very basic case, but unfortunately this setup is not friendly to anything more complicated than "hello world".

For example, you mentioned that I'll have to communicate over HTTP. In Python, an HTTP client is present in the standard library, but it's quite low-level and not easy to use. This means I'd want to use the requests library, but that requires some non-trivial machinery just to set up the environment. Another team may decide to use aiohttp, so you cannot just add requests to the main Dockerfile for everyone: you would need to add every possible library that any team might want to use.

Also, a kind reminder that Git by its nature is not good at storing and manipulating multi-megabyte blobs.

Why not just allow teams to submit Docker images and/or use custom Dockerfiles?
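For reference, a standard-library-only POST looks roughly like this. The sketch assumes a plain-text request body and uses hypothetical helper names; the contest's actual endpoint and payload format are not known from this thread:

```python
import urllib.request


def build_post(url, body):
    """Build a POST request carrying a plain-text body."""
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def post_text(url, body, timeout=10.0):
    """Send the request and return (status code, decoded response body).

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so the
    caller has to handle that exception separately.
    """
    with urllib.request.urlopen(build_post(url, body), timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")
```

Manual encoding, explicit headers and exception-based error handling are the kind of detail that libraries such as requests hide.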

cc: @beevee

last-g commented 4 years ago

We considered letting teams use any Docker images they like, but decided against it mostly because of submission system stability, performance and security. We need to keep a limited number of small base images to benefit from caching. We need to have control over these images to prevent frequent changes that break things during the contest.

You can still keep a limited number of allowed base Docker images and just let users customize the last layer as they want. That is much better for caching than storing blobs in Git. Providing the ability to submit images will address all the stability and security concerns for you and will let you save some of the servers used for the build farm.

cc: @beevee

last-g commented 4 years ago

I'm also a bit concerned about the burden that will fall on the organizing team once the contest begins and contestants realize what they are missing from the build/submission system.

cc: @beevee

nya3jp commented 4 years ago

I imagine that it might be risky for the contest organizers to change the submission mechanism two weeks before the contest.

Regarding the specific concern about C++ solutions, I wonder if supporting a build system like CMake would work. Also, I've just contributed a Dockerfile for building C++ solutions with Bazel (#12).

beevee commented 4 years ago

@last-g I appreciate your concern, but I don't think that our submission system is too restrictive as it is. Certainly not only-python-or-bash-is-usable restrictive.

Compare:

That said, I would very much appreciate it if someone rewrote the cpp platform using CMake instead of plain g++. I think it would provide all participants with plenty of build configuration options.

But, of course, you can leave it as it is and submit binaries. Sigh.

last-g commented 4 years ago

@nya3jp I have also considered these risks, but it's better to debug and fix things before the contest than to discover problems during the contest.

@beevee thank you for your reply!

I was just highlighting a problem using C++ as an example. Bazel (thanks @nya3jp!) or CMake will work for some teams and won't work for others. Is the set of libraries baked into Bazel enough? Or will teams just rely on Bazel's ability to download artifacts at build time? And if so, how is that different from just having a Docker build that executes an arbitrary bash script?

I can build a generic image with a user-customizable build step and entry point, if you are fine with possible cache size issues.

In any case, I'd suggest using an entry point wrapper like /run.sh in every base image, which contestants can override. That way contestants at least gain full control over how their code is launched and over its runtime settings. There is absolutely no reason to bake these into the base Dockerfile.
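The override idea can be sketched as follows. This is only an illustration of the lookup a base image's wrapper might perform; the function name, the /bin/sh invocation and the example default command are assumptions, not part of any existing base image:

```python
import os


def resolve_entrypoint(app_dir, default_cmd):
    """Return the contestant's run.sh if present, else the platform default."""
    override = os.path.join(app_dir, "run.sh")
    if os.path.isfile(override):
        # Run through the shell so a missing executable bit is not a problem.
        return ["/bin/sh", override]
    return list(default_cmd)
```

With this shape, a platform keeps its current behavior (for example a hypothetical `["python", "main.py"]` default) until a team adds its own run.sh to the repository.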

elventian commented 4 years ago

@last-g Codingame has their week contests, and they are not easy at all, but I think we hope for something more complex and unpredictable than Codingame: something we cannot prepare for. But maybe this submission program will be just a small part of the task, and we will run the main programs and scripts ourselves? Anyway, please add at least a C++ library for HTTP.

beevee commented 4 years ago

@last-g I can promise you two things:

  1. We will consider adding run.sh and build.sh as extension points for all supported platforms.
  2. We will consider extending our starterkits to show an example of HTTP interaction and include all necessary libraries.

If you have a vision for a generic image that would suit your taste, please send a PR, and we will give it an honest consideration. I'm not against your ideas at all.

@elventian I can promise you one thing: this contest is much, much more than just pushing small snippets of code into remote repositories with Dockerfiles. But I can't tell you much more right now :-)

last-g commented 4 years ago

@beevee thank you!

I made a pull request (#22) which introduces a generic platform that can be used as a base for other platforms (conceptually, at least).

Unfortunately, this doesn't resolve the security concerns. I see a few possible approaches:

  1. Organizers are fine with setting up an environment that is safe (for them) and has internet access to run builds. That's possible: essentially all the public CI systems do that, and cloud providers do it too;
  2. Organizers are fine with receiving pre-built images as solutions. This would require setting up a Docker registry and providing teams with access tokens;
  3. An ICFPC 2017-style approach, where the build is split into two stages:
    • safe. Organizers have full control and the build process has internet access. At this stage the build system can request some metadata from contestants, such as a list of packages to be installed and/or a list of URLs to be downloaded;
    • unsafe. Contestants have full control and the build script has no internet access.

@elventian there are too many options for a C++ HTTP library, and you'd want to pick the one that fits your needs and programming style.