apify / actor-whitepaper

This whitepaper describes a new concept for building serverless microapps called Actors, which are easy to develop, share, integrate, and build upon. Actors are a reincarnation of the UNIX philosophy for programs running in the cloud.

Allow runtime environment definition in `actor.json` instead of using Dockerfile #17

Closed mnmkng closed 2 months ago

mnmkng commented 1 year ago

I've been thinking about this for quite some time. Our Dockerfiles are becoming increasingly complex. The latest complexity comes from multistage builds with TypeScript. There's always something to fix, and managing them is an ever-growing pain.

⚠️ I'm not saying that we should replace Dockerfiles. They need to stay, but...

Most of our actors use one of our predefined base images AND predefined Dockerfile templates. In most cases, users shouldn't need to change those at all. Issues like this or this have been happening repeatedly for years.

Proposal:

Allow specifying the runtime in actor.json instead of in Dockerfile. E.g.

{
    "environment": {
        "engine": "node:16", // or python
        "browsers": ["chrome"]
    }
}
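To make the shape of that config concrete, here is a minimal sketch of how a build tool might read it. The `Environment` class and the `parse_environment` helper are hypothetical names used only for illustration, and the sketch assumes the file is plain JSON (so without the inline `// or python` comment shown above).

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Environment:
    """Hypothetical in-memory form of the proposed "environment" block."""
    runtime: str                       # e.g. "node" or "python"
    version: Optional[str] = None      # e.g. "16"; None when the engine has no tag
    browsers: List[str] = field(default_factory=list)


def parse_environment(actor_json_text: str) -> Environment:
    """Read the "environment" object from the text of an actor.json file."""
    config = json.loads(actor_json_text)
    env = config["environment"]
    # "node:16" splits into ("node", "16"); a bare "python" yields an empty version.
    runtime, _, version = env["engine"].partition(":")
    return Environment(runtime, version or None, list(env.get("browsers", [])))
```

With a parsed `Environment` in hand, the platform could pick a matching base image and Dockerfile template on its own. How that mapping would look is not decided in this proposal.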

Or, although I would not recommend it, this could be useful for quickly running new projects:

{
    "environment": "auto"
}

This would detect the environment based on the presence of package.json or some Python files, and then analyze package.json for dependencies. Basically something like what we had for single file, but instead of parsing code, it would parse package.json.
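A rough sketch of what such detection could look like, under stated assumptions: the `detect_environment` function is hypothetical, and the `BROWSER_PACKAGES` mapping from npm package names to a browser is an illustrative guess, not an existing rule. The sketch checks for package.json first, then falls back to looking for Python files.

```python
import json
from pathlib import Path

# Assumption for illustration: which npm dependencies imply which browser.
BROWSER_PACKAGES = {"puppeteer": "chrome", "playwright": "chrome"}


def detect_environment(project_dir: str) -> dict:
    """Guess an "environment" block from the files present in a project."""
    root = Path(project_dir)
    package_json = root / "package.json"

    if package_json.is_file():
        manifest = json.loads(package_json.read_text(encoding="utf-8"))
        # Look at both runtime and dev dependencies declared in package.json.
        deps = {
            **manifest.get("dependencies", {}),
            **manifest.get("devDependencies", {}),
        }
        browsers = sorted(
            {BROWSER_PACKAGES[name] for name in deps if name in BROWSER_PACKAGES}
        )
        return {"engine": "node", "browsers": browsers}

    # No package.json: treat any top-level .py file as a Python project.
    if any(root.glob("*.py")):
        return {"engine": "python", "browsers": []}

    raise ValueError(f"Could not detect an environment in {project_dir}")
```

Raising an error when nothing matches keeps the guess explicit, so a project falls back to a hand-written Dockerfile rather than a silently wrong image.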

Possible benefits:

What do you think? @jancurn @mtrunkat @fnesveda @dragonraid @B4nan

B4nan commented 1 year ago

A few random thoughts:

So overall, why not. I just wouldn't expect it to solve many of our current problems. Rather, it will simplify things for users (which is always nice), but only for those who start with the Apify CLI right away. For Crawlee users who try to deploy on the platform, it won't change anything.

mnmkng commented 1 year ago

Yeah, in a way I agree with the comments. But I think we could have better insight into what's actually happening in our images if we were only copying source into them as the last step instead of doing the whole pre-install (in base dockers) + re-install (in userland) magic that we're doing now. Basically, we would control a larger part of the build process.

mtrunkat commented 1 year ago

I see the problem, but I am afraid that by hiding this complexity, we'd run into problems very similar to those we had with single file - i.e.

As a plus, I see that this way we could limit the number of image variations, and with it the size of builds, and increase the performance of workers.

I'd first consider more solutions to this - maybe we can somehow improve the images or bring better defaults?

jancurn commented 2 months ago

Closing this due to inactivity.