arr-ai / arrai

The ultimate data engine.
http://arr.ai
Apache License 2.0

Provide controlled opt-in access to non-hermetic behaviour. #528

Open anzdaddy opened 4 years ago

anzdaddy commented 4 years ago

Purpose

Arr.ai is intended to be a hermetic language. That is, it should be defined purely in terms of its inputs and outputs, having no uncontrolled interactions with the outside world such as reading an arbitrary file, accessing network resources or invoking other programs.

We still want to be able to do those things, but we want the user to have full control over what an arr.ai program is permitted to do.

Suggested approach

Command-line options

One approach is to offer command-line options that enable access to these functions. E.g., the following would decode a JSON file in the current directory.

arrai eval --fs=. '//encoding.json.decode(//os.file("/abc.json")).foo'

Even the current working directory might be subject to an opt-in:

arrai eval          '//os.file("abc.json")'   # Always fails.
arrai eval --cwd    '//os.file("abc.json")'   # OK if abc.json is readable.
arrai eval --fs=foo '//os.file("/abc.json")'  # OK if foo/abc.json is readable.
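
The path rules implied by these three cases can be sketched in Python. This is only an illustration of one possible policy, not arr.ai's implementation: the names resolve, fs_root, allow_cwd and AccessDenied are hypothetical, and the checks that keep "..", symlinks and relative paths from escaping the granted directory are assumptions that the proposal does not spell out.

```python
import os


class AccessDenied(Exception):
    """Raised when a program asks for a path it was not granted."""


def _confine(root, relative):
    """Join relative onto root and refuse results that escape root."""
    root = os.path.realpath(root)
    candidate = os.path.realpath(os.path.join(root, relative))
    if os.path.commonpath([root, candidate]) != root:
        raise AccessDenied(f"{relative!r} escapes {root!r}")
    return candidate


def resolve(path, fs_root=None, allow_cwd=False):
    """Map a path requested by a program to a real path, or raise AccessDenied.

    fs_root models --fs=DIR and allow_cwd models --cwd (both hypothetical).
    """
    if path.startswith("/"):
        # Rooted paths are only visible through an explicit --fs grant,
        # and "/" then means the granted directory, not the real root.
        if fs_root is None:
            raise AccessDenied(f"no --fs grant covers {path!r}")
        return _confine(fs_root, path.lstrip("/"))
    # Relative paths need the --cwd opt-in (an assumption: --fs alone
    # does not enable them in this sketch).
    if not allow_cwd:
        raise AccessDenied(f"no --cwd grant covers {path!r}")
    return _confine(os.getcwd(), path)
```

With no grants, resolve("abc.json") raises AccessDenied; with fs_root="foo", resolve("/abc.json") maps to abc.json inside foo.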

Other options might be:

Config file (not mutually exclusive with command-line options)

The file ~/.arrai/external.arrai, if present, would be evaluated and treated as if it were passed to arrai via --ext=@~/.arrai/external.arrai.

This could also be definable at module level, but that would give application authors unfettered access to the user's environment.
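
The config-file rule above amounts to adding one implicit option when the file exists. A minimal Python sketch, assuming a hypothetical helper named effective_options and assuming the implicit option is placed before the user's own options (the proposal does not say how the two are ordered or merged):

```python
import os


def effective_options(cli_options, config_path="~/.arrai/external.arrai"):
    """Return the option list with an implicit --ext=@<config> when it exists.

    Placing the implicit option first is an assumption of this sketch.
    """
    if os.path.isfile(os.path.expanduser(config_path)):
        return [f"--ext=@{config_path}"] + list(cli_options)
    return list(cli_options)
```

If the file is absent, the command-line options are returned unchanged, so nothing is granted implicitly.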

Library metadata comments

Libraries might want to express the need for certain options to be present. For instance, a library that fetches Wikipedia entries might need --net=GET:www.wikipedia.com. This is difficult to infer from actual access patterns for a couple of reasons:

  1. The URL doesn't indicate what the base path should be.
  2. Access might not be consistent across calls, making it difficult to be certain whether all cases have been identified before putting it to use in production.

This could be expressed as:

let $@external = (net: {|method,base_url| ('GET', 'www.wikipedia.com')})

When arr.ai sees this, it will fail, but offer suggestions for how to enable access through a command-line option or config file.
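
The fail-with-suggestions behaviour could look roughly like the following Python sketch. The function names are hypothetical, and matching on an exact (method, base_url) pair is a simplifying assumption; real matching would need to decide how base URLs cover sub-paths.

```python
def parse_net_grant(value):
    """Parse a --net value such as 'GET:www.wikipedia.com' into a pair."""
    method, sep, base_url = value.partition(":")
    if not sep or not method or not base_url:
        raise ValueError(f"expected METHOD:BASE_URL, got {value!r}")
    return (method.upper(), base_url)


def missing_net_grants(required, granted):
    """Return the declared (method, base_url) pairs that were not granted."""
    return sorted(set(required) - set(granted))


def suggest_options(missing):
    """Render the command-line options that would satisfy the missing pairs."""
    return " ".join(f"--net={method}:{base_url}" for method, base_url in missing)


# Requirements a library might declare through its metadata.
required = {("GET", "www.wikipedia.com")}

# Nothing granted: evaluation would fail and print this suggestion.
missing = missing_net_grants(required, granted=set())
message = suggest_options(missing)
```

Here message is the option string the user would be told to add; once that grant is passed, missing_net_grants returns an empty list and evaluation can proceed.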