Effect System: Tags that are disallowed by default, and blocking tag propagation

snej commented 3 years ago

The effect system would be more broadly useful if it were possible to declare that a tag is disallowed by default, such that a proc without a tags pragma is considered to not have that tag. This would allow for tags that delimit special runtime scopes within which certain procs can be called, with strong compiler checks preventing calls elsewhere.

For this to be useable, there would also need to be a language construct to block propagation of a tag, so that a function can call tagged functions without itself propagating the tag; otherwise disallowed tags inevitably propagate all the way up the call chain (and presumably cause an error in main.)

Motivating Examples

1. Transactions

Consider a C database API where records may only be modified between beginTransaction and endTransaction calls. We want to provide a better API in Nim, with an inTransaction(callback) proc that starts a transaction, calls the proc, and then automatically ends it. Now we want to enforce that calls like insert, update, delete are only called inside inTransaction.

We define a Transactional tag, mark it as disallowed-by-default, and use it to tag insert, update, delete. Now any client code that calls these functions is tainted with that tag. The inTransaction proc tags its callback function type as Transactional, so the code invoked there can call those functions.

The second necessary enhancement is a language construct that allows code in a block to use a tag, but which does not propagate the tag upwards. Without this, disallowed-by-default tags are useless because they propagate to the top level, and main won't allow them. This construct is equivalent to the try: statement for exceptions.

type Transactional = object
 {.disallowTag: Transactional.}

type TransactionCallback = proc() {.tags: [Transactional]}

proc inTransaction(db: Database, callback: TransactionCallback) =
  beginTransaction(db)
  allowing(Transactional):
    callback()
  endTransaction(db)

(There should be a try:...except: block there too, but I've left it out for clarity.)

With this API, the client will get an error if they try to modify records outside a transaction. If they write functions that modify the database without using a transaction, they'll have to tag those functions Transactional, which helps document their effect. Those functions can safely be called inside an inTransaction callback, but nowhere else.

2. Unsafe Operations

This can be used to implement a weaker form of Rust's unsafe language feature. Here unsafe would be a disallowed-by-default tag. Low level functions that do unsafe things (for some definition of "unsafe") can be tagged with it. The library's public API will probably omit the tag, if the author has a goal of not exposing the un-safety to clients of the library.

(What makes this weaker than Rust is that it doesn't detect language-level unsafe behaviors like dereferencing ptrs, or calls to unsafe standard library functions. The library author will have to be diligent about tagging their functions that use those.)

With this setup, when code somewhere in the library calls an unsafe function an error will be produced at the call site. The author now has two choices: they can tag that proc as unsafe, or they can check and handle the unsafe conditions (with range checks or whatever) and then wrap that code in an allowing(unsafe) block.

With this done, the compiler will find and flag any cases where someone's forgotten to handle an unsafe behavior. (Of course it can't check that the way the unsafety was handled is correct; most likely only a human can. But the allowing(unsafe) blocks indicate areas to focus on in code reviews and security audits.

Proposal

(I'm trying to be detailed, but I am not attached to the names here, just the functionality.)

A pragma {.disallowTag: T.}. After this pragma is parsed, an untagged proc declaration is interpreted as not allowing tag T, i.e. within it, calls to procs tagged T are errors.
A new block-level statement allowing(T):. T must be a type used as a tag. Within the following block, calls to procs tagged T are allowed, yet the containing proc is not tagged with T.

Notes

Separation

The allowing statement would be useful even without disallowTag. You'd just have to add {.tags: [].} manually, to functions that you don't want leaking the tag.

Callbacks and `proc` Pointers

Since procs used as data types are also subject to tags, a disallowed-by-default tag is also disallowed in a callback or proc-pointer that doesn't explicitly allow it.

This is somewhat annoying for common library routines that take proc pointers, especially functional utilities like map or filter. (This is already a problem that comes up in languages with checked exceptions, like Java.)

Separating "disallowTag" From The Tag Declaration

I've chosen to separate the disallowTag pragma from the declaration of the tag type. This lets the enforcement be optional. If modules that declare and use the tag doesn't disallow it, that gives clients of those modules the option to opt into having the tag checked, by adding the pragma in their own source code. But if the module defining the tag thinks it's important, it can put the pragma right after the tag declaration.

planetis-m commented 3 years ago

In Nim writing your own datastructures using ptr is trivial. It would make it cumbersome for no benefit at all. In Rust unsafe is even abused to write getters with or without checks, .... nothing else to add.

snej commented 3 years ago

In Nim writing your own datastructures using ptr is trivial.

🙄Dude, I've been coding in C++ since 1991. I know all about manual memory management and its pros and cons. This proposal is not about unsafe. It's about enhancements to the effect system that could be used to build something like it, as well as other useful things.

Araq commented 3 years ago

There is considerable overlap with an unwritten RFC by me that would bring the not operator to Nim's effect system. The idea was like this:


type Dangerous = object of RootEffect

proc d() {.tag: Dangerous.}

proc disarm =
  {.cast(tags: not Dangerous).}: d()

Other examples like your DB transaction feel like arguing for "ghost" variables and DrNim.

planetis-m commented 3 years ago

Ok that's good, but why does nim-lang/Nim#16371 claimed (edited) that it was implementing this RFC?

Clonkk commented 3 years ago

I mentionned it was loosely related because it came from the same topic. I also mentionned it was nothing but a fun experiment. I often edit my posts because I have a bad habits of clicking send too soon.

It is a personal experiment that I made into a draft because the subject seemed to interest a lot of people on the forum. I also am, on a personal level, interested in other people's constructive criticism and input on the topic.

It most likely will never become a pull request because I don't have enough time (or knowledge probably) to dedicate to implement cleanly this feature.

Should the decision be taken by the core team to implement an effect related to memory safety, I'm sure they will come up with a better / cleaner solution.

I hope that clears any confusion my edit may have caused.

disruptek commented 3 years ago

Can't this already be done trivially with a macro?

And if so, aren't you just arguing for specialized syntax?

And if so, shouldn't we start with a library and consider bringing it into the compiler only when it has proven to contribute more than mere complexity?

Clonkk commented 3 years ago

I don't see how you can do that as a library if you want to be able to track unsafe effect such as cast, addr, unsafeAddr, importc (which was the original topic that led to this rfc) without modifying source code.

I'm not sure either how you could propagate a macro through an API.

And having to modify source code everywhere (even through proc call) to mark code as unsafe is undesirable (if you have to write unsafe everywhere might just use comment or a prefix on your proc name).

So unless I'm mistaken, it's either tag with additionnal features (cf. this RFC) or a new effect (idea I'm playing with that could very well be the wrong one but I'm having fun with it), which in both case imply compiler modification.

dom96 commented 3 years ago

Big +1 to this. I've been wanting something like this for a while, for me the motivating example has been the ability to mark synchronous IO with a SynchronousIO tag and then disallow it inside async procs.

Indeed, I recall discussing this with Araq and we landed on something similar to what he's written above. Hoping this gets implemented as the possibilities are great (it's also useful so that exception tracking is more flexible).

snej commented 3 years ago

I could help out a bit with implementation, if that would be useful — I have no experience with the Nim source, but I've got some background in compilers and have written some for simpler languages in the past.

Araq commented 3 years ago

Is my not Effect feature sufficient? It seems to be.

snej commented 3 years ago

The missing part seems to be the ability to block propagation of an effect tag; the allowing keyword in my proposal. Without this, you can't do anything like Rust's unsafe.

Araq commented 3 years ago

Hmm, why. This does what you want it to do:


proc picky() {.tags: not Dangerous.} =
  ...

snej commented 3 years ago

After re-reading your first comment: does {.cast(tags: not Dangerous).} mean that the function does not have the Dangerous effect even if it calls something that does have that effect? In that case it's the same as my allowing, and I agree that your proposal should suffice.

Araq commented 3 years ago

After re-reading your first comment: does {.cast(tags: not Dangerous).} mean that the function does not have the Dangerous effect even if it calls something that does have that effect?

Yes, the potential effects are stripped off.

snej commented 3 years ago

Is there an RFC yet for the "not effect" feature?

Araq commented 3 years ago

Unfortunately not. But you have my full support for "not effect", it's a tiny, logical addition. Don't forget to specify that either proc types that use "not E" are not allowed, or how co/contravariance for proc types would work for the "not" operator.

Lancer11211 commented 1 year ago

I've a PR for the first proposal here: https://github.com/nim-lang/Nim/pull/20050

metagn commented 1 year ago

forbids is implemented now, but I'm not sure what the rest of the proposal is; is it cast(tags: ...) or a {.forbids.}: ... block? Are either of those implemented?

Araq commented 1 year ago

forbids is implemented now, closing.

nim-lang / RFCs