Run finalizers at the end of a cadence.

bigeasy commented 8 years ago

Currently, finalizers run when a function exits. This means that it is best not to put a finalizer in a loop. Every time the loop runs, a new finalizer will be added to a list and run at exit. If the loop runs a long time, then you'll be accumulating work, possibly starving your program of a resource so precious you've ensured its release with a finalizer.
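A rough sketch of that accumulation, as a synchronous toy model rather than Cadence's actual implementation. `runWithExitFinalizers`, `finalize`, and the handle counters are hypothetical names invented for illustration, and the reverse run order is an arbitrary choice of the sketch.

```javascript
// Toy model, NOT Cadence itself: every name here is hypothetical.
// Finalizers registered inside a loop pile up until the function exits.
function runWithExitFinalizers (body) {
    const finalizers = []
    const stats = { peakPending: 0 }
    function finalize (f) {
        finalizers.push(f)
        stats.peakPending = Math.max(stats.peakPending, finalizers.length)
    }
    try {
        body(finalize)
    } finally {
        // Nothing runs until the whole body is done.
        while (finalizers.length !== 0) {
            finalizers.pop()()
        }
    }
    return stats
}

let openHandles = 0
let peakOpenHandles = 0

const exitStats = runWithExitFinalizers(function (finalize) {
    for (let i = 0; i < 1000; i++) {
        openHandles++ // stand-in for opening a file handle
        peakOpenHandles = Math.max(peakOpenHandles, openHandles)
        finalize(function () { openHandles-- }) // stand-in for closing it
    }
})
```

In this model all 1000 stand-in handles are held open at once, because none of the finalizers run until the function body has finished.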

It used to be the case that a finalizer would run after a cadence had finished. A cadence is a series of asynchronous steps defined by passing them as arguments to async.

When finalizers run at the end of a cadence, it means that you can be constantly collecting garbage, not accumulating it. If you were to read through all the files in a directory, opening a file handle and closing the file handle in a finalizer, you would be closing the previous file handle before opening the next one. You'd use one file handle at a time. You would not be at risk of starving your program of file handles.
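The same toy model with finalizers drained at the end of each iteration. Again, this is only a synchronous sketch under assumed names (`loopWithIterationFinalizers`, `finalize`), not how Cadence is written.

```javascript
// Toy model, NOT Cadence itself: every name here is hypothetical.
// Each iteration gets its own finalizer list, drained before the next one starts.
function loopWithIterationFinalizers (count, iteration) {
    for (let i = 0; i < count; i++) {
        const pending = []
        try {
            iteration(i, function (f) { pending.push(f) })
        } finally {
            // Run this iteration's finalizers before moving on.
            while (pending.length !== 0) {
                pending.pop()()
            }
        }
    }
}

let handles = 0
let peakHandles = 0

loopWithIterationFinalizers(1000, function (index, finalize) {
    handles++ // stand-in for opening a file handle
    peakHandles = Math.max(peakHandles, handles)
    finalize(function () { handles-- }) // stand-in for closing it
})
```

Here the stand-in handle count never rises above one, which is the "one file handle at a time" behavior described above.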

I removed this during the 0.2 refactor that brought about 0.4. It was extremely complex in 0.1. It was, however, useful, especially for cases like the file handle case above. It's also useful if you want to do something after a resource is released, like signal to waiting callbacks that they can proceed. You can put the allocation and finalizer in a sub-cadence and then, in the step after the cadence with the sub-cadence, you can release a Sequester lock, for example.
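A sketch of that ordering, once more as a synchronous toy model. `subCadence`, `outer`, and the event names are hypothetical, and no real Cadence or Sequester calls are used; the point is only that the release happens before the step that follows the sub-cadence.

```javascript
// Toy model, NOT Cadence or Sequester: every name here is hypothetical.
const events = []

// Runs the steps in order, then runs any finalizers they registered.
function subCadence (steps) {
    const pending = []
    const finalize = function (f) { pending.push(f) }
    try {
        steps.forEach(function (step) { step(finalize) })
    } finally {
        while (pending.length !== 0) {
            pending.pop()()
        }
    }
}

function outer () {
    subCadence([function (finalize) {
        events.push('allocate')
        finalize(function () { events.push('release') })
    }, function () {
        events.push('use')
    }])
    // The step after the sub-cadence: the resource is already released here,
    // so this is where a lock could be released or waiters signaled.
    events.push('notify waiters')
}

outer()
```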

That is, in fact, where I found it useful in Locket.

Because I'm nailing down the interface for a 1.0.0 release, I went to look at what it would cost to add this behavior. What would be the cost in complexity? That is, how many more steps would it take to add a test for finalizers in each loop of the cadence? How much would it cost in girth? That is, how much larger would Cadence be minzipped?

The additional complexity is low. It will only add a single additional condition to loops and then only at the end of an iteration. This is the check to see if there are finalizers to run at the end of the iteration. It is more or less the same check that needed to be performed at the end of every cadence to determine if there were finalizers to run and if it was time to run them, which it would be only if the cadence was the function body cadence. Thus, two checks at the end of every cadence become one check at the end of every iteration of a cadence.

As far as girth goes, with finalizers on each iteration, Cadence has shrunk, from 1.71kb to 1.60kb. I reworked the code to use the Cadence language to create steps to run finalizers and exception handlers instead of constructing Cadence objects. This means reusing the Cadence language to create a short cadence instead of a verbose object construction.

Profiling shows a slowdown that is not large, but is outside the range of error. Adding an additional condition slows things down. However, I found cruft left over from when I had EventEmitter processing built into Cadence and I walked that back, so it might be best to think of it as filling the space left by an extra that was removed.

bigeasy commented 8 years ago

@mnkhouri Might be of interest. This probably wouldn't change any of your code.

mnkhouri commented 8 years ago

Were I learning Cadence from scratch, I think I'd expect finalizers to run after a cadence finishes. In the current system, creating a cadence function with a finalizer in it in order to get the finalizer to run after each loop invocation feels like a work-around. It's not a difficult work-around to understand, but I think it simplifies the usage of Cadence if finalizers are run at the end of a cadence, as suggested here.

bigeasy commented 8 years ago

Closed by 98cc2357fd57485eba843344a046674fd551b134.

bigeasy / cadence

Run finalizers at the end of a cadence. #338