louthy / language-ext

C# functional language extensions - a base class library for functional programming
MIT License
6.42k stars 416 forks source link

v5 resource tracking issue? #1325

Closed kirill-gerasimenko-da closed 4 months ago

kirill-gerasimenko-da commented 4 months ago

Hi!

I'm experimenting with the new v5 version and have expression with mixed Eff<A> and Eff<App, A>, something like:

var downloadAndConvert =
    from sourceFile in downloadFile("http://booksite.com/123/fb2", "fb2")
    from targetFile in newTempFile("epub")
    from convertCmd in convertBook(sourceFile, targetFile)
    from _ in runCommand(convertCmd)
    select (sourceFile, targetFile);

In the query above downloadFile and newTempFile have both used Resource.use and are both Eff<A>, so that when the expression finishes - these resources (actually temp files) are automatically deleted from the file system.

Everything worked as I would expect, but once I've introduced custom environment to Eff in convertBook - it stopped sharing resources, thus convertBook would have proper sourceFile and targetFile passed as usual, but the physical files were deleted. Before custom environment used in convertBook resources were not disposed.

I assume that going from Eff<A> to Eff<App, A> does not propagate the captured resources etc.

App looks this way:

public readonly struct App(EnvIO env, Config config) : HasIO<App>
{
    public App WithIO(EnvIO envIO) => new(envIO, Config);
    public EnvIO EnvIO { get; } = env;
    public Config Config { get; } = config;
    public override string ToString() => nameof(App);
}

I this something that should work this way? Or I'm missing some important bit here?

Thanks!

kirill-gerasimenko-da commented 4 months ago

Here's the file part:

public static class FileIO
{
    public static Eff<Unit> writeToFile(Stream stream, string filename) =>
        liftIO(rt => stream.CopyToAsync(File.Create(filename), rt.Token));

    public static Eff<TempFile> newTempFile(Option<string> extension = default) =>
        Resource
            .use<Eff, TempFile>(() =>
            {
                var newFile = TempFile.New();
                extension.IfSome(ext => newFile.ChangeExtension(ext));
                return newFile;
            })
            .As();
}

public class TempFile : IDisposable
{
    string _filename = Path.GetTempFileName();

    public void Dispose() => File.Delete(_filename);

    public override string ToString() => _filename;

    public Unit ChangeExtension(string extension)
    {
        _filename = Path.ChangeExtension(_filename, extension);
        return unit;
    }

    public static implicit operator string(TempFile tf) => tf._filename;

    public static TempFile New() => new();
}
louthy commented 4 months ago

First up, it's worth knowing that out of all of the changes I've made, the resource-tracking is what concerns me the most. It's currently untested and I may change the inner workings of it before release. So, if there are any bugs, it's most likely to be in there. The main reason for my concern is that internally it uses mutable state to track the resources -- this may be a mistake and it might need a more 'layered' approach.

I am currently dog-fooding v5 myself in my new project -- to find 'real world' bugs, but mostly to make sure the new features are as easy to use as possible -- and funnily enough I'm just getting to the resource-management stuff right now, so it's likely I'll start hitting any issues myself pretty soon.

In the example you gave of the border between Eff<RT, A> and Eff<A> there was an issue where the resources weren't flowing through. That should be fixed with 5.0.0-alpha.7 on nuget now (although again, no tests are available to confirm this yet).

If you are doing a deep dive into the resource management then I'd definitely appreciate feedback and updates on whether it works or not!

kirill-gerasimenko-da commented 4 months ago

Thank you for the quick response and fix!

Do I understand correctly that the fix you've made will work both ways? From Eff<Env, A> -> Env<A> as well as the original issue?

Also, I've noticed (and tested with simple disposable object), that you do not have to do run Resource.release manually if you are fine disposing the resource at the end of the computation. Is this a correct assumption? To me, using Resource.release explicitly looks similar to just calling .Dispose() explicitly, instead of using using () {} in regular code.

kirill-gerasimenko-da commented 4 months ago

I've checked the fix - works great, thank you again!

louthy commented 4 months ago

Do I understand correctly that the fix you've made will work both ways? From Eff<Env, A> -> Env<A> as well as the original issue?

That should be the case, yes. An Eff<A> is really a Eff<MinRT, A> so really the only thing that should happen when flowing from Env<RT, A> ←→ Env<MinRT, A> is the mapping between RT and MinRT.

Also, I've noticed (and tested with simple disposable object), that you do not have to do run Resource.release manually if you are fine disposing the resource at the end of the computation. Is this a correct assumption?

That's the correct assumption. The idea is that you can write functions that use resources and then return the acquired resource to the consumer of the function (knowing they're being tracked). Then, if the user of your function forgets to free the resource, the monad will do it automatically. The reason for calling Resource.release(r) is so you can manually free the resource at an earlier time (so you're not holding handles open for longer than necessary, for example).

However, if the resource isn't quite so sensitive to releasing early, you can just wait until the end of the computation. So, for things like web-requests which are, by definition, short computations. Just acquire the resources you need, without worrying about freeing them, knowing that they'll all be released at the end of the web-request.

An additional feature I want to add is: Resource.local(ma) which would create a local resource environment for the computation within. The idea being that any resources created inside the ma computation are tracked and released automatically once the local computation has completed. This is like a 'catch-all using' for a sub-system.

looks similar to just calling .Dispose() explicitly

To a certain extent that's true (although it works for non-disposables too). The reason for this approach is:

  1. To allow tracking of resources returned from functions
  2. To make the code more elegant:

This...

var result = from x in use(foo1)
             from y in use(foo2)
             from r in compute(x, y)
             from _ in release(y)
             from _ in release(x)
             select r;

...Is more elegant than...

    var result = use(foo1, 
                      x => use(foo2, 
                          y => compute(x, y))

Being able do write something like this should make it even more elegant when using many resources:

var result = local(from x in use(foo1)
                   from y in use(foo2)
                   from r in compute(x, y)
                   select r);

But making that work with the transformers is non-trivial.

kirill-gerasimenko-da commented 4 months ago

An additional feature I want to add is: Resource.local(ma) which would create a local resource environment for the computation within. The idea being that any resources created inside the ma computation are tracked and released automatically once the local computation has completed. This is like a 'catch-all using' for a sub-system.

If I get this correctly - this would be very appreciated feature, basically having the semantics of finally for a scope:

public void Method()
{
     using (var resource = new DisposableResource())
     {
           throw new Exception();
     }
}

// is equivalent to:

var method =
     from r in Resource.local<Eff, DisposableResource>(...)
     from _ in Fail(Error.New("Nasty Error"))
     .....
     select unit;

In both cases DisposableResource.Dispose will be called as with finally semantics.