Gabriella439 / pipes

Compositional pipelines
BSD 3-Clause "New" or "Revised" License

unwanted sharing and a new `oneShot` RULE for `await`? #185

Closed michaelt closed 7 years ago

michaelt commented 7 years ago

I was studying the difficulties with sharing, CSE and 'full-laziness' that @edsko set out in a blog post, and that also came up in Reddit discussions and on the GHC tracker. It occurred to me to try a rewrite rule:

 ; "await >>= f" forall f . await >>= f = Request () (oneShot f)

You can see it in situ in this branch.

This use of oneShot should in principle make every user-defined Pipe or Consumer unshared after the first use of await. This gist gives a separate implementation of the patch. You can toggle between using the patched oneShot await' and regular await and see the accumulation either in a Pipe or Consumer version.
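To make the idea concrete without depending on the library, here is a minimal sketch against a toy type. Everything in it is an assumption for illustration: this `Pipe` is a cut-down stand-in for the real `Proxy` type (no base monad), `awaitThen` is only a guess at the spirit of the gist's `await'`, and `doubler`, `doubler'` and `feed` are made-up names. It shows the rule restated for the toy type and checks that the `oneShot` form behaves the same as plain `await >>= f`; it does not demonstrate the memory behaviour.

```haskell
import GHC.Exts (oneShot)

-- Hypothetical, cut-down stand-in for the pipes Proxy type: no base monad,
-- and only the constructors a Pipe needs.
data Pipe a b r
  = Request () (a -> Pipe a b r)  -- ask upstream for an 'a'
  | Respond b (() -> Pipe a b r)  -- hand a 'b' downstream
  | Pure r                        -- finished with result 'r'

instance Functor (Pipe a b) where
  fmap f p = p >>= (Pure . f)

instance Applicative (Pipe a b) where
  pure = Pure
  pf <*> px = pf >>= \f -> fmap f px

instance Monad (Pipe a b) where
  p0 >>= f = go p0
    where
      go (Request u k) = Request u (go . k)
      go (Respond b k) = Respond b (go . k)
      go (Pure r)      = f r

await :: Pipe a b a
await = Request () Pure
{-# INLINABLE [1] await #-}  -- delay inlining so a rule on 'await' can match

yield :: b -> Pipe a b ()
yield b = Respond b Pure

-- The rule from this issue, restated against the toy type.
{-# RULES
"await >>= f" forall f . await >>= f = Request () (oneShot f)
  #-}

-- What the rule's right-hand side builds, written out by hand.
awaitThen :: (a -> Pipe a b r) -> Pipe a b r
awaitThen f = Request () (oneShot f)

-- Two spellings of the same top-level, argument-free pipe.
doubler, doubler' :: Pipe Int Int r
doubler  = await >>= \x -> yield (2 * x) >> doubler
doubler' = awaitThen (\x -> yield (2 * x) >> doubler')

-- Pure driver: feed a list in and collect output until the input runs dry.
feed :: [a] -> Pipe a b r -> [b]
feed xs       (Respond b k) = b : feed xs (k ())
feed (x : xs) (Request _ k) = feed xs (k x)
feed []       (Request _ _) = []
feed _        (Pure _)      = []
```

GHC may warn that a rule whose head is a class method such as `>>=` might not fire; whether it fires in this toy module is not checked here. The hand-written `awaitThen` sidesteps that question, which is presumably why the gist offers a separate `await'` to toggle.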

Pathological sharing of Pipes and Consumers frequently fails to arise even where, with full-laziness, it could in principle, because of the way pipes writes Pipes and because of its battery of rewrite rules. But it is pretty easy to construct cases. I was thinking this patch was maybe too clever, and I had tested it only on stylized examples like those in the gist, and figured one should await word from the issue on the GHC tracker, so I delayed raising an issue.

But then this trouble with pipes-text came up, also on Reddit. Here the user was cleverly redeploying the same Pipe, called processor, on many different threads working with separate files. I suggested trying this patched version and they reported success in calming the problem down, so this emboldens me to recommend the patch. The solution seems surprisingly strong, though I haven't tried a million examples.

By the way, the same solution should work for conduit sinks and intermediate conduits.

The difficulties would remain for the Producer case, which doesn't admit a solution like this as far as I can see. The compiler's behavior in pathological producer cases is, however, much more intelligible in my view: it is like making several references to numbers = [1..] in the same program. In any case, a program does not tend to read the same file again and again, but raw reusable no-argument pipes, as in p >-> myPipe >-> c, are pretty common in practice.
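The list analogy can be sketched in a few lines. The names `firstUse` and `secondUse` are made up for illustration, and the comments describe the usual behaviour of a top-level argument-free value (a CAF) rather than anything measured here.

```haskell
-- A top-level value with no arguments (a CAF): once part of it is forced,
-- the evaluated prefix stays alive while anything can still reach 'numbers'.
numbers :: [Integer]
numbers = [1 ..]

-- Two independent-looking uses that in fact walk the same shared list.
firstUse :: Integer
firstUse = sum (take 1000 numbers)

secondUse :: Integer
secondUse = numbers !! 999
```

Because both definitions refer to the same `numbers`, the first thousand cells cannot be collected between the two uses. An argument-free Producer defined at the top level and used twice is in the same position.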

Anyway I can produce a proper patch if the branch I mentioned looks good.

michaelt commented 7 years ago

I may be getting a counter-indication and will close this for the moment.

edsko commented 7 years ago

@michaelt I'm curious, what was the counter-indication? Just difficulties in getting the rule to fire reliably, or a more fundamental problem with the idea itself?

michaelt commented 7 years ago

No, there isn't any problem I have found. I just haven't tested it enough. What got me excited enough to raise the issue was the independently devised pipes-text program linked above from Reddit. At first they said it was improved, but the program had other problems and later they said it didn't make a difference, if I understood. That is, I just lost confidence that it contained a sharing problem like the ones you discussed. (It seemed like an interesting case for pathology since they were using the same named, user-defined top-level pipe on several concurrent threads. I take it that could be really ugly if there were the kind of sharing you describe; maybe I should try to devise a case.)

As I said, the device does seem to survive a lot of compositions. I think it is also clear that it can do no harm, e.g. oneShot isn't itself expensive, if I understand correctly. (?) So I think it is certainly a good patch, but I would like to comprehend the problem better and have a few gruesome natural programs that it cures.

edsko commented 7 years ago

Yup, I agree. I'm planning on further research as well :)