JuliaParallel / ClusterManagers.jl

Other
242 stars 74 forks source link

Close and reopen stream after bpeek exits #183

Closed DrChainsaw closed 1 year ago

DrChainsaw commented 2 years ago

Fixes #182

Not sure is this is the right way to go about it. Rather than trying to handle a flood of empty strings we just close the stream and open a new one when we think that bpeek has terminated (i.e when bpeek outputs "Not yet started" ).

Not sure what will/should happen when stream is passed over to Distributed if user does not pass -f. Does it know what to do with empty strings or should we close the stream if the process has exited?

DrChainsaw commented 1 year ago

Bump.

I have tested this both with and without -f on my system and it works.

One potential problem with the code is if bpeek -f for some reason would output an empty string without having terminated. This could cause us to wait forever until it completes. I guess one could try to parse the bpeek_flags to understand if it is expected to terminate or not but I'm worried this will create surprising breakages without actually solving anything.

DrChainsaw commented 1 year ago

Thanks for the spot.

I agree about the maintainability. I'm trying to do my part as I'm using the LSF manager, but it is always good with a second pair of eyes.

Especially in this case I'm a bit uncertain what is the right way to handle a text stream as I find the correct behaviour of BufferStream a bit difficult to reason about. Can you always assume that empty string means the program terminated and if not, what is the correct way to determine whether it did? Maybe BufferStream is just the wrong tool here?

Anways, I will merge this tomorrow or so if there are no more comments. The current behaviour is at least better than just not working at all.

DrChainsaw commented 1 year ago

Thanks for the pull. I somehow managed to forget about this PR or else I would have merged it myself long ago.