pharo-project / pharo

Pharo is a dynamic reflective pure object-oriented language supporting live programming inspired by Smalltalk.
http://pharo.org
Other
1.21k stars 356 forks source link

Process>>#isTerminated is Wrong #2978

Closed guillep closed 1 year ago

guillep commented 5 years ago

A process that is on the bottom context, but whose context has already started but not terminated says that the process is terminated.

The following code (easily translated to a test) shows the problem. We start a process from a manually made bottom context. That context will stop in a semaphore. While that context is suspended there, the process says it is terminated, and this prevents us from resuming it (because resuming asks if the process is terminated).

However, the process is clearly not terminated! We have some code after the semaphore that should be executed.

masterSemaphore := Semaphore new.

context := [ 
    | semaphore |
    semaphore := Semaphore new.
    masterSemaphore signal.
    semaphore wait.

    "Some more code that should be executed, so this process should not be terminated yet."
    self error: 'Throw exception'

] asContext.

p := Process forContext: context priority: 40.
p resume.

masterSemaphore wait.
self deny: p isTerminated.
StephanEggermont commented 5 years ago

Broken since at least Pharo 2.0? Works in Squeak 5.2

guillep commented 5 years ago

Another way to reproduce it. newProcess will create a process with a bottom context doing [self value. Processor activeProcess]. So in the following

p := [ 1+2 ] newProcess.
p step. 
p isTerminated

The stepping operation will just push self to the stack, we are before the message send of value, but still the process says it is terminated.

aranega commented 5 years ago

@guillep with @Larcheveque we checked a little bit the issue, and we found a strange comparison in the #isTerminated method. There is many comparison checking if the suspended context is dead, and the final one checks if the suspended context's startpc is smaller than the it's pc. This looks weird, because it should be almost always the case, unless the process as not started.

isTerminated
    self isActiveProcess ifTrue: [ ^ false ].

    ^ suspendedContext isNil or: [
        "If the suspendedContext is the bottomContext it is the block in Process>>newProcess.
        If so, and the pc is greater than the startpc, the block has already sent and returned
        from value and there is nothing more to do."
        suspendedContext isBottomContext and: [
            suspendedContext isDead 
            "The pc of the suspendedContext is set to nil in #terminate explicitly.
                Leaving this line in for safety."
            or: [suspendedContext pc > suspendedContext startpc ] ] ]   "HERE"

We tried to remove it, but of course, it breaks many tests because the termination of processes seems not effective and so many processes are still considered alived. We really don't know which correction to apply, because we don't know exactly why this last comparison have been added and is doing the job in the common case.

guillep commented 5 years ago

Last week @Marmat21 and @dupriezt were looking into it and had more insights too. Maybe they can share them here?

stale[bot] commented 5 years ago

To limit bug bankruptcy (see https://www.joelonsoftware.com/2012/07/09/software-inventory/) this issue has been automatically marked as stale because it has not had any activity in 6 months. It will be closed in 1 month if no further activity occurs. If this issue remains important to you, please comment to reactivate the issue. Thank you for your contributions.

Joel on Software
Software Inventory
Imagine, for a moment, that you came upon a bread factory for the first time. At first it just looks like a jumble of incomprehensible machinery with a few people buzzing around. As your eyes adjus…
isCzech commented 3 years ago

This should fix the problem. Indeed the condition was weird because it was all wrong and made almost any bottom context terminated. All Process and Extended tests are passing (also fixed comment):

isTerminated
    self isActiveProcess ifTrue: [ ^ false ].

    ^ suspendedContext isNil or: [
        "If the suspendedContext is the bottomContext it is the block in BlockClosure>>#newProcess.
        If so, and the pc is greater than the startpc, the block has already sent and returned
        from value and there is nothing more to do."
        suspendedContext isBottomContext and: [
            suspendedContext isDead or: [
                "The pc of the suspendedContext is set to nil in #terminate explicitly.
                Leaving this line in for safety."
            suspendedContext closure 
                ifNil: [suspendedContext methodClass == Process 
                    and: [suspendedContext selector == #doTerminationFromYourself]]
                ifNotNil: [suspendedContext pc >= suspendedContext endPC]
            ] ] ]
dionisiydk commented 3 years ago

The fix is part of my PR #8567 (I did not see this issue originally)

MarcusDenker commented 1 year ago

The code examples above are now all working

So I guess we can close this issue?

isCzech commented 1 year ago

Yes, this issue has been fixed a while ago and can be closed. Reminder: There's, however, another PR #10644 pending approval, fixing some remaining termination issues.

Ducasse commented 1 year ago

Thanks. Yes we know about the issue and it is so central that we are always dropping it.

isCzech commented 1 year ago

Thanks, Stephane, for your note; I understand your concern, indeed. I can only suggest the same code was implemented in both Squeak and Cuis more than half a year ago and no issues have surfaced since. Although Pharo is 99% similar in this particular area I of course understand it doesn't guarantee something won't pop up. Let me know if and how I can help.

Ducasse commented 1 year ago

Thanks for your nice words and understanding. Sometimes our execution paths can differ. For example this is really the case on ephemerons because we use a lot of announcements so they stressed the system different and we found and fix many bugs there. So we really appreciate your efforts and feel bad that we did not react better. We are just always running after the next urgent things to be fixed. :(

isCzech commented 1 year ago

I can't even begin to imagine how much effort it is to maintain and advance a system like Pharo and how busy you all must be :) From my limited viewpoint, I'm just not happy I left #terminate in its current state with known shortcomings, all of which are addressed in the PR. Thanks again.

isCzech commented 1 year ago

Last note: my fear is once some code start "using" those shortcomings it'll be more and more difficult to fix them.