null refcount while garbage collecting

GoogleCodeExporter commented 8 years ago

What steps will reproduce the problem?
Unfortunately, I do not know how to reproduce the problem.
I use pyev quite heavily, registering hundreds or maybe thousands of events 
including idle, read, write, child.
From time to time, every few hours under heavy load, I get an assertion failure 
from the Python gcmodule.

What is the expected output? What do you see instead?

What I see is :
Modules/gcmodule.c:331: update_refs: Assertion "gc->gc.gc_refs != 0" failed.
object  : <refcnt 0 at 0x6e86680>
type    : pyev.Child
refcount: 0
address : 0x6e86680
Aborted (core dumped)

What I expect is, obviously, no assertion failure.

What version of the product are you using? On what operating system?

I use :
pyev : 8.1.4-4.04
Python : 2.6.6
uname -a : Linux xp02.mnd.fr 2.6.32-220.7.1.el6.x86_64 #1 SMP Tue Mar 6 
15:45:33 CST 2012 x86_64 x86_64 x86_64 GNU/Linux

Please provide any additional information below.

I am afraid that with so little information to reproduce it, the bug will be 
difficult to track down.

Original issue reported on code.google.com by cesar.do...@gmail.com on 2 Jul 2012 at 7:23

GoogleCodeExporter commented 8 years ago

Well, as you said, this one could be really difficult to track down.

Can you provide a test case (or can I have access to your code)?

Did you try and run the loop in debug mode?

Do you use Scheduler or Embed watchers (there might be a problem here, I just 
realised)?

Do you use some custom PyObject struct (in that case, and if I understand the 
comment in update_refs, do you incref/decref them correctly)?

...?

Original comment by lekma...@gmail.com on 5 Jul 2012 at 8:47

GoogleCodeExporter commented 8 years ago

Well, since I posted the issue, I have put a few prints in the pyev code to try 
to track the problem, but it has not shown up again. My code has evolved 
as well, and maybe the conditions for the bug to show up are no longer met.

My code is quite complex, needs a rather heavy environment to run and hence is 
not practical to post. However, it is pure Python, no C code. I do not use 
Scheduler, I wrote my own one in Python, directly leveraging pyev (which is 
very nice, by the way). I only use the default loop.

The only point that could be tricky, is that I release the watcher during the 
call-back (i.e. its refcount goes from 1 to 0 during the call-back) and I see 
no reason not to do so (after all, once the callback has been fired, the 
watcher becomes useless).

By looking at the code, there is one access to the watcher after the call-back 
at Watcher.c:192 which accesses 'self->callback' in case there is an exception 
in the call-back.
In my case, I run in debug mode, so the 2nd argument is ignored by 
set_error_Loop, and hence, I thought this should not be the problem.

However, for better robustness, I think callback_Watcher (starting 
Watcher.c:156) should incref self before the call-back and decref it after the 
last access, e.g. after line 196.
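
The idea of holding a reference across the call-back can be illustrated in pure Python. This is only a sketch of the principle, not pyev's C code: `FakeWatcher`, `registry` and `dispatch` are hypothetical names, and the strong local variable plays the role that an incref/decref pair around the call-back would play in `callback_Watcher`.

```python
import weakref


class FakeWatcher(object):
    """Hypothetical stand-in for a watcher; this is not pyev's real class."""

    def __init__(self, callback):
        self.callback = callback


registry = {}  # the only strong reference held by user code


def release_in_callback(watcher):
    # The user drops the last reference while the call-back is still running.
    del registry["w"]


def dispatch(watcher_ref):
    watcher = watcher_ref()  # strong local reference (the "incref")
    watcher.callback(watcher)
    # The watcher is still valid here, even though user code released it.
    still_usable = watcher.callback is release_in_callback
    del watcher  # the "decref", after the last access
    return still_usable


registry["w"] = FakeWatcher(release_in_callback)
ref = weakref.ref(registry["w"])
ok = dispatch(ref)
# With CPython's reference counting, the object is freed once the
# dispatcher's own reference is dropped.
alive_after = ref() is not None
```

The point of the sketch is the ordering: the extra reference is taken before the call-back runs and released only after the last access to the watcher.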

Original comment by cesar.do...@gmail.com on 5 Jul 2012 at 8:18

GoogleCodeExporter commented 8 years ago

> The only point that could be tricky, is that I release the watcher during the
> call-back (i.e. its refcount goes from 1 to 0 during the call-back) and I see
> no reason not to do so (after all, once the callback has been fired, the
> watcher becomes useless).
That seems a likely candidate for the assert in update_refs.
Intuitively, I'd say a better strategy would be to reuse your watchers. This 
would definitely be a question for the libev mailing list (i.e. new watchers vs 
reuse).
Or maybe, stop your watchers in your callback and delete the stopped/useless 
watchers in a Prepare watcher callback.
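
A minimal sketch of that second strategy, assuming nothing about pyev's real API: `FakeWatcher`, `live`, `retired`, `on_event` and `on_prepare` are hypothetical names. The call-back only stops the watcher and parks it on a list; the actual release happens later, from a function that would be wired to a Prepare watcher.

```python
class FakeWatcher(object):
    """Hypothetical stand-in for a watcher; this is not pyev's real class."""

    def __init__(self, name):
        self.name = name
        self.active = True

    def stop(self):
        self.active = False


live = {}     # watchers currently in use, keyed by task name
retired = []  # stopped watchers waiting to be released


def on_event(watcher, revents=None):
    # Stop the watcher but keep it referenced until the call-back has returned.
    watcher.stop()
    retired.append(live.pop(watcher.name))


def on_prepare(watcher=None, revents=None):
    # Intended to run from a Prepare watcher, outside any other call-back.
    count = len(retired)
    del retired[:]  # drop the references here, at a safe point
    return count


live["t1"] = FakeWatcher("t1")
live["t2"] = FakeWatcher("t2")
fired = live["t1"]
on_event(fired)
released = on_prepare()
```

With this layout no watcher's refcount reaches zero while its own call-back is on the stack.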

> However, for better robustness, I think callback_Watcher (starting
> Watcher.c:156) should incref self before the call-back and decref it after
> the last access, e.g. after line 196.
You're right, I'll include this modification in the next release. Thanks for 
the suggestion.

Original comment by lekma...@gmail.com on 6 Jul 2012 at 8:47

GoogleCodeExporter commented 8 years ago

If I were convinced that the problem came from here, I would do it or fix the 
pyev code. But the fact is, I am sure of nothing. So I am keeping my code as is 
until the problem resurfaces.

Original comment by cesar.do...@gmail.com on 6 Jul 2012 at 11:45

GoogleCodeExporter commented 8 years ago

Also, I reuse watchers whenever it is simple to code that way. But most of the 
time, I manage independent tasks, and reusing a watcher would require me to 
manage a free list and, to some extent, redo Python's job.
I code in Python precisely because it does this book-keeping for me.
The simplest thing I can do is to maintain a global that points to the last 
watcher that fired, just to ensure it is not garbage-collected during the 
call-back (this would be essentially equivalent to my suggestion). I may do 
that when my code goes into production or when I finally succeed in reproducing 
the issue.

Original comment by cesar.do...@gmail.com on 6 Jul 2012 at 12:03
