Beep6581 / RawTherapee

A powerful cross-platform raw photo processing program
https://rawtherapee.com
GNU General Public License v3.0
2.9k stars 323 forks source link

336 memory leaks. Gtk object not managed #646

Closed Beep6581 closed 9 years ago

Beep6581 commented 9 years ago

Originally reported on Google Code with ID 657

Valgrind run:

50% of all leaks related this case.

This command give list possible places where manage is missing.
grep -r -n "new Gtk" --include=*.cc . | grep -v "manage"

Result is attach file.

If main panels are planned NOT released in production code then maybe some special
code can added which can activated when valgrind is used.

like (command line option would be nice then more people can run valrind and get useful
results):
RTWindow::~RTWindow () {
int valrind = 0;
 if (valrind) {
 if (fpanel!=NULL) { delete fpanel; }
 if (epanel!=NULL)  { delete epanel; }
 if (bpanel!=NULL) { delete bpanel; }
}
}

Reported by GreatBull69 on 2011-04-26 02:44:41


Beep6581 commented 9 years ago
I guess that it's safe to delete them in production code too.

Which milestone should we set for this bug ?

Reported by natureh.510 on 2011-04-29 16:38:39

Beep6581 commented 9 years ago
I vote for removing them as soon as possible.

Solution is including the new for Gtk widgets in Gtk::mange( )?
Or it's better to make any distinctions?
Most important critical ones are IMHO in editorPanel exifpanel iptcpanel ...

Reported by ffsup2@yahoo.it on 2011-04-30 07:34:58

Beep6581 commented 9 years ago
I'm preparing a patch...
Saw a problem with PopUpCommon: its destructor is not called, so there are memory leaks!
First thing: it should be virtual; then there is something lacking in the chain of
classes, maybe someone should be Gtk:widget derived or attached in someway to the parent:
I've to investigate better.

Reported by ffsup2@yahoo.it on 2011-04-30 12:28:45

Beep6581 commented 9 years ago
PopUpButton and PopUpToggleButton are direct descendant classes of PopUpCommon and is
only instanciated by them. Those 2 subclasses are deleted thanks to the Manage mechanism,
so i think that PopUpCommon is deleted altogether!? Do we need to manually delete parent
classes ?

Reported by natureh.510 on 2011-04-30 13:16:58

Beep6581 commented 9 years ago
Here is a (partial) patch for 3.0
Please GreatBull, apply it and rerun your diagnose to see how much is left.

Reported by ffsup2@yahoo.it on 2011-04-30 14:08:01


Beep6581 commented 9 years ago
I run 
grep -r -n "new Gtk" --include=*.cc . | grep -v "manage"

30 case founded are in attached file

Windows user maybe interested this: http://gnuwin32.sourceforge.net/packages/grep.htm
I have not tested it.

#4. I feel not deleted if parent is not deleted. I little look code and endup here
where they looks like be deleted manually.
CurveEditorGroup::~CurveEditorGroup() {
    for (std::vector<CurveEditor*>::iterator i = curveEditors.begin(); i != curveEditors.end();
++i)
    {
        delete *i;
    }
    delete flatSubGroup;
    delete diagonalSubGroup;

Reported by GreatBull69 on 2011-05-01 15:44:48


Beep6581 commented 9 years ago
Thank you GreatBull, I'm a little busy: investigating a strange bug (leak) affecting
all panels classes that are not destroyed.

Anyway, not all pair new<->manage have to be corrected, because some widget are deleted
by explicit delete in code.

Reported by ffsup2@yahoo.it on 2011-05-02 07:00:30

Beep6581 commented 9 years ago
No any hurry.

Panel leak? Attached file maybe show something interesting (all panel related lines
with new or delete).
like ./rtgui/editorpanel.cc:38:    tpc = new ToolPanelCoordinator ();

grep -i -r -n "panel" --include=*.cc . | grep -P  "delete|new" >panel_3_0.txt

Reported by GreatBull69 on 2011-05-02 11:56:42


Beep6581 commented 9 years ago
I've searched all gui code, and fixed several leaks.
Not all gtk widgets must be managed( new ...), some are deleted inside class destructor.
I had to add a special class "Frame2" to have all panels correctly deleted (now the
destructor is called, while it was broken after an old commit ): honestly I don't know
why, the code should work without this, but doesn't.

I ask to a few devs to test and comment before applying.

One point remains in batchqueuepanel (I'm going to remove the button as discussed in
#570)

Reported by ffsup2@yahoo.it on 2011-05-04 22:10:32


Beep6581 commented 9 years ago
Did not get a chance to try this patch yet in branch3.0 

I wonder if this may fix at least a part of a large memory leak for this use case (tested
in DEFLOAT): navigate to a folder with about 300 NEF files. Let all thumbs be loaded.
Switch to Editor (single tab mode) tab. Switch back to file browser. monitor RAM usage:
in general, when switching from file browser to the editor RAM use goes down, but when
switching back to file browser it goes up to a higher value than was originally.

Fabio, do you think I should open a separate issue for this leak?

Reported by michaelezra000 on 2011-05-05 11:37:53

Beep6581 commented 9 years ago
Michael, I'm not sure what you describe is a memory leak.
If it were, memory should increase as time goes until there is a consistent difference
from initial state. Consider that the file browser (Gtk actually) use memory to display
thumbs; when some thumbs are not displayed some memory is released sometime.

Please test last patch on branch 3. When committed, I'll merge to default also.

Reported by ffsup2@yahoo.it on 2011-05-05 19:59:42

Beep6581 commented 9 years ago
Hi Fabio, I will test in branch 3.0 and report.

But what I describe IS a memory leak, as final RAM usage goes UP after every cycle
FileBrowser -> SingleTabEditor -> FileBrowser 

Reported by michaelezra000 on 2011-05-05 20:13:00

Beep6581 commented 9 years ago
Michael I've just tried branch 3. and I don't see memory increase you describe: maybe
it's better you open another issue.

Anyway with latest patch not all leaks are solved because there is a little memory
increase left after opening one image after another (less then 1M each time with multitabmode,
about 100k in single tab).
The worst thing abou these leaks is that they fragment memory and prevent later big
allocation to occurs.
Any suggesion is welcome.

Reported by ffsup2@yahoo.it on 2011-05-05 20:58:44

Beep6581 commented 9 years ago
@michael

I tried with Branch3 and didn't experienced your bug, but the memory was contantly
evolving between 210 and 212 MB at each swap though (i.e. it didn't increased at each
swap).

I didn't had the time to test your patch, but since you're on Windows too, i guess
that it will work seamlessly on my PC ;) Nice work btw !

Btw, i also noticed the memory conumption increase at each "Open Editor / Close Editor
/ Open Editor / ..." cycle.

Reported by natureh.510 on 2011-05-05 22:46:28

Beep6581 commented 9 years ago
Sorry, I have a difficulty testing as on my 32 bit machine (did not yet port branch
3.0 to 64 bit, need to recompile lcms) I have the following error when running branch
3.0: 
lcms: Error #12288: Corrupted memory profile

Reported by michaelezra000 on 2011-05-06 12:05:02

Beep6581 commented 9 years ago
From comment #11, be carefull Fabio, as patch committed to Default may be lost if they
are not applied to Defloat too ; i think it's already the case for your patch about
the progess bar. It will then be more difficult to merge (if it will occure one day...)

Reported by natureh.510 on 2011-05-06 14:28:07

Beep6581 commented 9 years ago
I read memLeaks3.diff and it looks very good. (I wait it applied to defloat)

For these 3 I didn't find release (probably I just didn't understand code).
./rtgui/saveasdlg.cc:58:    formatOpts = new SaveFormatPanel ();
./rtgui/rtwindow.cc:56:    fpanel = new FilePanel ();
./rtgui/filepanel.cc:80:    filterPanel = new FilterPanel ();

Very good job Fabio.

Reported by GreatBull69 on 2011-05-07 09:11:14

Beep6581 commented 9 years ago
I pushed the patch: others leaks are present for sure ( and wait to be spotted ...)

Reported by ffsup2@yahoo.it on 2011-05-09 21:41:45

Beep6581 commented 9 years ago
Thank's for your work Fabio!

I'll try to run Valgrind too ASAP to look for remaining leaks. I don't know how to
use it, so i'll maybe ask help from GreatBull ;)

Reported by natureh.510 on 2011-05-11 10:28:13

Beep6581 commented 9 years ago
You didn't ask but here this is

Normal interactive RT (GUI)
G_SLICE=always-malloc G_DEBUG=gc-friendly,resident-modules valgrind --tool=memcheck
--leak-check=full --leak-resolution=high --num-callers=50   --show-reachable=yes --track-origins=yes
--gen-suppressions=yes --log-file=rt_dump.txt ./rt 

Command line run this with default pp3 but can make special one too, no GUI, good for
rtengine tests
G_SLICE=always-malloc G_DEBUG=gc-friendly,resident-modules valgrind --tool=memcheck
--leak-check=full --leak-resolution=high --num-callers=50   --show-reachable=yes --track-origins=yes
--gen-suppressions=yes --log-file=rt_dump2.txt  ./rt -t -Y -c 'IMG.CR2' 

Few result files:

Command line run (these leaks are not fixed)
http://www.4shared.com/document/GAnsJhmW/rt_dump2.html

Interactive run (many of these are now fixed, just for reference)
http://www.4shared.com/file/iftP06rK/rt_dumptxt.html

Hint: press C when action stop. There is question suppress?

Reported by GreatBull69 on 2011-05-12 13:23:26

Beep6581 commented 9 years ago
@GreatBull:
are those files the result of old run, or are they obtained after latest patch (rev:cec19a533b4e)
?
I ask because I see several points in the report where code should be correct.
Otherwise I'm unable to interpret them ... or don't know how to resolve the leaks.

Reported by ffsup2@yahoo.it on 2011-05-12 18:32:21

Beep6581 commented 9 years ago
dump2: Command line run is begin this week and not have any memory leak correction patch,
but I feel these leaks are NOT corrected. All these one related rtengine code. (correction
are done to GUI code) Defloat branch.

dump: Interactive run is very old (that is where this case leak founded). So you will
be happy not see this any more (I hope most of them is gone) Default branch.

Reported by GreatBull69 on 2011-05-13 13:16:40

Beep6581 commented 9 years ago
I was planed to make new issue of dump2 cases, but didn't have time to done yet. But
as I see other are interest run valgrind and study more maybe I don't want spoil feeling
to find bug. Try look again. Ask again if not find I will tell then.

Maybe I give one example (go end of file)
operator new[]
rtengine::RawImageSource::correction_YIQ_LQ_

you find function and then look all 'new' lines and then check all deleted and you
find there is 5 pointer which are not released.

Reported by GreatBull69 on 2011-05-13 13:25:48

Beep6581 commented 9 years ago
here is another patch that solves some more leaks.
(To apply on top of latest revision 3.0)
But I'm not really satisfied ... there is much more.
I can't run valgrind on Windows XP (:-(

Gratbull, I don't understand the leak you are referring in RawImageSource::correction_YIQ_LQ_
... for me is ok!

Reported by ffsup2@yahoo.it on 2011-05-15 15:35:44


Beep6581 commented 9 years ago
memleak4: That I didnt see before (but most of them is gui part), 

Look first defloat branch because valgrind run in that (most code is same so 3_0 probably
have same problems)

These 5 looks suspicious (so many pointer at I little lost meaning of code)
void RawImageSource::correction_YIQ_LQ_  (Imagefloat* im, int row_from, int row_to)
{

  float** rbconv_Y = new float*[3];
  float** rbconv_I = new float*[3];
  float** rbconv_Q = new float*[3];
  float** rbout_I = new float*[3];
  float** rbout_Q = new float*[3];

2nd
 rtengine::Thumbnail::initGamma() 
  rtengine::cleanup(); is dont done command line case only gui code.

3rd
bool MultiLangMgr::load
looks tricky maybe  transTable and/or fallBack issue.

Is these comments make any sense?

Yes, there is still lot. How about first try fix what is in dump2.txt and try get current
fixes to defloat branch. When gui fixes are in defloat I plan run gui case again. 
And when dump2 case are fixed I run more complex command line run.

Reported by GreatBull69 on 2011-05-16 09:15:02

Beep6581 commented 9 years ago
Greatbull:
  float** rbconv_Y = new float*[3];
  float** rbconv_I = new float*[3];
  float** rbconv_Q = new float*[3];
  float** rbout_I = new float*[3];
  float** rbout_Q = new float*[3];
are deleted at the end by the template freeArray< >...

If you want to help me, please compile 3.0 with lastet patch and rerun valgring, then
upload the result. I can't run it because I'm on Windows and ... no I'll not fix defloat
or default or any other changing branch, 3.0 will not have any new features so it's
best candidate to clean first. Every fix will go to default merge if applies (I do
regular merges).

Reported by ffsup2@yahoo.it on 2011-05-16 12:52:44

Beep6581 commented 9 years ago
In default, such arrays should be converted to array2D class objects.  They will auto-destruct
at the appropriate time (if there are no bugs in array2D.h), eliminating the need for
this sort of housekeeping.

Reported by ejm.60657 on 2011-05-16 13:29:36

Beep6581 commented 9 years ago
Maybe not released (looks like someone break defloat branch):
template<class T> void freeArray (T** a, int H) {
    //for (int i=0; i<H; i++)
    delete [] a[0];
    delete [] a;
}

but other probably find from 3_0 too
and these
rtengine::ICCStore::createFromMatrix
rtengine::DFManager::init

maybe someday I will run, (Currently I try focus other project, but I answer just because
you ask)

Reported by GreatBull69 on 2011-05-16 13:54:54

Beep6581 commented 9 years ago
I make new valgrind run. Part of result are in http://paste2.org/p/1704460

I use latest code + my Lua patch.
Version: 4.0.3.23
Changeset: f49d3e5557a5

Whole file is 100MB (unzipped) 2.6M zipped. Need find place where to put it.

Main area looks like 
rtengine::ICCStore::init
ICMPanel::ICMPanel()
MultiLangMgr::load

RTWindow::remEditorPanel

 RTWindow::RTWindow()
Main loop:
 ImageArea::on_expose_event

Reported by GreatBull69 on 2011-10-12 06:30:49

Beep6581 commented 9 years ago
Full log in http://www2.zippyshare.com/v/52558477/file.html

This time test include Editor opening. In May run have only thumbnail browser. 
Valgrind run was many times faster now then May. I use smaller RAW file and many leaks
is fixed.

Analysis:
Huge improvement from previous run in May: 22422 -> 2371 contexts. I think this reflect
amounts of bugs fixed. 90% leaks is fixed.
Still reachable leak increased but I think that is because now test coverage is bigger.

LEAK SUMMARY:
definitely lost: 28,056 bytes in 73 blocks --> 73,862 bytes in 816 blocks
indirectly lost: 82,064 bytes in 2,070 blocks --> 159,369 bytes in 2,834 blocks
possibly lost: 2,540,159 bytes in 45,772 blocks --> 302,018 bytes in 5,954 blocks
still reachable: 6,861,882 bytes in 97,230 blocks --> 12,064,228 bytes in 102,584 blocks
ERROR SUMMARY: 24869 errors from 22422 contexts (suppressed: 11 from 9) --> 5874 errors
from 2371 contexts (suppressed: 20 from 9)

Reported by GreatBull69 on 2011-10-14 02:24:22

Beep6581 commented 9 years ago
As requested, here is a stacktrace of the problem I'm having in issue 1052:

rawtherapee(89550,0x7fff70903cc0) malloc: *** error for object 0x10733aec0: pointer
being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
(gdb) break malloc_error_break
Note: breakpoints 1 and 2 also set at pc 0x7fff852e2499.
Breakpoint 3 at 0x7fff852e2499
(gdb) continue
Program received signal:  “SIGABRT”.
(gdb) bt
#0  0x00007fff852540b6 in __kill ()
#1  0x00007fff852f49f6 in abort ()
#2  0x00007fff8520c195 in free ()
#3  0x0000000100457bda in rtengine::freeArray<float> (a=0x1061eae00, H=5344) at /Users/ejm/math/imaging/rawconvert/C_code/rawtherapee-ftdn/rtengine/stdimagesource.cc:35
#4  0x000000010042b62c in rtengine::RawImageSource::~RawImageSource (this=0x106d13540)
at /Users/ejm/math/imaging/rawconvert/C_code/rawtherapee-ftdn/rtengine/rawimagesource.cc:109
#5  0x0000000100457a27 in rtengine::ImageSource::decreaseRef (this=0x106d13540) at
imagesource.h:111
#6  0x000000010045d369 in rtengine::ImProcCoordinator::~ImProcCoordinator (this=0x102aed000)
at /Users/ejm/math/imaging/rawconvert/C_code/rawtherapee-ftdn/rtengine/improccoordinator.cc:85
#7  0x00000001003c1c03 in rtengine::StagedImageProcessor::destroy (sip=0x102aed000)
at /Users/ejm/math/imaging/rawconvert/C_code/rawtherapee-ftdn/rtengine/init.cc:68
#8  0x0000000100110b79 in EditorPanel::close (this=0x104d4a7e0) at /Users/ejm/math/imaging/rawconvert/C_code/rawtherapee-ftdn/rtgui/editorpanel.cc:461
#9  0x00000001001113e5 in EditorPanel::~EditorPanel (this=0x104d4a7e0) at /Users/ejm/math/imaging/rawconvert/C_code/rawtherapee-ftdn/rtgui/editorpanel.cc:320
#10 0x0000000100eb44b4 in g_datalist_clear ()
#11 0x0000000100e4fc97 in g_object_unref ()
#12 0x000000010115ac16 in gtk_notebook_forall ()
#13 0x00000001010b8917 in gtk_container_destroy ()
#14 0x0000000100e4ca48 in g_closure_invoke ()
#15 0x0000000100e62eb8 in signal_emit_unlocked_R ()
#16 0x0000000100e6478a in g_signal_emit_valist ()
#17 0x0000000100e64b74 in g_signal_emit ()
#18 0x0000000101163a68 in gtk_object_dispose ()
#19 0x0000000100e501af in g_object_run_dispose ()
#20 0x00000001010867cb in gtk_box_forall ()
#21 0x00000001010b8917 in gtk_container_destroy ()
#22 0x0000000100e4ca48 in g_closure_invoke ()
#23 0x0000000100e62eb8 in signal_emit_unlocked_R ()
#24 0x0000000100e6478a in g_signal_emit_valist ()
#25 0x0000000100e64b74 in g_signal_emit ()
#26 0x0000000101163a68 in gtk_object_dispose ()
#27 0x0000000100e501af in g_object_run_dispose ()
#28 0x00000001010b8917 in gtk_container_destroy ()
#29 0x0000000100e4c974 in g_closure_invoke ()
#30 0x0000000100e62eb8 in signal_emit_unlocked_R ()
#31 0x0000000100e6478a in g_signal_emit_valist ()
#32 0x0000000100e64b74 in g_signal_emit ()
#33 0x0000000101163a68 in gtk_object_dispose ()
#34 0x0000000100e501af in g_object_run_dispose ()
#35 0x0000000101c30b0f in Gtk::Window::~Window ()
#36 0x00000001000e96fa in RTWindow::~RTWindow (this=0x103e0a340) at rtwindow.h:30
#37 0x0000000100262e29 in main (argc=1, argv=0x7fff5fbff6f8) at /Users/ejm/math/imaging/rawconvert/C_code/rawtherapee-ftdn/rtgui/main.cc:163
(gdb) 

Reported by ejm.60657 on 2012-04-01 00:21:24

Beep6581 commented 9 years ago
When I comment out lines 108-115 of rawimagesource.cc (this is in the denoise branch),
the problem goes away, but apparently this causes a memory leak.

Reported by ejm.60657 on 2012-04-01 00:22:57

Beep6581 commented 9 years ago
I think that there would be benefits to convert allocArray to a real class that will
handle it's size and deletion automatically, and allocate aligned memory for speed
improvement. I'll look at this this week.

Reported by natureh.510 on 2012-04-02 00:43:32

Beep6581 commented 9 years ago
Great news. 

grep -P "temp.*allocArray" *
hlrecovery.cc:template<class T> T** allocArray (int W, int H) {
rawimagesource.h:template<class T> T** allocArray (int W, int H, bool initZero=false)
{
stdimagesource.cc:template<class T> T** allocArray (int W, int H) {

Reported by GreatBull69 on 2012-04-02 06:45:24

Beep6581 commented 9 years ago
#3  0x0000000100457bda in rtengine::freeArray<float> (a=0x1061eae00, H=5344) at /Users/ejm/math/imaging/rawconvert/C_code/rawtherapee-ftdn/rtengine/stdimagesource.cc:35
#4  0x000000010042b62c in rtengine::RawImageSource::~RawImageSource (this=0x106d13540)
at /Users/ejm/math/imaging/rawconvert/C_code/rawtherapee-ftdn/rtengine/rawimagesource.cc:109

Yes, ~RawImageSource destructor should probably call freeArray from rawimagesource.h,
not from stdimagesource.cc. The wrong one tries to free much more memory than the correct
one.

I haven't worked with c++'s namespaces too much. How safe it is to have two functions
with same name in one namespace. Is it safe as long as #includes are correct? Why it
calls the wrong destructor now - mac's compiler is broken? Or is this one those undefined
situtations and it has worked just by luck.. 

Reported by karpasten.herra on 2012-04-02 11:51:31

Beep6581 commented 9 years ago
Is there a reason why we can't convert all allocArray calls to array2D ?  The latter
is janrinze's array class that cleans up after itself.

Reported by ejm.60657 on 2012-04-02 12:35:43

Beep6581 commented 9 years ago
In the denoise branch, I converted the image data arrays rawData,red,green,blue in rawimagesource.cc
to array2D class objects, which should clean up after themselves properly. Please check
whether this fixes memory leaks in that branch.

Reported by ejm.60657 on 2012-04-03 06:22:51

Beep6581 commented 9 years ago

Reported by torger@ludd.ltu.se on 2013-11-05 11:40:38