Closed megabytefisher closed 5 years ago
Still crashes (if uncommented).
It doesn't crash if one of pix.recylce()
or returnedPix.recycle()
is commented out: both share the same underlying nativePix object.
What's worse, the test completes successfully when I run it alone.
On the face of it, the cause is that pixaReplacePix()
does not increase pix->refcount
. I am not sure if this is a bug, though. @DanBloomberg, was it intentional?
Not a bug. The previous pix in the pixa is destroyed, and the pixa takes ownership of the new pix, so its refcount is unchanged:
So, it's kind of move operation? To avoid the mistakes like in the Java test, the replacing pix should somehow be marked invalid.
But I must say that I don't see why increasing refcount is less clean.
Leptonica is consistent in having a container own the objects in it. So when we put the pix in the pixa, the pixa owns it.
Now, we could put a copy (or clone) of the pix in, but then you'd have the old one lying around and have to remember to destroy it. Just asking for a memory leak. Not good. So by default, the pix is INSERTED into the pixa. If you look at pixaAddPix(), you'll see that you have several choices. I decided not to include those choices in this function, because it makes the interface (and its use) more complicated, and for no reason at all, because if you put the pix in a pixa, you almost certainly don't want to mess with it using the original handle.
To be more safe, I could have nulled the original handle by the function (by passing in its address and dereferencing inside), to prevent an accidental double free, but again, it seemed like excess complication.
There are no right answers -- but I've tried to be consistent in use of ownership while keeping the interfaces as simple as possible.
So, the fix for the Java test should be to set pix = null
and remove pix.recylce()
?
@DanBloomberg On the second thought, I would try to convince you that by default, pixaReplacePix() should be a CLONE operation. The fact that this test has been crashing for 3 years speaks for that. Nobody actually read the notes, even after their porgram crashes.
But most importantly, consider C++ RAII when a Pix is allocated to replace a pix in Pixa. When the program goes out of the block where the original Pix was created, it will naturally destroy it, causing the same problem as we see in this small Java test.
I don't see a scenario when replace would be called with a Pix that cannot be destroyed, or where calling such destroy would be counter-intuitive.
Leptonica has patterns that are implemented consistently, and one of them is the ownership of the object by the container. Also, unnecessary copies of images should be avoided, as well as clones because they require pixDestroy() on both.
To change the behavior of this function would unfortunately cause every existing invocation that is not crashing to have a memory leak!
However, your point is interesting, and not something I've run up against because people usually do not use std::unique_ptr when allocating pix. If you do use such a pointer, with the current implementation you would need to call release() on your ptr to get a new pix ptr that is not "smart", and use that in the function call.
So it's a judgment call. The best I can do now is to annotate the function with a caution not to use the content of a smart pointer, which by definition maintains ownership.
Dan
Note that the Java test that triggered this discussion does not use unique_ptr. The test goes like this:
Pixa pixa = new Pixa(…)
Pix pix = new Pix(…);
Box box = new Box(…);
pixa.replacePix(0, pix, box);
pix.recycle(); // this line is wrong
box.recycle(); // this line is wrong, too
pixa.recycle();
For some strange reason, box.recycle()
doesn't cause a crash in test. But it is equally wrong as pix.recycle()
. Essentially, after being used in replacePix()
, both pix and box are but zombies.
I will fix the Java test, and add your explanation to JavaDoc.
@alexcohn For some reason it works correctly without crash in my fork. I tried to use same Leptonica version in tess-two, but it still crashes there. Maybe the difference is caused by different compiler or some other build configuration?
@Robyer this is not a surprise, this crash caused by use-after-free bug did not happen for me in debug build, or when I ran this test single. It does not depend on version of Leptonica, because it is related to 'as designed' behavior of the library, see https://github.com/rmtheis/tess-two/issues/159#issuecomment-511828906.
@alexcohn You are right, it doesn't crash when running only the single test, but it crashes when I run all tests. Difference with my fork and current tess-two is that in my case it crashes in some other random test, but obviously this is the cause. So I'll use your fix too, thanks.
Unfortunately, now the test produces a warning Box was not terminated using recycle()
. It happens because the Java Box.mRecycled flag gets out of sync with the native refcount.
Same would have been reported for the pix, only there we don't have a finalize()
override to let us worry.
I must say that I am in doubt about the cleanest way to deal with this issue. The easiest would be to remove Box.finalize()
. This introduces another way to leak (native) memory, but it's not too bad, if you consider that there are no similar checks for Pix.
The damage that such finalize()
can cause, is worse. When Java decides that an object must be finally destroyed, it will operate on the underlying native object, which could be already deallocated (that's what would have happened in this testPixaReplacePix case before my fix).
The big problem is that the JVM will trigger such finalize()
at its own discretion, most likely when it goes low on memory. This means that an unexpected crash can easily pass the usual tests, and happen in production.
I hope that @DanBloomberg will help us choose the best route.
This has caused so much trouble for your java tests! My suggestion is not to use the offending leptonica function in your tests.
(If you really want a replace pix in pixa function, implement it in java.)
-- Dan
On Mon, Jul 22, 2019 at 5:21 AM Alex Cohn notifications@github.com wrote:
Unfortunately, now the test produces a warning Box was not terminated using recycle(). It happens because the Java Box.mRecycled flag gets out of sync with the native refcount.
Same would have been reported for the pix, only there we don't have a finalize() override to let us worry.
I must say that I am in doubt about the cleanest way to deal with this issue. The easiest would be to remove Box.finalize(). This introduces another way to leak (native) memory, but it's not too bad, if you consider that there are no similar checks for Pix.
The damage that such finalize() can cause, is worse. When Java decides that an object must be finally destroyed, it will operate on the underlying native object, which could be already deallocated (that's what would have happened in this testPixaReplacePix case before my fix).
The big problem is that the JVM will trigger such finalize() at its own discretion, most likely when it goes low on memory. This means that an unexpected crash can easily pass the usual tests, and happen in production.
I hope that @DanBloomberg https://github.com/DanBloomberg will help us choose the best route.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/rmtheis/tess-two/issues/159?email_source=notifications&email_token=AD7KMLHBQDUWNCVCIOFOYJLQAWJ5PA5CNFSM4CIOD372YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD2PTPQQ#issuecomment-513750978, or mute the thread https://github.com/notifications/unsubscribe-auth/AD7KMLGY2VPIJRKQLSRHC4LQAWJ5PANCNFSM4CIOD37Q .
@DanBloomberg my question now is not how to deal with this specific pixaReplacePix()
behavior. The finalizers could backfire in other scenarios where the (leptonica) refcount changes while the Java object is unaware.
If Pixa.replacePix()
is the single relevant case, it can be deprecated, and done with it. But if there are other situations, as hinted by removal of Pix.finalize()
, they require more serious approach.
These details are well above my pay grade :-) I believe you now understand the issues better than I do.
@rmtheis What do you think about this? Do you remember why @renard314 removed Px.finalize() ?
I'm creating this issue to discuss why testPixaReplacePix crashes in the JNI code, as mentioned here: https://github.com/rmtheis/tess-two/pull/157#issuecomment-227030406
The code referenced is here: https://github.com/rmtheis/tess-two/blob/a0e3b5e07a9b446f581f9b2f9e83f745ae753aeb/tess-two-test/src/com/googlecode/leptonica/android/test/PixaTest.java#L150
I am working on finding out exactly how to fix this, but the preliminary testing show that commenting out the following two lines fixes the crash:
returnedPix.recycle(); returnedBox.recycle();
I believe these Pix and Box get recycled and destroyed before they should be, and then when the Pixa tries to recycle them internally, it crashes. I am not 100% sure on this, and plan to test more.