mykelyk / cld2

Automatically exported from code.google.com/p/cld2
0 stars 0 forks source link

CLD should check result of "new" in all use cases #27

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
There are many uses of the "new" operator in the CLD source code, such as in 
scoreonescriptspan.cc's "new ScoringHitBuffer":

https://code.google.com/p/cld2/source/browse/trunk/internal/scoreonescriptspan.c
c#1168

There's no check that the "new" operator successfully allocated memory. In 
low-memory conditions this can lead to an access violation and subsequent crash.

The code should fail gracefully under low-memory conditions, though it isn't 
immediately obvious how to "gracefully" fail or how helpful it would be to the 
caller to have such behavior if they are truly out of memory.

Original issue reported on code.google.com by andrewha...@google.com on 7 Jan 2015 at 12:19

GoogleCodeExporter commented 9 years ago
I've thought about this issue at some length and I really don't see much in the 
way of graceful alternatives. Plumbing code everywhere to have a status return 
of something like OUT_OF_MEMORY seems like it would muddy all the APIs; the 
options seem to be either adding a new out-parameter for status (a-la ICU 
collator APIs and so on) or introducing some new "checkStatusOfLastCall" kind 
of thing, both of which sound less than ideal to me.

If we are willing to have less stragihtforward messaging, we could introduce a 
new API in compact_lang_det.h, e.g.:

  enum CLD2State { OK=0, OUT_OF_MEMORY=1 };
  // Each API method defined in this interface (other than this method and
  // lastCallOk(), below) sets the LastCallState before returning. This method
  // retrieves the most recently-set state that was set in this manner.
  CLD2State getLastCallState();

  // Convenience method that returns true iff getLastCallState() == CLD2State.OK.
  // Doesn't modify the state returned by getLastCallState().
  bool lastCallOk();

This doesn't perturb existing APIs, but is pretty ungraceful. On the other 
hand, many (most?) programs don't realistically expect (or care) about running 
out of RAM, and having a side-channel like this to check for the condition 
isn't necessarily bad. It allows programs that care about the issue to check 
for it and retry, if possible. It also doesn't preclude us from doing something 
more graceful in the future.

Original comment by andrewha...@google.com on 6 May 2015 at 12:09