nus-cs2103-AY1819S1 / forum

A repo to hold the issue tracker for module discussions

"Defensive programming should be avoided in non-mission-critical systems." Discuss. #127

Open btzy opened 5 years ago

btzy commented 5 years ago

I'm arguing that defensive programming (in the traditional, non-offensive-programming sense) should be avoided in non-mission-critical situations. This includes our current CS2103/T project.


Defensive programming (as defined by Wikipedia) is a design where code is written to protect against mistakes by other programmers (perhaps including yourself). Code can either (1) take measures so that the program can continue functioning somewhat even if there is a bug, or (2) produce a hard error (i.e. crash) when a bug is detected.

The former is used in mission-critical systems (e.g. life-support systems) where a hard error would be disastrous, and so it is better to attempt to recover from any detected bug as far as possible. Doing extra work to guard against possible bugs of another component (e.g. the example of MainApp#getConfig() from the CS2103 website) falls under this category, because we are purposely writing the code so that a bug in another component will not affect the current component (e.g. config), which means we are taking 'defensive' measures so that it is possible to recover in case of a bug in another component. The focus in this type of defensive programming is the graceful recovery in the face of bugs (we are either (a) trying to do recovery work after the bug has happened, or (b) taking preventive steps (e.g. the MainApp#getConfig() example) to insulate our code from the possible bug).
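To make the `MainApp#getConfig()` style of defensiveness concrete, here is a minimal sketch of the defensive-copy idea, using hypothetical `MainApp`/`Config` classes (the names and fields are assumptions, not the actual CS2103 code): the callee hands out a copy, so a buggy caller that mutates the result cannot corrupt the callee's own state.

```java
// Hypothetical sketch: MainApp insulates its internal Config from buggy callers
// by returning a defensive copy instead of the internal object.
public class DefensiveCopyDemo {
    public static void main(String[] args) {
        MainApp app = new MainApp();
        Config leaked = app.getConfig();
        leaked.setFilePath("corrupted.json"); // buggy caller mutates its copy
        // The internal config is unaffected:
        System.out.println(app.getConfig().getFilePath());
    }
}

class MainApp {
    private final Config config = new Config("config.json");

    Config getConfig() {
        return new Config(config); // defensive copy, not the internal object
    }
}

class Config {
    private String filePath;
    Config(String filePath) { this.filePath = filePath; }
    Config(Config other) { this.filePath = other.filePath; } // copy constructor
    String getFilePath() { return filePath; }
    void setFilePath(String filePath) { this.filePath = filePath; }
}
```

Note the cost being debated here: the copy is extra work done on every call purely to guard against a hypothetical bug in another component.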

The latter means we write code that produces a hard error whenever we detect anything that even smells like a bug. Whether we do it by assertions (assert ...) or contracts (Eiffel or C++20 style language features) doesn't matter much, and the focus is that we crash the program immediately (perhaps with some error dump) so that the developers become aware of the bug and can work on a fix for it. According to Wikipedia, this is called offensive programming, and it is a branch of defensive programming because we explicitly write code to protect against bugs from other components (by crashing the program). (Note: both Eiffel and C++20 contract checking can be enabled or disabled just like assertions. When a contract is violated, it is only unspecified behaviour if contract checking is disabled.)
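As a minimal sketch of this fail-fast style (the method and messages are my own illustration, not from the module notes): the precondition and postcondition are checked with `assert`, so any violation crashes immediately when assertions are enabled (`java -ea`) and costs nothing when they are disabled.

```java
// Offensive-programming sketch: crash immediately on the slightest sign of a bug.
public class AssertDemo {
    static int indexOfMax(int[] xs) {
        // Precondition: a null or empty array here is a bug in the caller.
        assert xs != null && xs.length > 0 : "bug in caller: null/empty input";
        int best = 0;
        for (int i = 1; i < xs.length; i++) {
            if (xs[i] > xs[best]) best = i;
        }
        // Postcondition: a violation here would be a bug in this method itself.
        assert xs[best] >= xs[0] : "bug in indexOfMax";
        return best;
    }

    public static void main(String[] args) {
        System.out.println(indexOfMax(new int[]{3, 9, 4}));
    }
}
```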

It does seem that the CS2103 website and the lecture notes take defensive programming to mean solely the former - i.e. to put in place measures so that the program can attempt to recover (i.e. avoid a crash) from a bug. (Is this right?) This is because many of the explanations made for defensive programming do not hold true for offensive programming.

In the exercise, it says that defensive programming:

  1. can make the program slower.
  2. can make the code longer.
  3. can make the code more complex.
  4. can make the code less susceptible to misuse.
  5. can require extra effort.

All of the five points above are true of the former kind of defensive programming, but many of them are not true for the latter kind (i.e. offensive programming):

  1. can make the program slower - asserts can be disabled so this is not true
  2. can make the code longer - possibly
  3. can make the code more complex - not really, because assertions are quite lightweight as compared to writing logic to try to protect and recover from bugs
  4. can make the code less susceptible to misuse - definitely not, because assertions do precisely the opposite: they immediately stop execution when a bug is detected
  5. can require extra effort - yes

In addition, the CS2103 website places design by contract outside the scope of defensive programming, even though design by contract is a kind of formalized assertions.

Hence, I infer that CS2103 takes defensive programming to mean just the former kind. (Even if this was not the intended meaning, that's okay. I'm arguing that the former kind of defensive programming should be avoided for non-mission-critical systems.) I'll define defensive programming to mean just the former kind for the rest of this discussion.

Offensive programming (as described by the Wikipedia article) seems to be the premise of using assertions - we explicitly check for bugs in the code (especially at interfaces between different components), and as soon as we detect the slightest sign of a bug, we want the program to crash immediately so that the developers are made aware of the bug. Using assertions would then simply be the manifestation of offensive programming in the source code. (Note: Assertions should not be used on errors that are expected (e.g. file not found, invalid user input). Assertions should only be used to check for bugs in the code.)
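The distinction in the note above can be sketched as follows (hypothetical methods of my own, assuming Java 11+ for `Files.readString`): expected failures such as a missing file get exceptions, while states that are impossible unless the code itself is buggy get assertions.

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class ErrorsVsBugs {
    // Expected error: the file may legitimately be missing at runtime,
    // so we signal it with an exception, never an assertion.
    static String readConfig(Path p) throws java.io.IOException {
        if (!Files.exists(p)) {
            throw new java.io.FileNotFoundException(p.toString());
        }
        return Files.readString(p);
    }

    // Bug check: the caller is required to pass an already-validated percent;
    // a violation is a programming error, so we assert instead of throwing.
    static int scale(int value, int percent) {
        assert 0 <= percent && percent <= 100 : "caller passed unvalidated percent";
        return value * percent / 100;
    }

    public static void main(String[] args) {
        System.out.println(scale(200, 50));
    }
}
```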

Hence we can say defensive programming and assertions/offensive programming are two ends of a spectrum of "how the program should respond to bugs". Defensive programming hides bugs (so that the program seems to work even though it is buggy), while assertions immediately report bugs as they are detected (by crashing). Defensive programming makes testing the program more difficult, while assertions make testing it easier. Because it is harder to catch bugs when code is written defensively, bugs are more likely to go undiscovered. Hence assertions should be preferred over defensive programming.

The CS2103 website says that the suitable degree of defensiveness depends on many factors such as:

  1. How critical is the system?
  2. Will the code be used by programmers other than the author?
  3. The level of programming language support for defensive programming
  4. The overhead of being defensive

None of the reasons above seem compelling to me for using defensive programming for non-mission-critical systems:

  1. Because our program is not mission-critical code, the argument that crashing the program will lead to a disaster is probably moot for this project.
  2. The fact that the code is used by other programmers supports the case for offensive programming as well - other programmers may misunderstand the expectations of my code. In fact, we might even favour offensive programming because it will make other programmers aware immediately if they are using my code incorrectly.
  3. Without any examples of the lack of programming language support for defensive programming, I don't think this is true. None of the defensive programming examples on the CS2103 website relied on particular language features of Java, so they would be implementable in other general-purpose programming languages. Actually, I might even argue that it is the lack of better language features that encourages programmers to fall back on defensive programming (and it is because we're stuck with Java that we might need it). The semantics of the Config example can easily be enforced at compile time by a language that can propagate final/const semantics to the fields of an object (e.g. C, C++). The null checks and referential integrity checks can be enforced with contracts in a language that supports them (Eiffel, C++20). In other words, defensive programming is an anti-pattern that we should try to avoid whenever possible.
  4. This isn't an argument supporting defensive programming.
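Point 3 can be illustrated in Java itself (my own example, not from the module notes): `final` freezes the reference, not the object it points to, so immutability has to be arranged by hand, e.g. with defensive copies or unmodifiable views, rather than enforced by the compiler.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Java's `final` is shallow: the reference cannot be reassigned, but the
// referenced list is still mutable, so a read-only view must be built manually.
public class FinalIsShallow {
    private final List<String> tags = new ArrayList<>();

    List<String> getTags() {
        return Collections.unmodifiableList(tags); // hand-rolled immutability
    }

    public static void main(String[] args) {
        FinalIsShallow f = new FinalIsShallow();
        try {
            f.getTags().add("oops"); // compiles fine, fails only at runtime
        } catch (UnsupportedOperationException e) {
            System.out.println("mutation rejected");
        }
    }
}
```

Note that the violation is caught only at runtime; a language that could propagate the const-ness to the fields would reject the `add` call at compile time.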

In conclusion, there does not seem to be any argument for defensive programming in non-mission-critical projects. On the contrary, assertions / offensive programming / contracts (which focus on creating a hard error when a bug is detected) are beneficial.


So, supporters of defensive programming (in the traditional, non-offensive-programming sense), where did I go wrong?

damithc commented 5 years ago

Kudos for the lengthy and in-depth discussion points @btzy. I don't want to preempt the discussion; just a quick input only: Your explanation seems to be based on the view that defensive programming (as defined in CS2103) is about going to extra trouble to recover from errors so that the system can continue to operate (i.e., fault tolerance). That's not correct. Defensive programming is about going to extra trouble to prevent errors; in the lecture, I used the phrase 'proactively find holes, and plug them', e.g., if an object is not supposed to be modified, give out an immutable version of it; if there is a bidirectional link between objects, write code to set both links together rather than trust others to set both links correctly, and so on. You are right that some defensive techniques can hide bugs; but others can prevent buggy code from being written in the first place -- which may be preferable to an assertion failure at runtime, especially because assertions are not guaranteed to be enabled in production environments.
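The 'set both links together' idea above can be sketched like this (hypothetical `Person`/`Team` classes of my own): one method owns both halves of the bidirectional link, so callers cannot forget to update one side.

```java
import java.util.ArrayList;
import java.util.List;

public class BidirectionalLinkDemo {
    static class Team {
        final List<Person> members = new ArrayList<>();
    }

    static class Person {
        Team team;

        // Defensive design: one method maintains both halves of the link,
        // rather than trusting every caller to update Person and Team in sync.
        void joinTeam(Team t) {
            if (team != null) team.members.remove(this);
            team = t;
            t.members.add(this);
        }
    }

    public static void main(String[] args) {
        Team t = new Team();
        Person p = new Person();
        p.joinTeam(t);
        System.out.println(p.team == t && t.members.contains(p));
    }
}
```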

ghost commented 5 years ago

Just checking...

Is this the same concept as the idea where you "imagine that the person maintaining your code is a psychopath who knows the address where you stay"?

I MAY or MAY NOT have asked this question during one of the tutorials before XD

damithc commented 5 years ago

Is this the same concept as the idea where you "imagine that the person maintaining your code is a psychopath who knows the address where you stay"?

Somewhat different, but I'm sure there is some underlying link. :-p

btzy commented 5 years ago

@damithc I'm not sure if I would fully agree with what you said:

if an object is not supposed to be modified, give out an immutable version of it

If some method modifies this object when it's not supposed to do so, then it's undeniably not conforming to specification (and hence is a bug that should be fixed). We should only give out a mutable version of it if we are explicitly allowing modification.

if there is a bidirectional link between objects, write code to set both links at together rather than trust others to set both links correctly

If this is the case, then the method that sets the links should explicitly document that it sets both links together (and code that calls this method should be allowed to expect and rely on this behaviour).

If it was the latter, then I fully agree. I think it's best that exactly what is expected of a method is explicitly documented (and hence can be relied upon), and the use of anything that is not explicitly allowed should be considered a bug (for example, if getConfig() does not guarantee to callers that they can freely modify the returned Config object, then any modification should (as far as possible) be made to crash the program (or be otherwise detectable)).

I guess my gripe would be when callers of the getConfig() method are told not to modify the returned object, and yet getConfig() makes a copy of the config. It seems like the caller and callee are unsure about what the agreed contract should be.

jackedelic commented 5 years ago

@damithc Can I confirm with you that the only way to "decide" the suitability of the degree of defensiveness for a particular situation is by writing more software? Seems like there is no way to satisfy everyone with regards to the certainty of the correctness of an approach over the alternatives. I guess it's more about the developer's intuition than about the exact degree of defensiveness (I could be wrong).

damithc commented 5 years ago

@damithc Can I confirm with you that the only way to "decide" the suitability of the degree of defensiveness for a particular situation is by writing more software? Seems like there is no way to satisfy everyone with regards to the certainty of the correctness of an approach over the alternatives. I guess it's more about the developer's intuition than about the exact degree of defensiveness (I could be wrong).

That's right. One has to make a judgment call in cases like these. SE is not an exact science. The more experienced you are, the more likely you'll make a reasonable choice.

damithc commented 5 years ago

If some method modifies this object when its not supposed to do so, then it's undeniably not conforming to specification (and hence is a bug that should be fixed). We should only give out an immutable version of it if we are explicitly allowing modification.

If it is possible to find all bugs and fix them in time, we don't need defensive programming at all, yes? :-) Why allow a bug to be created, detected (via an assertion failure, presumably), and then fixed, when it can be prevented from being written at all? A ReadOnlyPerson cannot be modified because there are no mutating methods. The caller doesn't need to read the contract to check if it is mutable or not. On the other hand, it's not easy to detect when an object is modified (when it is not supposed to be), even if you use assertions liberally and even if you enable assertions in production.
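A minimal sketch of the ReadOnlyPerson idea (the field and methods here are my own illustration, not the AddressBook code): the interface exposes no mutators, so code that is handed a ReadOnlyPerson cannot modify it even by accident, and the guarantee holds at compile time with no assertions needed.

```java
public class ReadOnlyDemo {
    // The read-only view: no mutating methods exist on this type at all.
    interface ReadOnlyPerson {
        String getName();
    }

    static class Person implements ReadOnlyPerson {
        private String name;
        Person(String name) { this.name = name; }
        public String getName() { return name; }
        void setName(String name) { this.name = name; } // not on the interface
    }

    public static void main(String[] args) {
        ReadOnlyPerson p = new Person("Alice");
        // p.setName("Bob"); // would not compile: no such method on ReadOnlyPerson
        System.out.println(p.getName());
    }
}
```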

It seems like the caller and callee are unsure about what the agreed contract should be.

Yes, it is a case of the callee not trusting the caller to do the right thing. Such mistrust is considered a virtue by the defensive programmer. Programmers are humans who can inadvertently violate the contracts of the callee (assuming the contract is well specified and was read by the caller -- neither is guaranteed). Similarly, testers are humans who may fail to detect bugs. Every assertion failure in the field, even if it is not safety-critical software, is a blow against the success of the product, e.g., a new game can flop because it crashes too often. Under the defensive programming approach, if everyone writes code that is less error-prone by default, the overall crash rate of the product can be reduced.

Having said all that, defensive programming is not the last word; it's just an approach that has pros and cons. As for now, just know about it and keep it in your arsenal so that you can use it when you think it is useful.