albertogoffi / toradocu

Toradocu - automated generation of test oracles from Javadoc documentation
Other
42 stars 21 forks source link

Wrong translation and condition order (null dereference) #221

Open ariannab opened 6 years ago

ariannab commented 6 years ago

On class org.apache.commons.collections4.iterators.NodeListIterator, the following comment:

@param node Node, who's child nodes are wrapped by this class. Must not be null

is wrongly translated as: (node.hasChildNodes()) || ((node==null) == false)

while the right translation should be: ((node==null) == false) && node.hasChildNodes()

I think there are multiple problems here: 1) a likely wrong preprocessing by the ParamTranslator (to verify, might be another problem in the sentence processing); 2) a missing check over the final specification, in particular about the order of the conditions and the dereferences.

ariannab commented 6 years ago

I see 2 possible solutions for the null dereferencing issue:

1) always check for null before referencing something in a translation. So, if the translated condition isx.something(), we complete it as: (x!=null && x.something()) .

2) check when Toradocu explicitly produces a condition stating that a code element is !=null and give such condition the highest priority (put it as first in the translation if more conditions are involved). Plus, set the conjunction to AND: currently the conjunction is always se to OR when none is explicitly specified in the comment (line 61 of BasicTranslator)

I find the solution 1) legitimate and less confusing than 2).

albertogoffi commented 6 years ago

Example

I believe the correct translation of the comment

@param node Node, who's child nodes are wrapped by this class. Must not be null

is (node == null) == false).

The Javadoc comment has several errors, including a semantic one: a pre- and an exceptional post-condition conflict.


Null checks

Solution 1 (i.e., checking every time for null references) LGTM. The reason is that if developer predicate on a property of a referenced object, s/he is implicitly requiring a non-null reference. Are there comments that violate this assumption?

ariannab commented 6 years ago

After #234, the translation is: (((args[0]==null)==false) && (args[0].hasChildNodes()))

Should we close the issue?

albertogoffi commented 6 years ago

Should we close the issue?

I think we should keep this issue open because of the wrong && (args[0].hasChildNodes()) part of the generated specification.

ariannab commented 6 years ago

The Stanford Parser produces the following graph for the first part of the comment:

-> wrapped/VBN (root)
  -> Node/NNP (nsubjpass)
    -> who/WP (ref)
    -> nodes/NNS (acl:relcl)
      -> Node/NNP (nsubj)
      -> 's/VBZ (cop)
      -> child/NN (compound)
  -> are/VBP (auxpass)
  -> class/NN (nmod:agent)
    -> by/IN (case)
    -> this/DT (det)

The 's is not interpreted as possessive case, but as "is" (to be). Updating the Stanford Parser library to the most recent version (3.9.1) doesn't change the result.

Edit: actually, the problem is the sentence is grammatically incorrect. Who's can only mean who is or who has (thus Toradocu match would even be correct). The correct pronoun should be whose.