fmidue / logic-tasks

0 stars 1 forks source link

Implement minDepth in SynTreeConfig #111

Closed nimec01 closed 5 months ago

nimec01 commented 6 months ago

This adds a new minDepth config option, as described in #77. The "config checker" probably needs to be a bit more restrictive, since there are cases that fail to generate a SynTree with a valid config (probably because of how the generator assembles the tree). However, the current implementation does not break anything when using the default configs and passes the tests.

I also added a test case that checks whether the default config passes the config check.

jvoigtlaender commented 6 months ago

About the generation sometimes failing: Do you have pbserved specific configuration settings where this happens?

I'd venture to as much as possible (or maybe even completely) prevent such cases by the following measures:

(The third item is quite speculative, actually, since I do not really know/remember in detail what that property there is checking. :smile:)

jvoigtlaender commented 6 months ago

The "third item" would probably make the "first item" (rejecting minNodes < minDepth or maxNodes < minDepth) superfluous to have at all.

nimec01 commented 6 months ago

To be honest, I don't really understand the concept/reasoning behind the third item.

nimec01 commented 6 months ago

One particular config that leads to problems is:

SynTreeConfig
  { minNodes = 4
  , maxNodes = 15
  , minDepth = 4
  , maxDepth = 4
  , usedLiterals = "ABCDE"
  , atLeastOccurring = 2
  , allowArrowOperators = False
  , maxConsecutiveNegations = 5
  , minUniqueBinOperators = 1
  }

Sometimes this config provides a valid tree, but sometimes it gets caught up in an endless loop. This config is still valid after applying the first two items mentioned in one of the comments above.

jvoigtlaender commented 6 months ago

The aim of the "third item" (in the form it currently is already in the code, referring to maxNodes) is to express some constraints on the sizes that come from "how many negations are allowed". For example, if no negation at all is allowed, then all inner nodes in a tree have to be binary. Hence, the tree sizes will grow quite quickly with the depth. Having maxDepth = 4 then entails that it does not make sense to have maxNodes = 7, for example. Since if one wants at most 7 nodes, then one should be content with depth 3. On the other hand, if negation is allowed (i.e., maxConsecutiveNegations > 0), then it does make sense to have maxNodes = 7 and maxDepth = 4. Since then there is indeed a tree with 7 nodes and depth 4.

I'm not sure the expression that was put in the code there is the best/tightest check possible for this, but the above is its purpose anyway.

So, I thought it might make sense to also transfer that check to the minDepth and minNodes settings (which is not to say that it will necessarily fix the nontermination problems you encountered).

jvoigtlaender commented 6 months ago

One additional check needed that becomes apparent from your example is that maxDepth = 4 and maxConsecutiveNegations = 5 should not be allowed to appear together. It should always be the case that maxConsecutiveNegations < maxDepth.

jvoigtlaender commented 6 months ago

Okay, I've looked at it a bit more.

I do think that the additional checks ("third item" and ensuring maxConsecutiveNegations < maxDepth, and rejecting minNodes > maxNodesForDepth minDepth) should be added.

But even that won't fix the problem competely (I think already your example indicates that). There is an issue with the specific "generate-and-test" structure in genSynTree. It currently has this form:

  do
    nodes <- choose (minNodes, maxNodes) `suchThat` someCondition
    sample <- syntaxShape nodes maxDepth allowArrowOperators hasNegations
      `suchThat` anotherCondition
    usedList <- -- continue with what we have generated thus far

The problem, I think, is that there can be a choice for nodes that satisfies someCondition but makes it impossible for syntaxShape nodes to successfully make anotherCondition true. In other words, the nodes value to which the first line commits may not be something that can be completed for the rest of the generation. But the way the code is structured right now makes it impossible to revise the commitment to that nodes values once it is chosen.

I hope some restructuring as follows will cure the situation:

  do
    sample <-
      (do nodes <- choose (minNodes, maxNodes) `suchThat` someCondition
          syntaxShape nodes maxDepth allowArrowOperators hasNegations)
        `suchThat` anotherCondition
    usedList <- -- continue with what we have generated thus far
nimec01 commented 6 months ago

I hope some restructuring as follows will cure the situation:

  do
    sample <-
      (do nodes <- choose (minNodes, maxNodes) `suchThat` someCondition
          syntaxShape nodes maxDepth allowArrowOperators hasNegations)
        `suchThat` anotherCondition
    usedList <- -- continue with what we have generated thus far

Changing this works perfectly fine. The problems I encountered with the config I provided above are now gone.

jvoigtlaender commented 6 months ago

Given the substantial overall changes to the config checker here, can you add tests that check the validBoundsSynTree and invalidBoundsSynTree generators do indeed generate valid and invalid configurations, respectively?

jvoigtlaender commented 6 months ago

(Actually, invalidBoundsSynTree is never used by anybody at all?)

nimec01 commented 6 months ago

(Actually, invalidBoundsSynTree is never used by anybody at all?)

Apparently not.

jvoigtlaender commented 6 months ago

Drop it, then. Probably in a separate pull request. Along with any other stuff flagged by https://github.com/fmidue/logic-tasks/commit/ffecc490d03f3d0d40096155463f84024d8d8199.

nimec01 commented 6 months ago

Given the substantial overall changes to the config checker here, can you add tests that check the validBoundsSynTree and invalidBoundsSynTree generators do indeed generate valid and invalid configurations, respectively?

validBoundsSynTree in its current form does not generate valid configs. The following implementation does so:

validBoundsSynTree :: Gen SynTreeConfig
validBoundsSynTree = do
  allowArrowOperators <- elements [True, False]
  maxConsecutiveNegations <- choose (0, 3)
  usedLiterals <- sublistOf ['A' .. 'Z'] `suchThat` (not . null)
  atLeastOccurring <- choose (2, fromIntegral $ length usedLiterals)
  minNodes <- choose (atLeastOccurring * 2, 50) `suchThat` \minNodes' -> maxConsecutiveNegations /= 0 || odd minNodes'
  let minDepth = 1 + floor (logBase (2 :: Double) $ fromIntegral minNodes)
  maxDepth <- choose (max (maxConsecutiveNegations + 1) minDepth, 10)
  minUniqueBinOperators <- choose (1, min (fromIntegral $ length [minBound .. maxBound :: BinOp]) maxDepth - 1)
  maxNodes <- choose (minNodes, maxNodesForDepth maxDepth) `suchThat` \minNodes' -> maxConsecutiveNegations /= 0 || odd minNodes'
  return $ SynTreeConfig {
    maxNodes,
    minNodes,
    minDepth,
    maxDepth,
    usedLiterals,
    atLeastOccurring,
    allowArrowOperators,
    maxConsecutiveNegations,
    minUniqueBinOperators
  }

However, this leads to new problems. The config checker still might not be as strict as it needs to be. Several test cases using validBoundsSynTree (LegalPropositionSpec for example) are failing after this change (due to non-termination?).

jvoigtlaender commented 6 months ago

Ah, interesting. I guess we should take a closer look at a specific configuration that leads to the generator failing/hanging. Hopefully, insight can be gained from that about what the issue is.

jvoigtlaender commented 6 months ago

Are you sure the above does only generate configs satisfying the current checker, though?

These lines:

  usedLiterals <- sublistOf ['A' .. 'Z'] `suchThat` (not . null)
  atLeastOccurring <- choose (2, fromIntegral $ length usedLiterals)

look like they can generate usedLiterals = "A" along with atLeastOccurring = 2. But that should already be forbidden by the checker.

nimec01 commented 5 months ago

Are you sure the above does only generate configs satisfying the current checker, though?

These lines:

  usedLiterals <- sublistOf ['A' .. 'Z'] `suchThat` (not . null)
  atLeastOccurring <- choose (2, fromIntegral $ length usedLiterals)

look like they can generate usedLiterals = "A" along with atLeastOccurring = 2. But that should already be forbidden by the checker.

The cases generated by QuickCheck weren't diverse enough to reveal that. This does not fix the problems, though. Here are some problematic configs:

SynTreeConfig {
minNodes = 25, 
maxNodes = 149, 
minDepth = 5, 
maxDepth = 9, 
usedLiterals = "EFGOQUWXYZ", 
atLeastOccurring = 10, 
allowArrowOperators = False, 
maxConsecutiveNegations = 1, 
minUniqueBinOperators = 4
}

SynTreeConfig {
minNodes = 31, 
maxNodes = 142, 
minDepth = 5, 
maxDepth = 8, 
usedLiterals = "BFJMNOQRSTUWXZ", 
atLeastOccurring = 13, 
allowArrowOperators = False, 
maxConsecutiveNegations = 1, 
minUniqueBinOperators = 4
}

SynTreeConfig {
minNodes = 21, 
maxNodes = 67, 
minDepth = 5, 
maxDepth = 7, 
usedLiterals = "ABCFGHIKLNUX", 
atLeastOccurring = 1, 
allowArrowOperators = False, 
maxConsecutiveNegations = 0, 
minUniqueBinOperators = 4
}

SynTreeConfig {
minNodes = 20, 
maxNodes = 648, 
minDepth = 5, 
maxDepth = 10, 
usedLiterals = "BDEFGHLMNQSUVWXYZ", 
atLeastOccurring = 8, 
allowArrowOperators = False, 
maxConsecutiveNegations = 1, 
minUniqueBinOperators = 4
}
jvoigtlaender commented 5 months ago

Could the underlying problem simply be that

allowArrowOperators = False, 

contradicts

minUniqueBinOperators = 4

?

nimec01 commented 5 months ago

Could the underlying problem simply be that

allowArrowOperators = False, 

contradicts

minUniqueBinOperators = 4

?

Yes. The config checker did not handle the special case when allowArrowOperators = False. Most of the test cases work now. However, the new implementation of validBoundsSynTree seems to generating SynTrees with the same operand next to each other (sameAssociativeOperatorAdjacent). QuickCheck then does not find enough cases and gives up. The test case could be fixed fairly easy, but I don't know if there is more to it.

jvoigtlaender commented 5 months ago

I'm not sure what you mean by

the new implementation of validBoundsSynTree seems to generating SynTrees with the same operand next to each other (sameAssociativeOperatorAdjacent)

Are you talking about the generator of syntrees? Wasn't that able to generate trees satisfying sameAssociativeOperatorAdjacent already all along?

nimec01 commented 5 months ago

I wanted to say that there is some sort of problem with this test case. QuickCheck cannot "find" enough generated SynTrees that satisfy not sameAssociativeOperatorAdjacent. This could either be a problem with the new config generator used in the tests or the actual tree generator itself, although I don't think it's the latter.

jvoigtlaender commented 5 months ago

I think it makes sense to simply constrain here https://github.com/fmidue/logic-tasks/blob/5cdede4321e06a0a589ca913f5629a16e4af4e6a/test/SuperfluousBracketsSpec.hs#L79-L83 that only valid configs up to a certain depth are considered (for this specific test case). Because with larger depth it becomes quit unlikely that one doesn't have adjacent same operators.

So, probably something like forAll (validBoundsSuperfluousBrackets `suchThat` (k >) . maxDepth) with suitably chosen fixed k.