[BUG REPORT] Dupilcate problem name may exist in one theory file

Hi @albertqjiang, I found a small bug that might cause a failure in verifying the AFP proofs.

The code below from src/main/python/data_extraction/process_data is extracting proof by the problem name. However, some theory files in AFP may contain duplicate problem names (the statement is also the same). So the code below will always keep the last theorem and ignore the previous one.

    # Filter out all the transitions that are not in the proofs
    current_problem_name = None
    problem_name_to_transitions = {}
    proof_open = False
    for transition in good_transitions:
        _, transition_text, proof_level, _ = transition
        # print(transition_text, proof_level)
        if transition_text in problem_names:
            current_problem_name = transition_text
            assert proof_level == 0, transition
            problem_name_to_transitions[current_problem_name] = [transition]
            proof_open = True
        elif proof_level == 0:
            proof_open = False
            continue
        elif proof_open:
            problem_name_to_transitions[current_problem_name].append(transition)
        else:
            pass

    assert None not in problem_name_to_transitions
    assert set(problem_name_to_transitions.keys()) == set(problem_names)

However, the proceed_to_line function in src/main/scala/pisa/server/PisaOS.scala will always stop at the first occurrence of the problem name, making the proof context and the problem itself mismatch.

  def step_to_transition_text(
      isar_string: String,
      after: Boolean = true
  ): String = {
    var stateString: String = ""
    val continue = new Breaks
    Breaks.breakable {
      for (
        (transition, text) <- parse_text(thy1, fileContent).force.retrieveNow
      ) {
        continue.breakable {
          if (text.trim.isEmpty) continue.break
          val trimmed_text =
            text.trim.replaceAll("\n", " ").replaceAll(" +", " ")
          // break out on the isar_string's first occurrence
          if (trimmed_text == isar_string) {
            if (after) stateString = singleTransition(transition)
            return stateString
          }
          stateString = singleTransition(transition)
        }
      }
    }
    println("Did not find the text")
    stateString
  }

I will make a pull request later to fix this bug :). Since the duplicated thorems are not that common in AFP, I don't think this bug will cause that much problem.

albertqjiang / Portal-to-ISAbelle

[BUG REPORT] Dupilcate problem name may exist in one theory file #36