#180 Use a service for Requirement Title and Description expression

GlennPlou commented 1 month ago

After #193, we've chosen to limit the impact our changes can have. An htmltoText() service has been added to format a String ( in particular the ReqIFText attribute of a Requirement), and the LabelHelper.transformHTMLToText() post-process used for all fields has been removed for display in diagrams.

GlennPlou commented 1 month ago

Hi @pdulth, here is our final proposal for issue #180. This patch allows users to convert HTML content to text within their diagrams. The new htmlToText() operation has been incorporated into the default AQL expression and documented, enabling users to update their custom expressions manually if they choose. If they don't update their expression, they'll see HTML tags if there are any, or if they've used the RichText description.

sonarcloud[bot] commented 1 month ago

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

zgrtls commented 1 month ago

If all the expressions pinpointed by @pdulth should output pure text and not HTML, why should the user need to be aware of this and call a new additional method ? In addition to making it more complex for the user, no migration has been added to the add-on to ensure that the needed calls to this new mandatory method is added to the existing preferences/settings.

As far as I have seen, CapellaRequirementsOpenJavaService.getRequirementTitle() is expected to return a single line of pure text and CapellaRequirementsOpenJavaService.getRequirementContent() is expected to return multiple lines of pure text. As CapellaRequirementsOpenJavaService.evalutateExpression() is only called by these two methods, why couldn't the method always transform the HTML into pure text and transform or not the BR into linefeeds through a new additional parameter ?

ldelaigue commented 3 weeks ago

If all the expressions pinpointed by @pdulth should output pure text and not HTML, why should the user need to be aware of this and call a new additional method ? In addition to making it more complex for the user, no migration has been added to the add-on to ensure that the needed calls to this new mandatory method is added to the existing preferences/settings.

As far as I have seen, CapellaRequirementsOpenJavaService.getRequirementTitle() is expected to return a single line of pure text and CapellaRequirementsOpenJavaService.getRequirementContent() is expected to return multiple lines of pure text. As CapellaRequirementsOpenJavaService.evalutateExpression() is only called by these two methods, why couldn't the method always transform the HTML into pure text and transform or not the BR into linefeeds through a new additional parameter ?

The reason why we've had to introduce explicit conversion of HTML to plain text via a java service is that the AQL expression used in the default value of the Requirement contents uses ->sep('\n') to add line breaks in the returned text. If the result of this AQL expression contains HTML and needs to be converted to plain text, then all the '\n' will be converted to spaces because in HTML, '\n' does not mean line break. As an example, you can save the HTML below in a file and open it in a web browser, it will display 'Hello World!' with just one space between 'Hello' and 'World', even though the HTML contains line breaks, tabs and spaces between these 2 words. So, for this approach to work, the AQL expression would need to be modified to use ->sep('<br/>') instead of ->sep('\n'), which also impacts the end users (
is the HTML element that represents a line break). In short, we haven't found any clean solution that would not impact the AQL expression, so we have proposed the solution that seemed to us to be the cleanest.

<html>
<head>
</head>
<body>
Hello

World!
</body>
</html>

pdulth commented 1 day ago

In another tool, we use something like getRequirementTitle and getRequirementContent that get rid or not of the endofline. Can you look if something similar would fit as aql modifications might be impacting for users as contributing to migration seems difficult.


  public String getRequirementTitle(EObject self) {
    return evaluateExpression(element, xxxxx.getExpression(), xxxxx.getMaxLength(), 
        false);
  }

  public String getRequirementContent(EObject self) {
     return evaluateExpression(element, xxxxx.getExpression(), xxxxx.getMaxLength(), 
        true);
  }

///

  private static String evaluateExpression(EObject element, String expression, int maxLength, boolean keepEndl) {
    try {
      Session session = SessionManager.INSTANCE.getSession(element);

      if (session != null && expression != null) {
        IInterpreter interpreter = session.getInterpreter();
        if (interpreter != null) {
          Object value = interpreter.evaluate(element, expression);
          StringBuilder resultBuilder = new StringBuilder();
          if (value instanceof List<?>) {
            for (Object item : (List<?>) value) {
              resultBuilder.append(item);
            }
          } else {
            resultBuilder.append(value);
          }
          String evaluationResult = resultBuilder.toString();
          String sanytizedResult = getTextFromHtml(evaluationResult, keepEndl);
          return reduceString(sanytizedResult, maxLength);
        }
      }
    } catch (EvaluationException ex) {
      return "<Undefined>";
    }
    return "<Undefined>";
  }

  private static String reduceString(final String value, final int maxLength) {
    if (value.length() > maxLength) {
      return value.substring(0, maxLength).concat("...");
    }
    return value;
  }

  public static String getTextFromHtml(String html, boolean keepMultiline) {
    if (html == null) {
      return null;
    }

    Html2TextVisitor visitor = new Html2TextVisitor(keepMultiline);
    Document htmlDoc = Jsoup.parse(html);
    new NodeTraversor(visitor).traverse(htmlDoc);

    String text = visitor.toString();
    text = text.trim();
    return text;
  }

  private static class Html2TextVisitor implements NodeVisitor { 

    private static final Pattern MULTI_BLANKS_PATTERN = Pattern.compile("\\p{javaWhitespace}+");

    private final StringBuilder str = new StringBuilder();
    private final boolean keepMultiline;

    public Html2TextVisitor(boolean keepMultiline) {
      this.keepMultiline = keepMultiline;
    }

    private void addNewLineIfNoneBefore() {
      if (str.length() == 0 || str.charAt(str.length() - 1) == '\n') {
        return;
      }

      str.append('\n');
    }

    private void addBlankIfNoneBefore() {
      if (str.length() == 0 || Character.isWhitespace(str.charAt(str.length() - 1))) {
        return;
      }

      str.append(' ');
    }

    private void addBlankOrNewLineIfNoneBefore() {
      if (keepMultiline) {
        addNewLineIfNoneBefore();
      } else {
        addBlankIfNoneBefore();
      }
    }

    @Override
    public void head(Node node, int depth) {
      String name = node.nodeName();
      if (node instanceof TextNode) {
        String internalText = ((TextNode) node).text();
        // Internal text does not contains any formatting (as defined by HTML except inside "pre" tag) so multiple
        // blanks are equals to 1 and newline and stuff is the same, so replace them all by a single space
        // pre tags does not seem to be used by Capella editors so do not handle it
        String pureText = MULTI_BLANKS_PATTERN.matcher(internalText).replaceAll(" ");
        pureText = pureText.trim();
        str.append(pureText);
      } else if (name.equals("li")) {
        // Basic handling of list items (no difference between numbered and dotted nor depth)
        if (keepMultiline) {
          addNewLineIfNoneBefore();
          str.append(" * ");
        } else {
          addBlankIfNoneBefore();
          str.append("* ");
        }
      } else if (Arrays.asList("p", "h1", "h2", "h3", "h4", "h5", "tr").contains(name)) {
        addBlankOrNewLineIfNoneBefore();
      }
    }

    @Override
    public void tail(Node node, int depth) {
      String name = node.nodeName();
      if (name.equals("br")) {
        str.append(keepMultiline ? '\n' : ' ');
      } else if (Arrays.asList("li", "p", "h1", "h2", "h3", "h4", "h5").contains(name)) {
        addBlankOrNewLineIfNoneBefore();
      }
    }

    @Override
    public String toString() {
      return str.toString();
    }
}

eclipse-capella / capella-requirements-vp

#180 Use a service for Requirement Title and Description expression #194

Quality Gate passed