SemanticMediaWiki / SemanticResultFormats

Provides additional visualizations (result formats) for Semantic MediaWiki
https://www.semantic-mediawiki.org/wiki/Extension:Semantic_Result_Formats
Other
45 stars 75 forks source link

Handle unusual printout order and property chains in 'graph' format #744

Closed alex-mashin closed 1 year ago

alex-mashin commented 1 year ago

Handle unusual printout order in 'graph' format, i.e., the one with main column absent or not the first printout.

Per @YOUR1: added 'graphfields' parameter (off by default) that switches non-page properties display to labels within nodes from edges.

Property chains are displayed as edges regardless of the type of the last link.

Unit tests for the 'graph' format rewritten and extended.

Bug: #742

codecov-commenter commented 1 year ago

Codecov Report

Merging #744 (72bfe0d) into master (34138f1) will increase coverage by 0.76%. The diff coverage is 25.86%.

@@             Coverage Diff              @@
##             master     #744      +/-   ##
============================================
+ Coverage     39.59%   40.35%   +0.76%     
- Complexity     2054     2061       +7     
============================================
  Files            75       75              
  Lines          7251     7268      +17     
============================================
+ Hits           2871     2933      +62     
+ Misses         4380     4335      -45     
Impacted Files Coverage Δ
src/Graph/GraphPrinter.php 0.00% <0.00%> (ø)
src/Graph/GraphOptions.php 95.45% <33.33%> (+95.45%) :arrow_up:
src/Graph/GraphFormatter.php 96.29% <87.50%> (+18.22%) :arrow_up:
src/Graph/GraphNode.php 100.00% <0.00%> (+22.72%) :arrow_up:

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

YOUR1 commented 1 year ago

What if the data of $type is only a _txt?

Nothing will be done with the node then, even tho you want to render it.

alex-mashin commented 1 year ago

What if the data of $type is only a _txt? Nothing will be done with the node then, even tho you want to render it.

A text could be a label on a node, representing a page, but it is pointless to render a non-page property as a graph node. A "graph" consisting of disconnected texts is meaningless.

A semantic query that can be meaningfully rendered as a graph should have the ? printout, explicit or implicit.

Moreover, previously (since the inception of this format, I guess), it was assumed that ? was the first printout,

YOUR1 commented 1 year ago

While your merge request fixes the fatal error, it doesn't 'fix' the actual problem.

I've got the following usecase: Screenshot 2022-11-08 at 15 45 31

The fix suggested above doesn't show anything at all. In fact, the whole graph stays empty.

YOUR1 commented 1 year ago

Also; reverting back to the old way the processResultRow was done:

// loop through all row fields
foreach ( $row as $i => $resultArray ) {

    // loop through all values of a multivalue field
    while ( ( /* SMWWikiPageValue */
        $object = $resultArray->getNextDataValue() ) !== false ) {

        // create SRF\GraphNode for column 0
        if ( $i == 0 ) {
            $node = new GraphNode( $object->getShortWikiText() );
            $node->setLabel( $object->getPreferredCaption() );
            $this->nodes[] = $node;
        } else {
            $node->addParentNode( $resultArray->getPrintRequest()->getLabel(), $object->getShortWikiText() );
        }
    }
}

Fixed the problem for me. I'm wondering why this code was changed the way it did.

JeroenDeDauw commented 1 year ago

This code was already complex before, and now it is even more complex. I didn't catch the bug being introduced earlier and have little confidence I will spot bugs in this code, unless I stare at it for a week. Some simplification or unit testing would help. Ideally both.

alex-mashin commented 1 year ago

I've got the following usecase:

What was the query?

Also, this table seems to contain no wikilinked printouts, i.e., no properties of the type 'Page'. Therefore, the graph, representing it, will have no edges. Therefore, what is the point in displaying it as a graph?

YOUR1 commented 1 year ago

I've got the following usecase:

What was the query?

Also, this table seems to contain no wikilinked printouts, i.e., no properties of the type 'Page'. Therefore, the graph, representing it, will have no edges. Therefore, what is the point in displaying it as a graph?

The query is

{{#ask: [[Categorie:RDA Properties]] [[Source::Agent]][[Target::Work]]
 |?Label en
 |?SubPropertyOf.Label en=(Sub) Label en
 |format=graph
}}

With a older version of SRF (and GraphViz) I get: Gebruiker_WikiSysop_digraph_QueryResult_dot

With your changes I get: Diagrams_ab3c16d84c8671285beabf2b989b3f1f

This code was already complex before, and now it is even more complex. I didn't catch the bug being introduced earlier and have little confidence I will spot bugs in this code, unless I stare at it for a week. Some simplification or unit testing would help. Ideally both.

I agree, the old function (as attached above) is a lot easier to understand. To be honest, I really think we need to revert the old code in to the master again, to make sure things are still workings as they have been designed before. And yes, we need a lot more unit testing if we want to implement new features.

alex-mashin commented 1 year ago

I added the option 'graphfields', proposed by @YOUR1 in #747, to this pull request.

I also changed the code, so that property chains are treated as properties of the type 'Page', whether or not the last link is of that type. I suggest that @YOUR1 tests the patch in its current state on his example.

I agree with @JeroenDeDauw that this needs more unit testng. I have rewritten GraphFormatterTest.php to make it expandable, and added a test for a graph with non-page properties to be displayed as fields.

I disagree with @JeroenDeDauw, though, that GraphPrinter::processResultRow() is exceptionally complex — either in comparison with its previous version, or by this format's standards, or by the standards of Semantic Result Formats or the whole SMW family.

alex-mashin commented 1 year ago

@JeroenDeDauw?

JeroenDeDauw commented 1 year ago

Needs rebase

alex-mashin commented 1 year ago

Needs rebase

Haven't you got the option to merge and rebase?

alex-mashin commented 1 year ago

Needs rebase

Seems rebased now.

JeroenDeDauw commented 1 year ago

It is still not mergeable into master

image

alex-mashin commented 1 year ago

It is still not mergeable into master

image

This is what I see in this pull request: Mergeable

JeroenDeDauw commented 1 year ago

You need to rebase with master of this repo, not with master of your fork

alex-mashin commented 1 year ago

@JeroenDeDauw, is it mergeable now?

JeroenDeDauw commented 1 year ago

no

alex-mashin commented 1 year ago

I am abandoning this pull request in favour of a new one (#749), created from scratch.