clulab / eidos

Machine reading system for World Modelers
Apache License 2.0

Output groundings in a nicer format #1078

Closed: kwalcock closed this issue 2 years ago

kwalcock commented 2 years ago

In particular, for EidosShell, the webapp, and GroundFromText.

kwalcock commented 2 years ago

This hopefully helps with testing the groundings. Suggestions for more informative output are welcome. The change is more complicated than it might need to be: the traditional display without groundings is based on (Odin)Mentions, while grounding is only a part of EidosMentions, so a lot of conversion was necessary.

Note that the EidosShell produces a file, eidosshell.html, containing an HTML version of the output, which might be easier and prettier to work with. The webapp also writes a text representation to the console, so with either program, both outputs are available.

Can someone (@zupon?) remind me what this means: "Is it possible to add the grounding log exporter to the webapp too?" What is the "grounding log exporter"? It doesn't seem to be a class. Could it just as well be added to the EidosShell instead? A webapp usually doesn't have side effects such as writing output to a file that is then accessed manually, but it could.

zupon commented 2 years ago

I think that's referring to the cool stuff Becky added to the groundingInsightExporter. Things like including the specific examples from the matching nodes with their similarity scores, exact match details, and so forth, giving a more detailed picture than just the top grounding and score. Here's a simple example of the output:

EFFECT:

mention text: famine

mention entities: WrappedArray(O)   

===== Theme =====
--------GROUNDING INFO----------
  NODE: wm/concept/crisis_or_disaster/famine

     score: 1.0
     exact matches and regex:
         Exact Match: wm/concept/crisis_or_disaster/famine  (1.0f)

     Positive examples:
       num examples: 3
       examples with top score:
          --> famine    (0.9999998)
          --> starvation    (0.7397819)
          --> hunger    (0.6007418)
       max match: famine (0.9999998)
       min match: hunger (0.6007418)
       avg match: 0.7801745)
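
For readers unfamiliar with the summary lines, here is a minimal sketch of how the max, min, and avg figures relate to the per-example similarity scores. It is a language-neutral illustration in Python, not the exporter's actual code; `summarize_examples` and its `top_k` parameter are hypothetical names.

```python
from statistics import mean

def summarize_examples(scored, top_k=5):
    """Summarize (example, similarity) pairs in the style of the log above.

    `scored` is a list of (text, score) tuples. `top_k` is a hypothetical
    cap on how many examples are listed individually.
    """
    if not scored:
        raise ValueError("need at least one scored example")
    # Rank from most to least similar so max/min are the two ends.
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return {
        "num_examples": len(ranked),
        "top": ranked[:top_k],
        "max": ranked[0],
        "min": ranked[-1],
        "avg": mean([score for _, score in ranked]),
    }

# Scores copied from the famine example above (order shuffled on purpose).
famine = [("starvation", 0.7397819), ("famine", 0.9999998), ("hunger", 0.6007418)]
summary = summarize_examples(famine)
print("max match:", summary["max"])
print("min match:", summary["min"])
print("avg match:", summary["avg"])
```

Up to floating-point rounding, the average of the three scores matches the "avg match" line in the log.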

kwalcock commented 2 years ago

This seems to work by tracking down the grounder and asking it to do a little more investigation, which gives more insight into the situation. I've plugged directly into that exporter and had it add output to EidosShell and GroundFromText. The output is messy. Many of these things just scribble to System.out or to a string rather than produce a structure that could be formatted in different ways. Both apps have a boolean that controls whether this information is included; I'd like to default it to false and have interested developers turn it on. Please let me know if this is adequate or whether it should go through another round.

entities:
List(Concept, Entity) => Water trucking
        ------------------------------
        Rule => simple-np++Decrease_syntax_verb_agent
        Type => TextBoundMention
        ------------------------------
        Concept, Entity => Water trucking
         * Attachments: Decrease(decreased,None)
        ------------------------------
        THEME: wm/concept/goods/water (1.0)
        THEME PROCESS: wm/concept/infrastructure/transportation/road_infrastructure (0.7262313)
        ------------------------------
        mention text: Water trucking

        mention entities: WrappedArray(O, O)

        ===== Theme =====
        --------GROUNDING INFO----------
          NODE: wm/concept/goods/water

             score: 1.0
             exact matches and regex:

             Positive examples:
               num examples: 3
               examples with top score:
                  --> water     (0.57073784)
                  --> aqua      (0.22173807)
                  --> H20       (0.098248966)
               max match: water (0.57073784)
               min match: H20 (0.098248966)
               avg match: 0.2969083)

        ===== Theme Properties =====
        ===== Theme Process =====
        --------GROUNDING INFO----------
          NODE: wm/concept/infrastructure/transportation/road_infrastructure

             score: 0.7262313
             exact matches and regex:

             Positive examples:
               num examples: 4
               examples with top score:
                  --> trucking  (0.570738)
                  --> trucks    (0.4017349)
                  --> highway   (0.36385995)
                  --> road      (0.3293324)
               max match: trucking (0.570738)
               min match: road (0.3293324)
               avg match: 0.4164163)

        --------GROUNDING INFO----------
          NODE: wm/concept/infrastructure/transportation/

             score: 0.58448964
             exact matches and regex:

             Positive examples:
               num examples: 1
               examples with top score:
                  --> transportation    (0.47544187)
               max match: transportation (0.47544187)
               min match: transportation (0.47544187)
               avg match: 0.47544187)

        --------GROUNDING INFO----------
          NODE: wm/process/transportation/

             score: 0.58448964
             exact matches and regex:

             Positive examples:
               num examples: 1
               examples with top score:
                  --> transportation    (0.47544187)
               max match: transportation (0.47544187)
               min match: transportation (0.47544187)
               avg match: 0.47544187)

        ===== Theme Process Props =====
        ------------------------------

List(Concept, Entity) => cost of fuel
        ------------------------------
        Rule => gazetteer++simple-np
        Type => TextBoundMention
        ------------------------------
        Concept, Entity => cost of fuel
         * Attachments: Property(cost,None)
        ------------------------------
        THEME: wm/concept/goods/fuel (1.0)
        Theme properties: wm/property/price_or_cost (1.0)
        ------------------------------
        mention text: cost of fuel

        mention entities: WrappedArray(B-Property, O, O)

        ===== Theme =====
        --------GROUNDING INFO----------
          NODE: wm/concept/goods/fuel

             score: 1.0
             exact matches and regex:
                 Exact Match: wm/concept/goods/fuel     (1.0f)

             Positive examples:
               num examples: 3
               examples with top score:
                  --> fuel      (0.9999997)
                  --> gas       (0.76110333)
                  --> oil       (0.60306484)
               max match: fuel (0.9999997)
               min match: oil (0.60306484)
               avg match: 0.78805596)

        ===== Theme Properties =====
        --------GROUNDING INFO----------
          NODE: wm/property/price_or_cost

             score: 1.0
             exact matches and regex:
                 Exact Match: wm/concept/goods/fuel     (1.0f)

             Positive examples:
               num examples: 25
               examples with top score:
                  --> costs     (0.47039852)
                  --> costs     (0.47039852)
                  --> costs     (0.47039852)
                  --> costs     (0.47039852)
                  --> costs     (0.47039852)
               max match: costs (0.47039852)
               min match: recurrent (0.14967977)
               avg match: 0.380322)

        ===== Theme Process =====
        ===== Theme Process Props =====
        ------------------------------

events:
List(Causal, DirectedRelation, EntityLinker, Event) => Water trucking has decreased due to the cost of fuel
        ------------------------------
        Rule => dueToSyntax2-Causal
        Type => EventMention
        ------------------------------
        trigger => due
        effect (Concept, Entity) => Water trucking
          * Attachments: Decrease(decreased,None)
        cause (Concept, Entity) => cost of fuel
          * Attachments: Property(cost,None)
        ------------------------------

        ------------------------------
        mention text: Water trucking has decreased due to the cost of fuel

        mention entities: WrappedArray(O, O, O, O, O, O, O, B-Property, O, O)
        ------------------------------

==================================================
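
The two design points raised before this output (produce a structure instead of writing straight to System.out, and gate the extra detail behind a boolean that defaults to false) can be sketched as follows. This is an illustration in Python under stated assumptions, not Eidos code: `GroundingInsight`, `format_text`, `format_html`, `display_mention`, and `include_insight` are all hypothetical names.

```python
from dataclasses import dataclass, field
from html import escape
from typing import List, Tuple

@dataclass
class GroundingInsight:
    """Hypothetical structured record for one grounded node."""
    node: str
    score: float
    examples: List[Tuple[str, float]] = field(default_factory=list)

def format_text(insight):
    # Plain-text rendering, loosely modeled on the console output above.
    lines = [f"NODE: {insight.node}", f"  score: {insight.score}"]
    lines += [f"  --> {text} ({sim})" for text, sim in insight.examples]
    return "\n".join(lines)

def format_html(insight):
    # The same record rendered as HTML; text is escaped before embedding.
    items = "".join(f"<li>{escape(t)} ({s})</li>" for t, s in insight.examples)
    return (f"<h4>{escape(insight.node)}</h4>"
            f"<p>score: {insight.score}</p><ul>{items}</ul>")

def display_mention(mention_text, insights, include_insight=False,
                    formatter=format_text):
    """Render a mention; insight detail is off unless explicitly enabled."""
    parts = [f"mention text: {mention_text}"]
    if include_insight:
        parts += [formatter(i) for i in insights]
    return "\n".join(parts)

water = GroundingInsight("wm/concept/goods/water", 1.0,
                         [("water", 0.57073784), ("aqua", 0.22173807)])
print(display_mention("Water trucking", [water]))
print(display_mention("Water trucking", [water], include_insight=True))
```

Because the formatter is a parameter, one record can feed both a console view and an HTML view without the producer knowing which one is in use.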

zupon commented 2 years ago

That should be helpful! If I wanted to trim out some of the output, could I just edit the specific exporter? I also like the idea of a toggle so it's not always on.

MihaiSurdeanu commented 2 years ago

This output is extremely useful. Thanks, @kwalcock!

kwalcock commented 2 years ago

Yes, the exporter can be edited. One useful thing about all the saved commits is that they help us see where things need to be changed. That's half the battle.

I noticed a discrepancy in the output and need to clear that up before merging.
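
The thread does not say which discrepancy was noticed. As one example of a check that can be run mechanically on output in the format above, the sketch below scans a log for "Exact Match" lines that name a different node than the enclosing "NODE" header. It is written in Python for illustration only; `mismatched_exact_matches` is a hypothetical helper, and the assumption that an exact match should always name its own node is mine, not something the thread confirms.

```python
import re

NODE_RE = re.compile(r"^\s*NODE:\s*(\S+)")
EXACT_RE = re.compile(r"^\s*Exact Match:\s*(\S+)")

def mismatched_exact_matches(log_text):
    """Return (node, exact_match) pairs where the two paths differ."""
    current_node = None
    mismatches = []
    for line in log_text.splitlines():
        node = NODE_RE.match(line)
        if node:
            # Remember which node the following detail lines belong to.
            current_node = node.group(1)
            continue
        exact = EXACT_RE.match(line)
        if exact and current_node is not None and exact.group(1) != current_node:
            mismatches.append((current_node, exact.group(1)))
    return mismatches

# Lines copied from the "cost of fuel" output earlier in the thread.
sample = """\
  NODE: wm/concept/goods/fuel
     Exact Match: wm/concept/goods/fuel     (1.0f)
  NODE: wm/property/price_or_cost
     Exact Match: wm/concept/goods/fuel     (1.0f)
"""
print(mismatched_exact_matches(sample))
```

On that excerpt the check reports the price_or_cost block, whose exact-match line names the fuel node.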