anuj-ti / kb-issue-ref


Ref- Broken json [edges without target field] #1

Open anuj-ti opened 1 year ago

anuj-ti commented 1 year ago

Problem: The edges prompt generated broken JSON in which some edges are missing the target field.

This ref is part of this issue: issue

2 of the 106 generated sub-KGs had this issue.
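A minimal sketch of how the affected sub-KGs could be counted, assuming each output has already been loaded into a Python dict with the same shape as the outputs shown in this issue (an "id" plus "kg" with "nodes" and "edges"). The function names, the list of required keys, and the sample ids are assumptions made for illustration; they are not part of the original pipeline.

```python
from typing import Any

# Assumption: a well-formed edge carries all three of these keys.
REQUIRED_EDGE_KEYS = ("source", "type", "target")


def edges_missing_keys(output: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the edges of one sub-KG output that lack a required key.

    A key that is present but empty (for example "") also counts as missing.
    """
    edges = output.get("kg", {}).get("edges", [])
    return [e for e in edges if any(not e.get(k) for k in REQUIRED_EDGE_KEYS)]


def affected_ids(outputs: list[dict[str, Any]]) -> list[str]:
    """Return the ids of outputs that contain at least one incomplete edge."""
    return [o["id"] for o in outputs if edges_missing_keys(o)]


# Two tiny outputs shaped like the ones in this issue (hypothetical ids).
sample_outputs = [
    {"id": "doc_1", "kg": {"nodes": [], "edges": [
        {"source": "Twilio", "type": "PART_OF", "target": "AWS Chime"},
    ]}},
    {"id": "doc_2", "kg": {"nodes": [], "edges": [
        {"source": "AWS Kendra", "type": "IMPORTANT_AWS_SERVICE"},
    ]}},
]

print(affected_ids(sample_outputs))
```

Run over all generated outputs, a check like this would give the list of sub-KGs to inspect or regenerate.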

1)

"Document ID": "19ODQOv5seK4n4xOONHpdAsz7dPbSIQFSuoRJ-nOCiAM_6"

Chunk details:

{
  "id": "19ODQOv5seK4n4xOONHpdAsz7dPbSIQFSuoRJ-nOCiAM_6",
  "text": "a|\n=== Teardown Summary\n\nWhat is this teardown addressing about the product?\n\nXxxxx Work Unit Tasks Table is still present for Exec Review (following\nfinal assembly).\n\n[Need to re-add the Work Unit Tasks Table during the Scope Task\nfollowing a spec being rejected at Exec Review].\n\n| |\n|*Introduction* | |\n\na|\nXxxxx Leaves blank: introduction\n\n[Provide a succinct (one paragraph) background about this teardown and\nhow it originated. Some examples:\n\n* {blank}\n+\n____\nOur DevFlows platform is an iPaaS solution and we’re interested in\nlearning from competitive products. This spec tears down Zapier to give\nus ideas of how and what we should improve in DevFlows.\n____\n* {blank}\n+\n____\nXant.ai is a SaaS software company that ESW are evaluating for\nacquisition, and would be incorporated into the Aurea business unit.\n____\n* {blank}\n+\n____\nAWS Kendra is an important AWS service and we are writing a teardown as\na way to prompt the AWS Product Team to answer some questions we have\nand address limitations we have observed.\n____\n\nAlso can include links to other specs that the EVP wants the work unit\nauthors and the P2 author to be aware of.\n\n]\n\n| |\n\n|*Questions That Are Answered* | |\n\na|\nXxxxx Leaves blank: Questions answered\n\nXxxxx Fails to establish the questions that should be answered by this\nspec as a clear list of critical questions.\n\nXxxxx List of questions has more than 10, and/or compound questions are\nused.\n\n[ A high level (5-7) list of questions that will be answered by this\nspec. This sets scope for the work units and the P2 author. For example:\n\nZapier teardown comparing to Devflows\n\n* {blank}\n+\n____\nWhat constraints does Zapier place on flow developers to keep their\nproduct easy to use?\n____\n* {blank}\n+\n____\nHow has Zapier built such a large list of 3rd party integrations? Is\nthere techdiff there we can learn from?\n____\n* {blank}\n+\n____\nWhat capabilities make flow development easy for citizen integrators?\n____\n* {blank}\n+\n____\nWhat “language” features does the Zapier flow environment support.\n____\n\nXant.ai pre-acquisition:\n\n* {blank}\n+\n____\nThe incoming survey shows that Xant.ai has a large telephony component.\nHow hard would it be to move this to Twilio or AWS Chime?\n____\n* {blank}\n+\n____\nWhat are the major seams in the product and how could we use these to\nre-platform?\n____\n* {blank}\n+\n____\nHow cloud native, or otherwise, is this product currently?\n____\n\n]\n\n| |\n\n|*Inputs* | |\n\na|\nXxxxx List of inputs is missing.\n\nXxxxx Proposes conducting work units for which inputs are missing.\n\n[ Quick check whether survey, source code links, data schemas and PCA\ndocuments (for Trilogy internal products) are available. Indicate\nwhether the software seams and data structures work unit should be\nattempted based on whether the inputs exist (doesn’t check the quality\nof them at this point).\n\n]\n\n| |\n\n|*Insights to Spec Contributors* | |\n\na|\nXxxxx Provide insight to spec contributors.\n\n**[**Provide guidance that is useful for spec contributions, such as:\n\n* {blank}\n+\n____\nWhether the whiteboard diagram is most likely to be an architecture\ndocument, an ERD, an AWS smorgasbord diagram, a process diagram a\nflowchart, etc.\n____\n* {blank}\n+\n____\nWhat you are curious about in the APIs. Is it the actions? Or the data\nrelationships (hint: ERD)?\n____\n* {blank}\n+\n____\nWhere you want the core focus to be.\n____\n* {blank}\n+\n____\nWho is the customer (as in “customer CIVs”)? Is there more than one\ncustomer (e.g. in a B2B2C system)? What is the product? (Sometimes, the\nproduct that is sold to customers may be different to the scope of the\nteardown).\n____\n\n]\n\n| |",
  "metadata": {
    "source": "https://docs.google.com/document/d/1lw5CdLy49C6_eog2LpQBsmrcFRuFsmsBzOvz08oL6pQ",
    "originalModifiedTime": "2022-04-21T09:21:28.683Z",
    "parents": "llm-kb-dataset",
    "name": "ascii/19ODQOv5seK4n4xOONHpdAsz7dPbSIQFSuoRJ-nOCiAM.json",
    "originalCreatedTime": "2022-04-21T09:21:13.865Z",
    "mimeType": "application/json",
    "originalOwnersEmail": "spec.automation@devfactory.com",
    "kgNodeId": null,
    "originalOwnersName": "Spec Automation",
    "originalFileSource": "https://docs.google.com/document/d/1lw5CdLy49C6_eog2LpQBsmrcFRuFsmsBzOvz08oL6pQ"
  }
}

Output

{
  "id": "19ODQOv5seK4n4xOONHpdAsz7dPbSIQFSuoRJ-nOCiAM_6",
  "kg": {
    "nodes": [
      {
        "id": "1.15 Pre-mortem",
        "type": "concept",
        "dbpedia": null,
        "desc": "A concept related to pre-mortem analysis"
      },
      {
        "id": "1.16 All authors decision-supporting data",
        "type": "concept",
        "dbpedia": null,
        "desc": "A concept related to decision-supporting data"
      },
      {
        "id": "1.17 A spec statement",
        "type": "concept",
        "dbpedia": null,
        "desc": "A concept related to spec statements"
      },
      {
        "id": "1.18 A part of the document",
        "type": "concept",
        "dbpedia": null,
        "desc": "A concept related to document parts"
      },
      {
        "id": "1.19 Undefined component name",
        "type": "concept",
        "dbpedia": null,
        "desc": "A concept related to undefined component names"
      },
      {
        "id": "1.20 There is no ITD or IF",
        "type": "concept",
        "dbpedia": null,
        "desc": "A concept related to missing ITD/IF"
      },
      {
        "id": "1.21 Ambiguous or indeterminate quantifier",
        "type": "concept",
        "dbpedia": null,
        "desc": "A concept related to ambiguous or indeterminate quantifiers"
      },
      {
        "id": "1.22 Authors rationale includes business, legal or cost reasons",
        "type": "concept",
        "dbpedia": null,
        "desc": "A concept related to author's rationale"
      },
      {
        "id": "1.23 Author resolved QC, SVP or Exec team comment",
        "type": "concept",
        "dbpedia": null,
        "desc": "A concept related to author's resolved comments"
      },
      {
        "id": "1.24 No Section for a Core Function or an Important Topic",
        "type": "concept",
        "dbpedia": null,
        "desc": "A concept related to missing sections"
      },
      {
        "id": "Scope Work Unit",
        "type": "concept",
        "dbpedia": null,
        "desc": "A concept related to scope work units"
      }
    ],
    "edges": [
      {
        "source": "1.15 Pre-mortem",
        "type": "FILLED_IN_BY",
        "target": "https://docs.google.com/spreadsheets/d/1IF8YGAKKMPrwQMP0aWMDTkkbcT44JdauRfY1-WoEj_U/edit#gid=0[[.underline]#pre-mortems spreadsheet#]"
      },
      {
        "source": "1.16 All author’s decision-supporting data",
        "type": "LINKED_TO",
        "target": "its source"
      },
      {
        "source": "1.17 A spec statement",
        "type": "FACTUALLY_INCORRECT",
        "target": "Amazon Redshift uses Oracle as its underlying engine"
      },
      {
        "source": "1.18 A part of the document",
        "type": "CONTRADICTS",
        "target": "choosing one option in an ITD, then writing milestones for a different one"
      },
      {
        "source": "1.19 Undefined component name",
        "type": "UNDEFINED"
      },
      {
        "source": "1.20 There is no ITD/IF",
        "type": "EXPLAINING_TECH_CHOICE"
      },
      {
        "source": "1.21 Ambiguous or indeterminate quantifier",
        "type": "AMBIGUOUS_OR_INDETERMINATE"
      },
      {
        "source": "1.22 Author’s rationale includes business, legal or cost reasons",
        "type": "INCLUDES"
      },
      {
        "source": "1.23 Author resolved QC, SVP or Exec team comment",
        "type": "RESOLVED"
      },
      {
        "source": "1.24 No Section for a Core Function or an Important Topic",
        "type": "NO_SECTION_FOR"
      },
      {
        "source": "Scope Work Unit",
        "type": "SCOPE_WORK_UNIT"
      }
    ]
  },
  "metadata": {
    "name": "ascii/19ODQOv5seK4n4xOONHpdAsz7dPbSIQFSuoRJ-nOCiAM.json",
    "parents": "llm-kb-dataset",
    "mimeType": "application/json",
    "originalFileSource": "https://docs.google.com/document/d/1lw5CdLy49C6_eog2LpQBsmrcFRuFsmsBzOvz08oL6pQ",
    "originalOwnersName": "Spec Automation",
    "originalModifiedTime": "2022-04-21T09:21:28.683Z",
    "originalOwnersEmail": "spec.automation@devfactory.com",
    "originalCreatedTime": "2022-04-21T09:21:13.865Z",
    "kgNodeId": null
  }
}

2)

"Document ID": "19ODQOv5seK4n4xOONHpdAsz7dPbSIQFSuoRJ-nOCiAM_7"

Chunk details:

{
  "id": "19ODQOv5seK4n4xOONHpdAsz7dPbSIQFSuoRJ-nOCiAM_7",
  "text": "* {blank}\n+\n____\nWhether the whiteboard diagram is most likely to be an architecture\ndocument, an ERD, an AWS smorgasbord diagram, a process diagram a\nflowchart, etc.\n____\n* {blank}\n+\n____\nWhat you are curious about in the APIs. Is it the actions? Or the data\nrelationships (hint: ERD)?\n____\n* {blank}\n+\n____\nWhere you want the core focus to be.\n____\n* {blank}\n+\n____\nWho is the customer (as in “customer CIVs”)? Is there more than one\ncustomer (e.g. in a B2B2C system)? What is the product? (Sometimes, the\nproduct that is sold to customers may be different to the scope of the\nteardown).\n____\n\n]\n\n| |\n\n|*1-sentence Goal* | |\n\na|\nxxxxx Leaves blank: 1-sentence goal of the spec\n\nxxxxx-1-sentence goal is longer than 2 sentences\n\n[\n\n* {blank}\n+\n____\nKeep this as short and simple as possible.\n____\n** {blank}\n+\n____\n1 sentence is the target. But if you really need two that is better than\na single hard-to-read multi-clause sentence.\n____\n** {blank}\n+\n____\nIt may well be imprecise - you will explain which more detailed items\nare in and out of scope below.\n____\n** {blank}\n+\n____\nE.g.\n____\n*** {blank}\n+\n____\n“Identify what we could learn from Zapier to improve DevFlows”\n____\n*** {blank}\n+\n____\n“Assess the architecture of Xant.ai for the short term lift & shift\ncost, and the long term opportunity to replatform”.\n____\n* {blank}\n+\n____\nYou’re looking for a simple statement that summarizes what this spec is\nall about.\n____\n** {blank}\n+\n____\nIt should be something you can use as a coarse-grained filter to decide\nif more detailed items belong in the scope of this spec or not.\n____\n** {blank}\n+\n____\nIf you find yourself needing to list multiple disconnected things that\ndo not have a simple common high-level purpose then the scope is\nprobably wrong, even if that’s what the customer asked for.\n____\n\n]\n\n| |",
  "metadata": {
    "source": "https://docs.google.com/document/d/1lw5CdLy49C6_eog2LpQBsmrcFRuFsmsBzOvz08oL6pQ",
    "parents": "llm-kb-dataset",
    "originalModifiedTime": "2022-04-21T09:21:28.683Z",
    "name": "ascii/19ODQOv5seK4n4xOONHpdAsz7dPbSIQFSuoRJ-nOCiAM.json",
    "originalCreatedTime": "2022-04-21T09:21:13.865Z",
    "mimeType": "application/json",
    "originalOwnersEmail": "spec.automation@devfactory.com",
    "kgNodeId": null,
    "originalOwnersName": "Spec Automation",
    "originalFileSource": "https://docs.google.com/document/d/1lw5CdLy49C6_eog2LpQBsmrcFRuFsmsBzOvz08oL6pQ"
  }
}

Output

{
  "id": "19ODQOv5seK4n4xOONHpdAsz7dPbSIQFSuoRJ-nOCiAM_7",
  "kg": {
    "nodes": [
      {
        "id": "Xxxxx Work Unit Tasks Table",
        "type": "software",
        "dbpedia": null,
        "desc": "A work unit tasks table for Exec Review"
      },
      {
        "id": "DevFlows",
        "type": "software",
        "dbpedia": null,
        "desc": "An iPaaS solution"
      },
      {
        "id": "Zapier",
        "type": "software",
        "dbpedia": "https://dbpedia.org/resource/Zapier",
        "desc": "A product for flow development"
      },
      {
        "id": "Xant.ai",
        "type": "software",
        "dbpedia": null,
        "desc": "A SaaS software company"
      },
      {
        "id": "AWS Kendra",
        "type": "software",
        "dbpedia": "https://dbpedia.org/resource/AWS_Kendra",
        "desc": "An important AWS service"
      },
      {
        "id": "Twilio",
        "type": "software",
        "dbpedia": "https://dbpedia.org/resource/Twilio",
        "desc": "A telephony component"
      },
      {
        "id": "AWS Chime",
        "type": "software",
        "dbpedia": "https://dbpedia.org/resource/Amazon_Chime",
        "desc": "A telephony component"
      }
    ],
    "edges": [
      {
        "source": "Xxxxx Work Unit Tasks Table",
        "type": "PRESENT_FOR",
        "target": "Exec Review"
      },
      {
        "source": "Xxxxx Work Unit Tasks Table",
        "type": "RE-ADD_DURING",
        "target": "Scope Task"
      },
      {
        "source": "Xxxxx Work Unit Tasks Table",
        "type": "REJECTED_AT",
        "target": "Exec Review"
      },
      {
        "source": "DevFlows",
        "type": "TEARDOWN_OF",
        "target": "Zapier"
      },
      {
        "source": "DevFlows",
        "type": "INTERESTED_IN",
        "target": "competitive products"
      },
      {
        "source": "DevFlows",
        "type": "IMPROVE_IN",
        "target": "DevFlows"
      },
      {
        "source": "Xant.ai",
        "type": "EVALUATING_FOR",
        "target": "acquisition"
      },
      {
        "source": "Xant.ai",
        "type": "INCORPORATED_INTO",
        "target": "Aurea business unit"
      },
      {
        "source": "AWS Kendra",
        "type": "IMPORTANT_AWS_SERVICE"
      },
      {
        "source": "AWS Kendra",
        "type": "TEARDOWN_AS_PROMPT"
      },
      {
        "source": "AWS Kendra",
        "type": "ANSWER_QUESTIONS"
      },
      {
        "source": "AWS Kendra",
        "type": "ADDRESS_LIMITATIONS"
      },
      {
        "source": "Xxxxx Work Unit Tasks Table",
        "type": "LINK_TO",
        "target": "other specs"
      },
      {
        "source": "Xxxxx Work Unit Tasks Table",
        "type": "LINK_TO",
        "target": "EVP"
      },
      {
        "source": "Xxxxx Leaves blank",
        "type": "INTRODUCTION"
      },
      {
        "source": "Xxxxx Leaves blank",
        "type": "BACKGROUND"
      },
      {
        "source": "Xxxxx Leaves blank",
        "type": "ORIGINATED_FROM"
      },
      {
        "source": "Xxxxx Fails to establish",
        "type": "QUESTIONS_ANSWERED"
      },
      {
        "source": "Xxxxx List of questions",
        "type": "QUESTIONS_ANSWERED"
      },
      {
        "source": "Xxxxx List of questions",
        "type": "SCOPE_FOR"
      },
      {
        "source": "Xxxxx List of questions",
        "type": "ZAPIER_TEARDOWN_COMPARING_TO"
      },
      {
        "source": "Xxxxx List of questions",
        "type": "CONSTRAINTS_ON_FLOW_DEVELOPERS"
      },
      {
        "source": "Xxxxx List of questions",
        "type": "CHOSEN_COMPONENT"
      },
      {
        "source": "Twilio",
        "type": "PART_OF",
        "target": "AWS Chime"
      }
    ]
  },
  "metadata": {
    "name": "ascii/19ODQOv5seK4n4xOONHpdAsz7dPbSIQFSuoRJ-nOCiAM.json",
    "parents": "llm-kb-dataset",
    "mimeType": "application/json",
    "originalFileSource": "https://docs.google.com/document/d/1lw5CdLy49C6_eog2LpQBsmrcFRuFsmsBzOvz08oL6pQ",
    "originalOwnersName": "Spec Automation",
    "originalModifiedTime": "2022-04-21T09:21:28.683Z",
    "originalOwnersEmail": "spec.automation@devfactory.com",
    "originalCreatedTime": "2022-04-21T09:21:13.865Z",
    "kgNodeId": null
  }
}
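One possible stopgap, sketched here under assumptions, is to separate the edges that have a target from those that do not before the sub-KG is used downstream. This is not a fix for the edges prompt itself, and dropping the incomplete edges is only one option (they could instead be sent back for regeneration). The function name and the sample id are hypothetical; the three sample edges are taken from the second output in this issue.

```python
import copy
from typing import Any


def drop_edges_without_target(
    output: dict[str, Any],
) -> tuple[dict[str, Any], list[dict[str, Any]]]:
    """Return (cleaned copy of the output, list of edges that had no target).

    The input dict is left unmodified.
    """
    cleaned = copy.deepcopy(output)
    edges = cleaned.get("kg", {}).get("edges", [])
    kept = [e for e in edges if "target" in e]
    dropped = [e for e in edges if "target" not in e]
    if "kg" in cleaned:
        cleaned["kg"]["edges"] = kept
    return cleaned, dropped


sample = {
    "id": "example_7",  # hypothetical id, shortened for the example
    "kg": {
        "nodes": [],
        "edges": [
            {"source": "Twilio", "type": "PART_OF", "target": "AWS Chime"},
            {"source": "AWS Kendra", "type": "ANSWER_QUESTIONS"},
            {"source": "Xxxxx Leaves blank", "type": "INTRODUCTION"},
        ],
    },
}

cleaned, dropped = drop_edges_without_target(sample)
print(len(cleaned["kg"]["edges"]), "kept;", len(dropped), "dropped")
```

Returning the dropped edges alongside the cleaned copy keeps a record of what the prompt got wrong, which is useful when comparing prompt revisions.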