nehanims / notes

Backend for voice-notes

Metric generation quality improvement ideas #35

Open nehanims opened 2 months ago

nehanims commented 2 months ago

Here are a few ideas for improving metric quality.

nehanims commented 1 month ago

I'm collecting certain metrics for certain events in natural language form. I would like to figure out whether there is a potentially causal link between any of the metrics. My idea is to create a knowledge graph that has the events as nodes and the time differences between them as relations (only within a specific time window; say, events that occur within 24 hours of each other would have a link in the KG) and then see whether an event or a cluster of events typically occurs before another.
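A minimal sketch of the graph idea described above, using only the Python standard library. The event labels, the `(label, timestamp)` tuple layout, and the function name `build_edges` are assumptions made for illustration; they are not taken from the notes backend.

```python
from datetime import datetime, timedelta


def build_edges(events, window=timedelta(hours=24)):
    """Return (earlier_index, later_index, gap) for every pair of events
    whose timestamps are at most `window` apart."""
    # Sort indices by timestamp so each event only looks forward in time.
    order = sorted(range(len(events)), key=lambda i: events[i][1])
    edges = []
    for pos, i in enumerate(order):
        for j in order[pos + 1:]:
            gap = events[j][1] - events[i][1]
            if gap > window:
                break  # every later event is even further away
            edges.append((i, j, gap))
    return edges


# Hypothetical sample data: (event label, timestamp).
events = [
    ("poor sleep", datetime(2024, 5, 1, 7, 0)),
    ("headache", datetime(2024, 5, 1, 15, 0)),
    ("long walk", datetime(2024, 5, 3, 9, 0)),
]

edges = build_edges(events)
for i, j, gap in edges:
    print(f"{events[i][0]} -> {events[j][0]} ({gap})")
```

Each edge points from the earlier event to the later one and carries the time difference, so direction encodes "occurred before". The 24-hour window is just the default here and can be passed in as any `timedelta`.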

Potential approaches to explore:

Analyzing Event Causality Using Knowledge Graphs

Understanding the Problem: You're looking to identify potential causal relationships between events based on their temporal proximity. Knowledge graphs provide a suitable framework for this task.

Proposed Approach:

  1. Event Extraction:

    • Natural Language Processing (NLP): Use techniques like named entity recognition and event extraction to identify relevant events from your textual data.
    • Regular Expressions: For structured data, regular expressions can be effective.
  2. Knowledge Graph Construction:

    • Nodes: Represent each extracted event as a node.
    • Edges: Connect events with edges if they occur within the specified time window (e.g., 24 hours). The edge weight can represent the time difference.
  3. Causality Analysis:

    • Path Analysis:
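      • Shortest Path: Identify the shortest path between events. A shorter path (fewer hops or a smaller total time difference) may indicate a closer temporal association, but on its own it does not establish a causal relationship.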
      • Frequent Path Mining: Find frequently occurring paths, indicating common patterns.
    • Temporal Graph Analysis:
      • Temporal Point Processes: Model the occurrence of events as points in time and analyze their dependencies.
      • Event Sequences: Identify common sequences of events.
  4. Causal Inference:

    • Structural Equation Modeling (SEM): If you have additional variables or assumptions about the causal relationships, SEM can be used to estimate causal effects.
    • Bayesian Networks: These probabilistic graphical models can represent causal relationships and allow for inference.
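As one possible reading of the "Frequent Path Mining" and "Event Sequences" items above, the sketch below counts runs of consecutive event labels in which every step falls inside the time window. The function name `frequent_sequences`, the sample labels, and the choice to treat "consecutive" as adjacent in timestamp order are all assumptions for illustration; real sequence-mining methods are more elaborate.

```python
from collections import Counter
from datetime import datetime, timedelta


def frequent_sequences(events, length=2, window=timedelta(hours=24)):
    """Count runs of `length` consecutive event labels in which each
    event follows the previous one within `window`."""
    ordered = sorted(events, key=lambda e: e[1])
    counts = Counter()
    for start in range(len(ordered) - length + 1):
        run = ordered[start:start + length]
        # Every adjacent pair in the run must be close enough in time.
        gaps_ok = all(
            later[1] - earlier[1] <= window
            for earlier, later in zip(run, run[1:])
        )
        if gaps_ok:
            counts[tuple(label for label, _ in run)] += 1
    return counts


# Hypothetical sample data: (event label, timestamp).
events = [
    ("skipped lunch", datetime(2024, 5, 1, 12, 0)),
    ("headache", datetime(2024, 5, 1, 16, 0)),
    ("skipped lunch", datetime(2024, 5, 3, 12, 0)),
    ("headache", datetime(2024, 5, 3, 18, 0)),
    ("long walk", datetime(2024, 5, 6, 9, 0)),
]

counts = frequent_sequences(events)
for sequence, n in counts.most_common():
    print(" -> ".join(sequence), n)
```

A sequence that recurs often is a candidate pattern worth a closer look, not evidence of causation; a frequent pair can also come from a shared cause or from two events that are simply both common.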

Additional Considerations:

Tools and Libraries:

Example: If you're analyzing customer support interactions, you might identify events like "product inquiry," "complaint," and "refund request." By analyzing the temporal relationships between these events, you might discover that complaints often precede refund requests.
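The customer-support example can be checked with a simple directional comparison: what share of refund requests has a complaint in the preceding 24 hours, and what share of complaints has a refund request in the preceding 24 hours? The log below is made-up data, `share_preceded` is a hypothetical helper, and the sketch ignores which customer each event belongs to, which a real analysis would need to account for.

```python
from datetime import datetime, timedelta


def share_preceded(events, cause, effect, window=timedelta(hours=24)):
    """Fraction of `effect` events that have at least one `cause` event
    in the preceding `window`."""
    cause_times = [t for label, t in events if label == cause]
    effect_times = [t for label, t in events if label == effect]
    if not effect_times:
        return 0.0
    hits = sum(
        any(timedelta(0) < t - c <= window for c in cause_times)
        for t in effect_times
    )
    return hits / len(effect_times)


# Made-up interaction log: (event label, timestamp).
log = [
    ("product inquiry", datetime(2024, 6, 1, 9, 0)),
    ("complaint", datetime(2024, 6, 1, 10, 0)),
    ("refund request", datetime(2024, 6, 1, 15, 0)),
    ("complaint", datetime(2024, 6, 4, 11, 0)),
    ("refund request", datetime(2024, 6, 4, 20, 0)),
    ("refund request", datetime(2024, 6, 10, 8, 0)),
]

forward = share_preceded(log, "complaint", "refund request")
reverse = share_preceded(log, "refund request", "complaint")
print(f"refund requests preceded by a complaint: {forward:.2f}")
print(f"complaints preceded by a refund request: {reverse:.2f}")
```

A large gap between the two directions suggests a consistent ordering, which is what the example describes. It remains a temporal pattern; confirming a causal link would need the kind of modeling listed under Causal Inference.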

By following these steps and considering the additional factors, you can effectively use knowledge graphs to analyze potential causal relationships between events in your natural language data.
