Metric generation quality improvement ideas

nehanims commented 2 months ago

Here are a few ideas for improving metric quality:

Add more examples for input and expected output
Break the prompt down into simpler prompts for different types of entities and generate each type of metric in a separate prompt
Allow the user to retry generating metrics from the same recording (don't get rid of the old metrics, just add them to the list)
Allow the user to edit the transcript to correct mistakes in transcription and then regenerate metrics
Allow the user to switch to a different LLM and retry the same recording
Show the users the metrics generated from the rewrite
Send the text to multiple LLMs and combine the metrics they generate

nehanims commented 2 months ago

Change type to category
Add a description field, and prompt the model to provide a brief description followed by a 1-2 line summary
Get summary from another model and/or prompt the model to generate a one or two line summary

nehanims commented 1 month ago

I'm collecting certain metrics for certain events in natural language form. I would like to figure out whether there is a potentially causal link between any of the metrics. My idea is to create a knowledge graph that has the events as nodes and the time differences between them as relations (only within a specific time window; say, events that occur within 24 hours of each other would have a link in the KG), and then see whether an event or a cluster of events typically occurs before another.
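
A minimal sketch of that graph construction, assuming the events are already available as (label, timestamp) pairs. The labels, timestamps, and the `build_edges` helper are hypothetical and only illustrate the 24-hour window; it uses plain Python tuples rather than a graph library.

```python
from datetime import datetime, timedelta

# Hypothetical sample events as (label, timestamp) pairs; not real data.
events = [
    ("event_a", datetime(2024, 1, 1, 7, 0)),
    ("event_b", datetime(2024, 1, 1, 15, 0)),
    ("event_c", datetime(2024, 1, 3, 18, 0)),
    ("event_b", datetime(2024, 1, 4, 9, 0)),
]


def build_edges(events, window=timedelta(hours=24)):
    """Link each event to every later event that falls inside the window.

    Returns (earlier_label, later_label, gap_in_hours) tuples, so each edge
    points forward in time and its weight is the time difference.
    """
    ordered = sorted(events, key=lambda e: e[1])
    edges = []
    for i, (src, t_src) in enumerate(ordered):
        for dst, t_dst in ordered[i + 1:]:
            gap = t_dst - t_src
            if gap > window:
                break  # events are sorted, so later ones are even further away
            edges.append((src, dst, gap.total_seconds() / 3600))
    return edges


edges = build_edges(events)
for src, dst, hours in edges:
    print(f"{src} -> {dst} ({hours:.1f} h)")
```

Directing every edge from the earlier event to the later one keeps the "occurs before" information that the question is really about; an undirected link would lose it.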

Potential approaches to explore:

Analyzing Event Causality Using Knowledge Graphs

Understanding the Problem: You're looking to identify potential causal relationships between events based on their temporal proximity. Knowledge graphs provide a suitable framework for this task.

Proposed Approach:

Event Extraction:
- Natural Language Processing (NLP): Use techniques like named entity recognition and event extraction to identify relevant events from your textual data.
- Regular Expressions: For structured data, regular expressions can be effective.
Knowledge Graph Construction:
- Nodes: Represent each extracted event as a node.
- Edges: Connect events with edges if they occur within the specified time window (e.g., 24 hours). The edge weight can represent the time difference.
Causality Analysis:
- Path Analysis:
  - Shortest Path: Identify the shortest path between events. A short path shows that two events tend to occur close together in time, which is a hint worth investigating rather than evidence of causation.
  - Frequent Path Mining: Find frequently occurring paths, indicating common patterns.
- Temporal Graph Analysis:
  - Temporal Point Processes: Model the occurrence of events as points in time and analyze their dependencies.
  - Event Sequences: Identify common sequences of events.
Causal Inference:
- Structural Equation Modeling (SEM): If you have additional variables or assumptions about the causal relationships, SEM can be used to estimate causal effects.
- Bayesian Networks: These probabilistic graphical models can represent causal relationships and allow for inference.
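
The frequent-path and event-sequence steps above could be sketched as counting runs of consecutive events in which each event follows the previous one within the time window. This is one simple interpretation, not the only one; the `frequent_paths` helper and the sample labels `A`, `B`, `C` are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical sample events as (label, timestamp) pairs; not real data.
events = [
    ("A", datetime(2024, 1, 1, 8, 0)),
    ("B", datetime(2024, 1, 1, 12, 0)),
    ("C", datetime(2024, 1, 1, 20, 0)),
    ("A", datetime(2024, 1, 5, 9, 0)),
    ("B", datetime(2024, 1, 5, 13, 0)),
    ("C", datetime(2024, 1, 9, 10, 0)),
]


def frequent_paths(events, length=2, window=timedelta(hours=24)):
    """Count runs of `length` consecutive events.

    A run is counted only if every event in it occurs within `window`
    of the event immediately before it.
    """
    ordered = sorted(events, key=lambda e: e[1])
    counts = Counter()
    for i in range(len(ordered) - length + 1):
        run = ordered[i:i + length]
        gaps_ok = all(
            later[1] - earlier[1] <= window
            for earlier, later in zip(run, run[1:])
        )
        if gaps_ok:
            counts[tuple(label for label, _ in run)] += 1
    return counts


pairs = frequent_paths(events, length=2)
triples = frequent_paths(events, length=3)
print(pairs.most_common())
print(triples.most_common())
```

Only strictly consecutive events are considered here, so an unrelated event landing between two related ones would hide the pattern; a fuller approach would also count non-adjacent pairs inside the window.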

Additional Considerations:

Confounding Factors: Be aware of confounding factors that might influence the observed relationships.
Data Quality: Ensure the quality of your extracted events and the accuracy of the temporal information.
Domain Knowledge: Incorporate domain-specific knowledge to improve the analysis.
Visualization: Use visualization techniques to explore the knowledge graph and identify patterns.

Tools and Libraries:

Knowledge Graph Libraries: Neo4j, RDF4J, Stardog
NLP Libraries: NLTK, spaCy, Gensim
Graph Analysis Libraries: NetworkX, GraphFrames

Example: If you're analyzing customer support interactions, you might identify events like "product inquiry," "complaint," and "refund request." By analyzing the temporal relationships between these events, you might discover that complaints often precede refund requests.
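
As a rough illustration of the customer support example, the sketch below measures how often one kind of event is followed by another within the window. The event labels come from the example above, but the timestamps and the `follow_rate` helper are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical support events as (label, timestamp) pairs; not real data.
events = [
    ("complaint", datetime(2024, 3, 1, 10, 0)),
    ("refund request", datetime(2024, 3, 1, 16, 0)),
    ("product inquiry", datetime(2024, 3, 2, 9, 0)),
    ("complaint", datetime(2024, 3, 5, 11, 0)),
    ("refund request", datetime(2024, 3, 6, 8, 0)),
    ("complaint", datetime(2024, 3, 10, 14, 0)),
    ("product inquiry", datetime(2024, 3, 12, 9, 0)),
]


def follow_rate(events, first, second, window=timedelta(hours=24)):
    """Fraction of `first` events followed by a `second` event within `window`.

    Returns None when there are no `first` events at all.
    """
    first_times = [t for label, t in events if label == first]
    second_times = [t for label, t in events if label == second]
    if not first_times:
        return None
    followed = sum(
        any(timedelta(0) < t2 - t1 <= window for t2 in second_times)
        for t1 in first_times
    )
    return followed / len(first_times)


complaint_rate = follow_rate(events, "complaint", "refund request")
inquiry_rate = follow_rate(events, "product inquiry", "refund request")
print(f"complaint -> refund request: {complaint_rate:.2f}")
print(f"product inquiry -> refund request: {inquiry_rate:.2f}")
```

Comparing the rate for complaints against the rate for other event types gives a crude baseline: a high rate matters only if refund requests are noticeably less common after unrelated events.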

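By following these steps and considering the additional factors, you can use knowledge graphs to surface candidate causal relationships between events in your natural language data. Temporal patterns alone only suggest where to look; confirming causation needs the causal inference methods and domain knowledge noted above.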


nehanims / notes

Metric generation quality improvement ideas #35
