katesanders9 / multimodal-proofs

Code for multimodal neuro-symbolic proof generation for TV shows

Examples #6

Open katesanders9 opened 1 year ago

katesanders9 commented 1 year ago

Dialogue only

FULL TRANSCRIPT:

(Mrs. Rice:)Claire...
(Meredith:)you've had surgery recently.
(Meredith:)These scars are still pink.
(Claire:)Don't tell my parents.
(Meredith:)You did this in Mexico
so your parents wouldn't know?
(Meredith:)What did you have done?
[George] Ahh.
(George:)And you noticed her foot
twitching?
(George:)- Come on, Jamie.
- My foot.
(George:)Oh, yeah.
About three months ago?
(George:)Just a little.
(Mrs. Hayes:)We took her to County Hospital,
and she got the CT scan,
(Mrs. Hayes:)which showed her...
(George:)Brain abnormality.
Here, you can put it...
(George:)- And the twitching has gotten worse?
- A lot worse.
They don't have the proper equipment
back home to figure out what's wrong.
(George:)- You did a... you did a good thing.
- Show Mommy.
(George:)- Yes, your mommy.
- Show Mommy and Daddy.
(George:)You did a good thing
by coming all this way,
(George:)Mr. And Mrs. Hayes.
(George:)You sit tight, Jamie, OK?
(George:)I'm gonna bring
Dr. Shepherd to see you.
(George:)OK? Dr. Shepherd. OK?
(George:)He's the brain specialist.
[Jamie] Mommy and Daddy.
(Mrs. Hayes:)Doctor? Is he good,
this Dr. Shepherd?
(George:)At just about everything.
[Bailey] I assume
the lady needs her keys
(Bailey:)to leave this guy's
sorry ass behind.
(George:)- Yeah.
- Well, help her out.
(Izzie:)He needs a bronchoscopy.
(Bailey:)See one, do one, teach one.
(Bailey:)You've seen one.
It's time to do one.
(Izzie:)Alone? Seriously? Thanks.

HYPOTHESIS:

Claire had surgery in Mexico.

EVIDENCE:

(Mrs. Rice:)Claire...
(Meredith:)you've had surgery recently.
(Meredith:)These scars are still pink.
(Claire:)Don't tell my parents.
(Meredith:)You did this in Mexico
so your parents wouldn't know?
(Meredith:)What did you have done?

Step 1: Retrieved dialogue

Retrieval via CrossEncoder trained on MS MARCO. H1 generation via ChatGPT.

(Mrs. Rice:)Claire...
(Meredith:)you've had surgery recently.

SCORE: 0.61 H1: Claire recently had surgery.

(Claire:)Don't tell my parents.
(Meredith:)You did this in Mexico so your parents wouldn't know?

SCORE: 0.56 H1: Claire did something in Mexico.

Step 2: Filter dialogue

Via CrossEncoder trained on SNLI and MultiNLI

H0: Claire had surgery in Mexico. H1: Claire recently had surgery. LABEL: NEITHER

H0: Claire had surgery in Mexico. H1: Claire did something in Mexico. LABEL: NEITHER

Step 3: Branching

Via ChatGPT

{
"1": ["Claire went to Mexico.", "Claire underwent surgery."],
"2": ["Claire had a medical procedure.", "The medical procedure took place in Mexico."]
}
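
A minimal sketch of how a branching response in this shape could be parsed into candidate decompositions. It assumes ChatGPT returns valid JSON with two sub-hypotheses per branch; `parse_branches` is a hypothetical helper, not code from the repo.

```python
import json


def parse_branches(raw: str) -> list[tuple[str, str]]:
    """Parse a branching response into (left, right) sub-hypothesis pairs."""
    data = json.loads(raw)
    branches = []
    # Sort by numeric key so branch "1" is tried before branch "2".
    for key in sorted(data, key=int):
        parts = data[key]
        # Skip malformed branches instead of failing the whole parse.
        if (
            isinstance(parts, list)
            and len(parts) == 2
            and all(isinstance(p, str) for p in parts)
        ):
            branches.append((parts[0].strip(), parts[1].strip()))
    return branches


raw = """{
"1": ["Claire went to Mexico.", "Claire underwent surgery."],
"2": ["Claire had a medical procedure.", "The medical procedure took place in Mexico."]
}"""
branches = parse_branches(raw)
```

Each pair can then be sent through retrieval and filtering in turn, stopping at the first branch where both halves find support.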

Step 4: Dialogue retrieval

Retrieval via CrossEncoder trained on MS MARCO. H1 generation via ChatGPT.

H0: Claire went to Mexico. Retrieved dialogue:

(Claire:)Don't tell my parents.
(Meredith:)You did this in Mexico so your parents wouldn't know?

SCORE: 3.05 H1: Claire did something in Mexico.

H0: Claire underwent surgery. Retrieved dialogue:

(Mrs. Rice:)Claire...
(Meredith:)you've had surgery recently.

SCORE: 4.00 H1: Claire recently had surgery.

Step 5: Filtering

Via CrossEncoder trained on SNLI and MultiNLI

H0: Claire went to Mexico. H1: Claire did something in Mexico. LABEL: ENTAILMENT

H0: Claire underwent surgery. H1: Claire recently had surgery. LABEL: ENTAILMENT

Full generated proof

H0: Claire had surgery in Mexico.
    > H1: Claire went to Mexico.
    > D1: (Claire:)Don't tell my parents.
          (Meredith:)You did this in Mexico so your parents wouldn't know?

    > H2: Claire underwent surgery.
    > D2: (Mrs. Rice:)Claire...
          (Meredith:)you've had surgery recently.
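
The proof layout above can be produced from a small data structure. This is only a sketch: the `Leaf` and `Proof` classes and the `render` function are hypothetical names, and each leaf is assumed to carry at least one dialogue line.

```python
from dataclasses import dataclass, field


@dataclass
class Leaf:
    hypothesis: str      # sub-hypothesis (H1, H2, ...)
    dialogue: list[str]  # supporting dialogue lines, assumed non-empty


@dataclass
class Proof:
    root: str  # top-level hypothesis H0
    leaves: list[Leaf] = field(default_factory=list)


def render(proof: Proof) -> str:
    """Format a one-level proof in the indented style used in this thread."""
    lines = [f"H0: {proof.root}"]
    for i, leaf in enumerate(proof.leaves, start=1):
        if i > 1:
            lines.append("")  # blank line between branches
        lines.append(f"    > H{i}: {leaf.hypothesis}")
        first, *rest = leaf.dialogue
        lines.append(f"    > D{i}: {first}")
        # Continuation lines align under the first dialogue line (10 columns).
        lines.extend(" " * 10 + d for d in rest)
    return "\n".join(lines)


proof = Proof(
    root="Claire had surgery in Mexico.",
    leaves=[
        Leaf("Claire went to Mexico.",
             ["(Claire:)Don't tell my parents.",
              "(Meredith:)You did this in Mexico so your parents wouldn't know?"]),
        Leaf("Claire underwent surgery.",
             ["(Mrs. Rice:)Claire...", "(Meredith:)you've had surgery recently."]),
    ],
)
text = render(proof)
```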
katesanders9 commented 1 year ago

More complex dialogue-only

Dialogue only

FULL TRANSCRIPT:

(House:)before Mistress Ilsa's rude interruption.
- That was probably a good thing, given...
- Start him on antibiotics and blood thinners.
(Cameron:)You still think Chase is wrong?
(House:)No, he's probably right.
(Chase:)Then we should schedule him
for vascular surgery.
(Chase:)Go into the carotids,
find the aneurysm, repair it.
(Cameron:)We put him on blood thinners,
he might bleed out.
(House:)But if Foreman's right about
it being bacterial endocarditis and we...
(Foreman:)I think Chase is right.
(House:)Okay. If Foreman used to be right
about it being blood clots...
(House:)and we take the surgery route,
we'll probably kill the guy.
(House:)So start him on blood thinners, and if he has
another stroke, we'll schedule the surgery.
(Foreman:)- Yeah?
- You Dr. House?
I have a feeling I'm gonna
regret this, but yes.
(Myron:)Ramona said I should come to you
for my refill.
Uh, you know, the...
The wood's beginning to droop.
(Myron:)She said you'd probably
want to talk to me first.
(House:)She lied.
(House:)Here you go.
Thanks.
(House:)Always happy to make people happy.
Good-bye.
Uh, I don't know how much more
of this I can take.
(House:)- Much more of what?

HYPOTHESIS:

Foreman rescinds his diagnosis.

EVIDENCE:

(Chase:)Then we should schedule him
for vascular surgery.
(Chase:)Go into the carotids,
find the aneurysm, repair it.
(Cameron:)We put him on blood thinners,
he might bleed out.
(House:)But if Foreman's right about
it being bacterial endocarditis and we...
(Foreman:)I think Chase is right.
(House:)Okay. If Foreman used to be right
about it being blood clots...
(House:)and we take the surgery route,
we'll probably kill the guy.

Step 1: Retrieved dialogue

Retrieval via CrossEncoder trained on MS MARCO. H1 generation via ChatGPT.

No retrieved dialogue lines.

Step 2: Filter dialogue

Via CrossEncoder trained on SNLI and MultiNLI

Skip.

Step 3: Branching

Via ChatGPT. Used slightly different prompting this time.

H1: Foreman initially diagnosed the patient's condition as bacterial endocarditis.
H2: Foreman rescinds his diagnosis and agrees with Chase that they should schedule the patient for vascular surgery.

Step 4: Dialogue retrieval

Retrieval via CrossEncoder trained on MS MARCO. H1 generation via ChatGPT.

H0: Foreman initially diagnosed the patient's condition as bacterial endocarditis. Retrieved dialogue:

(House:)But if Foreman's right about it being bacterial endocarditis and we...

SCORE: 3.59 H1: Foreman suggested that the patient might have bacterial endocarditis.

H0: Foreman rescinds his diagnosis and agrees with Chase that they should schedule the patient for vascular surgery. Retrieved dialogue:

(Foreman:)I think Chase is right.

SCORE: 3.60 H1: Foreman agrees with Chase.

Step 5: Filtering

Via CrossEncoder trained on SNLI and MultiNLI

H0: Foreman initially diagnosed the patient's condition as bacterial endocarditis. H1: Foreman suggested that the patient might have bacterial endocarditis. LABEL: ENTAILMENT <-- But only backwards entailment.

H0: Foreman rescinds his diagnosis and agrees with Chase that they should schedule the patient for vascular surgery. H1: Foreman agrees with Chase. LABEL: ENTAILMENT <-- But only backwards entailment.
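
The "backwards entailment" remarks suggest checking both directions explicitly. Below is a sketch under the assumption that a leaf should only count when the dialogue-derived statement (H1) entails the hypothesis (H0). The `entails` callable stands in for the NLI cross encoder, and the function name is hypothetical.

```python
from typing import Callable

# (premise, hypothesis) -> True when the premise entails the hypothesis.
Entails = Callable[[str, str], bool]


def entailment_direction(h0: str, h1: str, entails: Entails) -> str:
    """Classify the H0/H1 relation as 'forward', 'backward', 'both', or 'none'."""
    forward = entails(h1, h0)   # evidence statement supports the hypothesis
    backward = entails(h0, h1)  # hypothesis merely implies the evidence statement
    if forward and backward:
        return "both"
    if forward:
        return "forward"
    if backward:
        return "backward"
    return "none"


# Toy oracle standing in for the NLI model: only one direction is known to hold.
h0 = ("Foreman rescinds his diagnosis and agrees with Chase that they "
      "should schedule the patient for vascular surgery.")
h1 = "Foreman agrees with Chase."
known = {(h0, h1)}
direction = entailment_direction(h0, h1, lambda p, h: (p, h) in known)
```

With a check like this, pairs labeled "backward" could be flagged as too-specific branch hypotheses instead of being accepted as leaves.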

Full generated proof

H0: Foreman rescinds his diagnosis.
    > H1: Foreman initially diagnosed the patient's condition as bacterial endocarditis.
    > D1: (House:)But if Foreman's right about it being bacterial endocarditis and we...

    > H2: Foreman rescinds his diagnosis and agrees with Chase that they should schedule the patient for vascular surgery.
    > D2: (Foreman:)I think Chase is right.
katesanders9 commented 1 year ago

Generally, retrieval and entailment checks are going pretty well at this point. The cross encoders identify relevant dialogue and do NLI classifications on *parsed* dialogue-hypothesis pairs successfully.

The primary area that currently needs work is the hypothesis branching. The two proofs needed different branching prompts, and the branch hypotheses generated for the second proof were too specific.
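
For reference, the overall control flow in the two examples (try to find a leaf, otherwise branch and recurse) can be sketched as below. `find_leaf` and `branch` are hypothetical callables standing in for retrieval plus filtering and for the ChatGPT branching step; the depth limit is an assumption.

```python
from typing import Callable, Optional

# hypothesis -> supporting dialogue chunk, or None when nothing passes the filter
LeafFinder = Callable[[str], Optional[str]]
# hypothesis -> candidate decompositions, each a pair of sub-hypotheses
Brancher = Callable[[str], list[tuple[str, str]]]


def prove(h: str, find_leaf: LeafFinder, branch: Brancher, max_depth: int = 1):
    """Return a nested proof dict, or None if no proof exists within max_depth."""
    evidence = find_leaf(h)
    if evidence is not None:
        return {"hypothesis": h, "dialogue": evidence}
    if max_depth == 0:
        return None
    for left, right in branch(h):
        subproofs = [prove(s, find_leaf, branch, max_depth - 1) for s in (left, right)]
        # A branch only succeeds when both sub-hypotheses are proven.
        if all(p is not None for p in subproofs):
            return {"hypothesis": h, "children": subproofs}
    return None


# Toy stand-ins built from the first example in this thread.
leaves = {
    "Claire went to Mexico.": "(Meredith:)You did this in Mexico so your parents wouldn't know?",
    "Claire underwent surgery.": "(Meredith:)you've had surgery recently.",
}
branches = {
    "Claire had surgery in Mexico.": [("Claire went to Mexico.", "Claire underwent surgery.")],
}
proof = prove("Claire had surgery in Mexico.", leaves.get, lambda h: branches.get(h, []))
```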

katesanders9 commented 1 year ago

D=0 (Leaf) examples

D=0 Pipeline

  1. Retrieve dialogue lines with MS MARCO cross encoder
    • Top 3 line chunks with score > 0
    • Segment lines in groups of 1-3, can be hyperparameter (?)
  2. Generate H1 candidates for each line with this prompt (can continue to improve this):

    Write five statements that are true given the provided dialogue. Write your answer in JSON format, i.e. {'1': '<answer here>', '2': '<answer here>', ...} and nothing else.

    DIALOGUE: ...

    STATEMENTS:

  3. Pass (H0, H1) pairs into a cross encoder trained on SNLI and MultiNLI
  4. Return (D1, H1) pairs that receive entailment judgments from the cross encoder as viable leaves
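
The four steps above can be sketched end to end as follows. This is not the repo's code: `score`, `generate`, and `judge` are hypothetical callables standing in for the MS MARCO cross encoder, the ChatGPT prompt, and the NLI cross encoder, and the sliding-window chunking with stride 1 is an assumption based on the overlapping chunks in the examples.

```python
from typing import Callable, Sequence


def chunk_lines(lines: Sequence[str], size: int = 2) -> list[str]:
    """Sliding windows of `size` consecutive dialogue lines (stride 1)."""
    if len(lines) <= size:
        return ["\n".join(lines)] if lines else []
    return ["\n".join(lines[i:i + size]) for i in range(len(lines) - size + 1)]


def find_leaves(
    h0: str,
    lines: Sequence[str],
    score: Callable[[str, str], float],    # (hypothesis, chunk) -> relevance score
    generate: Callable[[str], list[str]],  # chunk -> H1 candidates
    judge: Callable[[str, str], bool],     # (H0, H1) -> True on an entailment judgment
    size: int = 2,
    top_k: int = 3,
    min_score: float = 0.0,
) -> list[tuple[str, str]]:
    """Return (D1, H1) pairs that survive retrieval and entailment filtering."""
    chunks = chunk_lines(lines, size)
    ranked = sorted(((score(h0, c), c) for c in chunks), key=lambda sc: sc[0], reverse=True)
    kept = [c for s, c in ranked[:top_k] if s > min_score]  # step 1
    leaves = []
    for chunk in kept:
        for h1 in generate(chunk):  # step 2
            if judge(h0, h1):       # step 3
                leaves.append((chunk, h1))
    return leaves                   # step 4


# Toy stand-ins so the sketch runs without any model.
def overlap(h: str, c: str) -> float:
    return float(len(set(h.lower().split()) & set(c.lower().split())))


lines = ["PHOEBE: That guy at the counter is totally checking you out!", "MONICA: Really?"]
candidates = [
    "Phoebe commented on someone at the counter checking Monica out.",
    "Phoebe and Monica are having a conversation.",
]
leaves = find_leaves(
    "That guy at the counter is checking out Monica.",
    lines,
    score=overlap,
    generate=lambda chunk: candidates,
    judge=lambda h0, h1: "checking" in h1,
)
```

Keeping the three models behind plain callables makes it easy to swap thresholds or models while tuning.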

## Examples
**Note: Based on this example, it would be good to include an automatic "X said Y" H1 candidate in addition to the GPT-generated ones.**

HYPOTHESIS: `"Mike said Can we not talk about that now?"`
RETRIEVED DIALOGUE:

MIKE: Can we not talk about that now?

H1 CANDIDATES:

1: 'Mike expressed a desire to avoid discussing a certain topic at the present moment.', 2: 'There is a specific topic that Mike wants to avoid discussing.', 3: 'The conversation is being redirected away from a particular subject.', 4: 'Mike is not ready or willing to talk about something.', 5: 'The current timing is not suitable for discussing the mentioned topic.'

FILTERED CANDIDATES:

'Mike expressed a desire to avoid discussing a certain topic at the present moment.'

PROOF COMPLETE 
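
A possible sketch of the automatic "X said Y" candidate mentioned in the note above. It assumes dialogue lines look like either `(Speaker:)text` or `SPEAKER: text`, as in the transcripts in this thread; `said_candidate` is a hypothetical helper, and lines without a speaker tag are skipped.

```python
import re
from typing import Optional

# Matches "(Speaker:)text" or "SPEAKER: text"; the speaker may not contain ':' or parens.
_LINE = re.compile(
    r"^\s*(?:\((?P<paren>[^:()]+):\)|(?P<plain>[^:()]+):)\s*(?P<text>.+)$",
    re.DOTALL,
)


def said_candidate(line: str) -> Optional[str]:
    """Build an 'X said Y' statement from one dialogue line, or None if untagged."""
    m = _LINE.match(line)
    if m is None:
        return None
    speaker = (m.group("paren") or m.group("plain")).strip()
    if speaker.isupper():
        speaker = speaker.title()  # "MIKE" -> "Mike"
    text = " ".join(m.group("text").split())  # collapse subtitle line breaks
    return f"{speaker} said {text}"


example = said_candidate("MIKE: Can we not talk about that now?")
```

Note that an untagged line containing a colon would be misread as having a speaker, so this only suits transcripts where every kept line carries a tag.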
***
**Note: Based on the importance of knowing who is in conversations, I'm thinking of using `l=2` dialogue lines per chunk.**
**Note 2: Switch to http://www.livesinabox.com/friends/season7/718towja.htm for Friends transcripts**

HYPOTHESIS: `"That guy at the counter is checking out Monica."`
RETRIEVED DIALOGUE:

PHOEBE: That guy at the counter is totally checking you out! MONICA: Really?

H1 CANDIDATES:

'1': 'Phoebe commented on someone at the counter checking Monica out.', '2': 'Monica was surprised by Phoebe's observation.', '3': 'There is a guy standing at the counter.', '4': 'Monica was unsure about the truth of Phoebe's statement.', '5': 'Phoebe and Monica are having a conversation.'

FILTERED CANDIDATES:

'Phoebe commented on someone at the counter checking Monica out.'


PROOF COMPLETE
katesanders9 commented 1 year ago

Branching notes

katesanders9 commented 1 year ago

H1 generation for two dialogue chunks at once

Works for D=1

Write five hypotheses that a human would infer from the dialogue. Write your answer in JSON format, i.e. {'1': '<hypothesis 1>', '2': '<hypothesis 2>', ...} and nothing else.
DIALOGUE:
(Mrs. Rice:) Claire...
(Meredith:) You've had surgery recently.
...
(Claire:) Don't tell my parents.
(Meredith:) You did this in Mexico so your parents wouldn't know?

ChatGPT
{
'1': 'Claire had surgery recently.',
'2': 'Mrs. Rice knows about Claire's recent surgery.',
'3': 'Claire had surgery in Mexico.',
'4': 'Claire's parents are not aware of her surgery.',
'5': 'Claire wanted to keep her surgery in Mexico a secret from her parents.'
}
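
Note that the response above is not valid JSON: it uses single quotes, and values such as "Claire's" contain unescaped apostrophes. A tolerant parser could look like the sketch below. The fallback regex is an assumption about the response shape (numbered keys, one string per key), and `parse_candidates` is a hypothetical helper.

```python
import ast
import json
import re

# Fallback: a quoted numeric key, then a quoted value that ends right before
# the next key, the closing brace, or the end of the text.
_PAIR = re.compile(
    r"""(["'])(\d+)\1\s*:\s*(["'])(.*?)\3\s*(?=,\s*["']\d+["']\s*:|,?\s*\}|$)""",
    re.DOTALL,
)


def parse_candidates(raw: str) -> list[str]:
    """Extract the numbered statements from a JSON-like model response."""
    raw = raw.strip()
    for loader in (json.loads, ast.literal_eval):
        try:
            data = loader(raw)
        except (ValueError, SyntaxError):
            continue  # not valid JSON / not a valid Python literal
        if isinstance(data, dict):
            return [str(data[k]) for k in sorted(data, key=lambda k: int(k))]
    return [m.group(4) for m in _PAIR.finditer(raw)]


raw = """{
'1': 'Claire had surgery recently.',
'2': 'Mrs. Rice knows about Claire's recent surgery.',
'3': 'Claire had surgery in Mexico.',
'4': 'Claire's parents are not aware of her surgery.',
'5': 'Claire wanted to keep her surgery in Mexico a secret from her parents.'
}"""
statements = parse_candidates(raw)
```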
katesanders9 commented 1 year ago

Joint H1 generation failure mode

2 lines

Write five hypotheses that a human would infer from the dialogue. Write your answer in JSON format, i.e. {'1': '<hypothesis 1>', '2': '<hypothesis 2>', ...} and nothing else.
DIALOGUE:
(House:) But if Foreman's right about\nit being bacterial endocarditis and we...
...
(Foreman:) I think Chase is right.

ChatGPT
{
"1": "Foreman and House are discussing a medical case.",
"2": "Chase has a differing opinion from Foreman and House.",
"3": "The medical condition they are discussing is suspected to be bacterial endocarditis.",
"4": "Foreman's opinion aligns with Chase's on the medical case.",
"5": "House is expressing uncertainty about Foreman's diagnosis."
}

3 lines

Write five hypotheses that a human would infer from the dialogue. Write your answer in JSON format, i.e. {'1': '<hypothesis 1>', '2': '<hypothesis 2>', ...} and nothing else.
DIALOGUE:
(House:) But if Foreman's right about\nit being bacterial endocarditis and we...
...
(Foreman:) I think Chase is right.
...
(House:) Okay. If Foreman used to be right about it being blood clots...

ChatGPT
{
'1': 'Foreman used to disagree with Chase about the diagnosis being bacterial endocarditis.',
'2': 'Foreman now agrees with Chase about the diagnosis.',
'3': 'Foreman and Chase had a previous disagreement about the nature of the illness.',
'4': 'House is considering Foreman's previous accuracy in diagnosing blood clots.',
'5': 'The group is discussing different potential diagnoses for an unknown medical condition.'
}
katesanders9 commented 1 year ago

Branching: Syntactic decomposition

Works for many hypotheses, doesn't for some.

Break down the following sentence into two, single-clause sentences: "Claire had surgery in Mexico."

1. Claire had surgery.
2. The surgery took place in Mexico.

Break down the following sentence into two, single-clause sentences: "The headmaster suggests Castle and Beckett talk to Donny's friends when they're inquiring about Donny."

1. The headmaster suggests Castle and Beckett talk to Donny's friends.
2. Castle and Beckett are inquiring about Donny.

Failure mode:

Break down the following sentence into two, single-clause sentences: "Foreman rescinds his diagnosis."

1. Foreman rescinds.
2. His diagnosis.

Asking beforehand if the sentence can indeed be broken down does return correct judgments from ChatGPT for these sentences.
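
A sketch of that two-stage idea: ask whether the sentence can be split, and only then request the decomposition. The split prompt is the one quoted above; the wording of the check prompt is an assumption, and `ask` is a hypothetical callable wrapping the ChatGPT call.

```python
import re
from typing import Callable, Optional

# Hypothetical wording for the pre-check; the thread does not give the exact prompt.
CHECK_PROMPT = (
    "Can the following sentence be broken down into two single-clause "
    'sentences? Answer yes or no: "{s}"'
)
# Prompt quoted in this thread.
SPLIT_PROMPT = 'Break down the following sentence into two, single-clause sentences: "{s}"'


def decompose(sentence: str, ask: Callable[[str], str]) -> Optional[tuple[str, str]]:
    """Return two sub-sentences, or None when the sentence should not be split."""
    answer = ask(CHECK_PROMPT.format(s=sentence)).strip().lower()
    if not answer.startswith("yes"):
        return None
    reply = ask(SPLIT_PROMPT.format(s=sentence))
    # Expect a numbered list such as "1. ...\n2. ...".
    parts = re.findall(r"^\s*\d+\.\s*(.+?)\s*$", reply, flags=re.MULTILINE)
    return (parts[0], parts[1]) if len(parts) == 2 else None


# Fake model for illustration only.
def fake_ask(prompt: str) -> str:
    if prompt.startswith("Can the"):
        return "No" if "rescinds" in prompt else "Yes"
    return "1. Claire had surgery.\n2. The surgery took place in Mexico."


split = decompose("Claire had surgery in Mexico.", fake_ask)
unsplit = decompose("Foreman rescinds his diagnosis.", fake_ask)
```

When `decompose` returns None, the pipeline could fall back to a semantic branching prompt like the one used in the Foreman example.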

Another example, this time with syntactic branching

FULL TRANSCRIPT:

(Castle:)Only, you moved recently.
(Mrs Kendall:)Yes, a few months ago.
I was a partner at Lehman. And when it
went under, well, we took a big hit.
(Castle:)That can be pretty tough on a teenager.
(Mrs Kendall:)You get used to a certain way of living,
but it was no one's fault.
(Castle:)Do you think Donny was depressed?
No, Donny was strong.
But to lose everything you have...
(Mr Kendall:)We were just getting by.
(Beckett:)I'm very sorry for your loss.
(Beckett:)How did you know they moved?
(Castle:)Either the Kendalls really like art,
or that Chinese screen
(Castle:)was meant
for a much bigger apartment.
(Castle:)I wonder how they could
still afford Redding.
Do you know the school?
(Castle:)Oh, I've been kicked
out of all of New York's
(Castle:)finer educational institutions
at least once.
(Castle:)The irony is,
now that I'm rich and famous,
they all claim me as alum
and want money.
(Beckett:)It is just so rough being you.
(Castle:)My cross to bear.
(Headmaster:)When the Kendalls
said they couldn't afford tuition,
(Headmaster:)we put Donny on scholarship.
(Headmaster:)The family had been very generous
in the past,
(Headmaster:)and Donny was one of our brightest.
We thought he'd do great things.
(Castle:)Any idea what he would have been
doing at Central Park at night?
(Headmaster:)The truth is, all these kids are icebergs.
We only see the tip.
(Headmaster:)If you want to know the rest,
you should probably talk to his friends.
(Headmaster:)Amanda, Romy, Brandon, Spencer
and Max.
(Headmaster:)It's strange seeing them without Donny.
(Beckett:)Thank you.
Thanks.

HYPOTHESIS:

The headmaster suggests Castle and Beckett talk to Donny's friends when they're inquiring about Donny.

EVIDENCE:

(Headmaster:)and Donny was one of our brightest.
We thought he'd do great things.
(Castle:)Any idea what he would have been
doing at Central Park at night?
(Headmaster:)The truth is, all these kids are icebergs.
We only see the tip.
(Headmaster:)If you want to know the rest,
you should probably talk to his friends.
(Headmaster:)Amanda, Romy, Brandon, Spencer
and Max.

Step 1: Retrieved dialogue

Retrieval via CrossEncoder trained on MS MARCO. H1 generation via ChatGPT. Sampled top 4 scores; 7 were >0.

(Headmaster:)The family had been very generous in the past,
(Headmaster:)and Donny was one of our brightest. We thought he'd do great things.
'1': 'The family has been generous to the school in the past.',
'2': 'Donny was considered one of the brightest students at the school.',
'3': 'The school had high expectations for Donny's future success.',
'4': 'Donny was expected to achieve great things.',
'5': 'The headmaster expressed positive sentiments about Donny.'
(Headmaster:) and Donny was one of our brightest. We thought he'd do great things.
(Castle:) Any idea what he would have been doing at Central Park at night?
'1': 'Donny was considered as one of the brightest students by the Headmaster.',
'2': 'The Headmaster had high expectations for Donny, believing he would achieve great things.',
'3': 'Donny is no longer around or has disappeared as indicated by the Castle's question.',
'4': 'Central Park is a location where Donny was found to be at night.',
'5': 'There is uncertainty about what Donny would have been doing at Central Park at night.'
(Headmaster:) Amanda, Romy, Brandon, Spencer and Max.
(Headmaster:) It's strange seeing them without Donny.
'1': 'The Headmaster knows at least five people: Amanda, Romy, Brandon, Spencer, and Max.',
'2': 'The person named Donny is not present in the group.',
'3': 'The group consists of five individuals.',
'4': 'Amanda, Romy, Brandon, Spencer, and Max are present at the moment.',
'5': 'The Headmaster finds it unusual that Donny is not with the group.'
(Headmaster:) It's strange seeing them without Donny.
(Beckett:) Thank you.
'1': 'The Headmaster mentioned someone named Donny.',
'2': 'Beckett expressed gratitude in response to the Headmaster's statement.',
'3': 'Donny is absent from the current situation.',
'4': 'The Headmaster and Beckett are having a conversation.',
'5': 'There is a sense of unfamiliarity in the situation due to Donny's absence.'

Step 2: Filter dialogue

Via CrossEncoder trained on SNLI and MultiNLI

Using threshold of -0.1 for E classification.
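
The thread does not say what the -0.1 is compared against. One possible reading, sketched below purely as an assumption, is a margin on the NLI logits: accept entailment when its logit is no more than 0.1 below the best competing label. The label names and the function are hypothetical.

```python
def is_entailment(logits: dict[str, float], margin: float = -0.1) -> bool:
    """Accept entailment when (entailment logit - best other logit) > margin."""
    best_other = max(v for k, v in logits.items() if k != "entailment")
    return logits["entailment"] - best_other > margin


# Entailment is slightly below neutral but still inside the margin.
near_miss = is_entailment({"contradiction": -3.0, "entailment": 1.95, "neutral": 2.0})
clear_miss = is_entailment({"contradiction": -3.0, "entailment": 1.0, "neutral": 2.0})
```

A margin of 0 would reduce this to a plain argmax over the three labels.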

H0: The headmaster suggests Castle and Beckett talk to Donny's friends when they're inquiring about Donny. FILTERED DIALOGUE: NONE

Step 3: Branching

Via ChatGPT

1. The headmaster suggests Castle and Beckett talk to Donny's friends.
2. Castle and Beckett are inquiring about Donny.

Step 4: Dialogue retrieval

Retrieval via CrossEncoder trained on MS MARCO. H1 generation via ChatGPT.

H0: The headmaster suggests Castle and Beckett talk to Donny's friends. Retrieved dialogue:

(Headmaster:) Amanda, Romy, Brandon, Spencer and Max.
(Headmaster:) It's strange seeing them without Donny.

Same H1s.

(Headmaster:) It's strange seeing them without Donny.
(Beckett:) Thank you.

Same H1s.

H0: Castle and Beckett are inquiring about Donny. Retrieved dialogue:

(Mrs Kendall:)You get used to a certain way of living, but it was no one's fault.
(Castle:)Do you think Donny was depressed?
'1': 'Mrs. Kendall lived a certain way of living, and she believes it was not anyone's fault.',
'2': 'Castle asked Mrs. Kendall if she thinks Donny was depressed.',
'3': 'Mrs. Kendall didn't blame anyone for the certain way of living they got used to.',
'4': 'Castle inquired about Donny's mental state, suggesting the possibility of depression.',
'5': 'The dialogue implies that Donny might have been dealing with depression.'
(Castle:)Do you think Donny was depressed?
No, Donny was strong. But to lose everything you have...
'1': 'Castle expressed concern about Donny's mental state.',
'2': 'The speaker believes Donny was not depressed.',
'3': 'Castle mentioned that Donny was strong.',
'4': 'Donny lost everything he had.',
'5': 'The speaker acknowledged the difficulty of losing everything.'
(Headmaster:) and Donny was one of our brightest. We thought he'd do great things.
(Castle:) Any idea what he would have been doing at Central Park at night?

Same H1s.

(Headmaster:) It's strange seeing them without Donny.
(Beckett:) Thank you.

Same H1s.

Step 5: Filtering

Via CrossEncoder trained on SNLI and MultiNLI

H0: Castle and Beckett are inquiring about Donny. FILTERED H1s:

"Castle asked Mrs. Kendall if she thinks Donny was depressed."
"Castle expressed concern about Donny's mental state."

H0: The headmaster suggests Castle and Beckett talk to Donny's friends. FILTERED H1s: None.

Cheating: Backtracking and setting retrieval threshold to -0.2

Extra dialogue:

(Headmaster:) If you want to know the rest, you should probably talk to his friends.
(Headmaster:) Amanda, Romy, Brandon, Spencer and Max.

H1s:

'1': 'The headmaster mentioned someone named Amanda.',
'2': 'The headmaster mentioned someone named Romy.',
'3': 'The headmaster mentioned someone named Brandon.',
'4': 'The headmaster mentioned someone named Spencer.',
'5': 'The headmaster mentioned someone named Max.'

Added the automatic X said Y: "The Headmaster said if you want to know the rest, you should probably talk to his friends."

FILTERED H1s: Still none

katesanders9 commented 1 year ago

Notes

Based on the previous example,

  1. Various hyperparameters can be tuned for the pipeline (lines per chunk, number of retrieved chunks, retrieval score threshold, entailment threshold).
  2. For semantically complex dialogue passages, setting the number of generated hypotheses to >5 might be necessary.
  3. Retrieving too many dialogue passages might not be too bad (better too many than too few, except for the small GPT cost).
  4. Generating H1s and running them against a cross encoder may be too stringent, even with the extra score cutoff margin.
  5. Coreference is probably the main problem with the second dialogue branch in the previous example.
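
To make the tunable values from these notes explicit, they could be collected in one config object. This is only a sketch: the field names are hypothetical, and the defaults are the values mentioned in this thread (2 lines per chunk, top 3 chunks with score > 0, five H1 candidates, -0.1 entailment threshold).

```python
import dataclasses
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineConfig:
    lines_per_chunk: int = 2             # l=2 dialogue lines per chunk
    top_k_chunks: int = 3                # keep the top 3 retrieved chunks
    retrieval_min_score: float = 0.0     # keep chunks with score > 0
    num_h1_candidates: int = 5           # statements requested per chunk
    entailment_threshold: float = -0.1   # threshold used for E classification
    max_depth: int = 1                   # branching depth (D)


default = PipelineConfig()
# Looser settings along the lines of notes 2 and 3; the value 8 is arbitrary.
relaxed = dataclasses.replace(default, retrieval_min_score=-0.2, num_h1_candidates=8)
```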
katesanders9 commented 1 year ago

Trying one more

HYPOTHESIS: Ted felt happy when Robin asked Ted to be best man.

EVIDENCE:

(Robin:)Ted, will you be my best man?
(Robin:)Scherbatsky...
(Ted:)I'm gonna crush it.
(Barney:)Hi, there.

Dialogue retrieval

Taking top 4

(Ted:)Big mistake, but go on.\n(Robin:)I could really use someone like you.
(Ted:)Are you asking what I\nthink you're asking?\n(Robin:)Ted, will you be my best man?
(Robin:)Ted, will you be my best man?\n(Robin:)Scherbatsky...
(Robin:)When we all needed it,\nyou got us back on the right path.\n(Ted:)It's what I do.

Filtering

None

Branching

1. Ted felt happy. 
2. Robin asked Ted to be best man.

Retrieval

H0: Ted felt happy.

(Robin:)When we all needed it,\nyou got us back on the right path.\n(Ted:)It's what I do.
(Ted:)Big mistake, but go on.\n(Robin:)I could really use someone like you.
(Robin:)Ted, will you be my best man?\n(Robin:)Scherbatsky...
(Ted:)I'm gonna crush it.\n(Barney:)Hi, there.

H0: Robin asked Ted to be best man.

(Ted:)It's what I do.\n(Robin:)If I ever get married\nand you're not the guy I'm marrying...
(Ted:)Big mistake, but go on.\n(Robin:)I could really use someone like you.
(Ted:)Are you asking what I\nthink you're asking?\n(Robin:)Ted, will you be my best man?
(Robin:)Ted, will you be my best man?\n(Robin:)Scherbatsky...

Filtering

H0: Ted felt happy. H1: Ted appears to be determined and enthusiastic.

H0: Robin asked Ted to be best man. H1: Robin asked Ted to be best man.

wow we got one

H0: Ted felt happy when Robin asked Ted to be best man.
    > H1: Robin asked Ted to be best man.
    > D1: (Ted:)Are you asking what I think you're asking?
          (Robin:)Ted, will you be my best man?

    > H2: Ted appears to be determined and enthusiastic.
    > D2: (Ted:)I'm gonna crush it.
          (Barney:)Hi, there.