aolney / mofacts-automated-authoring

Automated content creation for MoFaCTS
Apache License 2.0

Map pronouns to their coreferents when generating items #16

Closed aolney closed 4 years ago

aolney commented 4 years ago

Linked to https://github.com/memphis-iis/mofacts-ies/issues/148

imrryr commented 4 years ago

OK, so we have 2 issue trackers related to the project... Not clear how to handle that, discuss at meeting

imrryr commented 4 years ago

So this one is ambiguous to me... I thought the goal was to do this in dialogue.... is this a typo? You imply this is a change in the stim file...

imrryr commented 4 years ago

If it is what I thought it was, a change to the dialogue module, the 7/22 date is most appropriate. We are not changing stim files for Koushik mid-semester, so if it is a change to the stim files, it is for Fall...

aolney commented 4 years ago

It's a change to the stims - the information about referents is lost when stims are produced and so is no longer available when the dialogue is generated.

imrryr commented 4 years ago

OK, good we are communicating more about it then. Let's not switch out the stims until mid August. Actually, though, I just checked Koushik's usage, and there is none so far. :(

aolney commented 4 years ago

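@imrryr I've got something working, but it has errors. Here are our options:

  • Live with the errors
  • Replace the coref resolution with something better, which may reduce the errors (2 days effort, expect 5-10% improvement tops)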

Below is some sample output with full context. A replacement is indicated with ( original | replacement ).
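
For anyone who wants to tally the sample mechanically, here is a minimal sketch of reading the ( original | replacement ) notation. The names `PAIR`, `parse_pairs`, and `render` are hypothetical, and the regular expression assumes that neither side of a pair contains parentheses or an extra `|`, which appears to hold for the sample in this thread.

```python
import re

# Matches one "(original|replacement)" pair. Assumes no nested parentheses
# and no extra "|" inside either side.
PAIR = re.compile(r"\(([^()|]+)\|([^()|]+)\)")


def parse_pairs(annotated: str) -> list[tuple[str, str]]:
    """Return every (original, replacement) pair in an annotated sentence."""
    return [(o.strip(), r.strip()) for o, r in PAIR.findall(annotated)]


def render(annotated: str, use_replacement: bool) -> str:
    """Collapse each pair to either its original or its replacement text."""
    group = 2 if use_replacement else 1
    return PAIR.sub(lambda m: m.group(group).strip(), annotated)


example = "(They|The sinuses) also serve as resonant chambers ."
print(parse_pairs(example))
print(render(example, use_replacement=True))
```

Calling `render` with `use_replacement=False` gives back the sentence as it stood before coreference resolution.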

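Sample performance summary:

  • 60 sentences
  • 9 correct
  • 5 errors
  • 35 no ops (replacements we would not have made)
  • Replacement is 64% correct
  • 71% of the time, we no op => we no op more than we are correct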
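
A quick check of the summary percentages, using the counts reported in this thread (9 correct, 5 errors, 35 no ops). The denominators are an assumption, not something the thread states: 64% is read as correct / (correct + errors), and 71% as no ops / (correct + errors + no ops).

```python
# Counts reported in the sample performance summary.
correct, errors, no_ops = 9, 5, 35

# Assumed denominators (not stated in the thread).
replacements = correct + errors   # replacements that would actually be made
judged = replacements + no_ops    # every annotated replacement

precision = correct / replacements   # 9 / 14
no_op_rate = no_ops / judged         # 35 / 49

print(f"Replacement is {precision:.0%} correct")
print(f"No op rate: {no_op_rate:.0%}")
```

Under that reading, both figures round to the percentages quoted in the summary.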

Sample performance annotated

"Obtaining oxygen and removing carbon dioxide are the primary functions of the respiratory system .", "(This system|the respiratory system) includes tubes that remove particles from incoming air and transport air into and out of the lungs , as well as microscopic air sacs where gases are exchanged .",

Correct

"(The respiratory system|This system) uses skeletal muscles , which are under voluntary control , but unless you are blowing up party balloons or playing a wind instrument , you may not be aware of the air moving in and out of your lungs .",

No op. We would not change an NP unless it began with this/that/these/those
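
The no-op rule can be sketched as a small filter. This is only an illustration: `is_candidate` and `apply_rule` are hypothetical names, and the pronoun list is an assumption added because the sample scores pairs such as (It|...) and (Its|...) as Correct or Error rather than as no ops.

```python
DEMONSTRATIVES = {"this", "that", "these", "those"}

# Assumption: bare pronouns are also replacement candidates.
PRONOUNS = {"it", "its", "they", "them", "their"}


def is_candidate(original: str) -> bool:
    """True if the original mention is one we would actually replace."""
    tokens = original.lower().split()
    if not tokens:
        return False
    if len(tokens) == 1 and tokens[0] in PRONOUNS:
        return True
    # An NP is only changed when it begins with this/that/these/those.
    return tokens[0] in DEMONSTRATIVES


def apply_rule(original: str, replacement: str) -> str:
    """Keep the original text whenever the pair is a no op."""
    return replacement if is_candidate(original) else original


print(apply_rule("This system", "the respiratory system"))
print(apply_rule("The respiratory system", "This system"))
```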

"Parts of the brainstem control breathing automatically , constantly bringing in oxygen to support aerobic production of ATP and constantly eliminating carbon dioxide .",
"(The respiratory system|The respiratory system) produces vocal sounds , participates in the sense of smell , and plays a role in the regulation of blood (pH.pH.|ATP)",

No op

"(The respiratory system|The respiratory system) consists of passages that filter , moisten , and warm incoming air and transport (it|The respiratory system) into the body , into the lungs , and to the many microscopic air sacs where gases are exchanged .",

No op

Error: should be "air" not "the respiratory system"

"The entire process of exchanging gases between the atmosphere and body cells is called respiration .",
"(It|the atmosphere and body cells) consists of several events : movement of air in and out of the lungs , commonly called breathing , or ventilation exchange of gases between the air in the lungs and the blood , called external respiration transport of gases by the blood between the lungs and body cells exchange of gases between the blood and the body cells , also called internal respiration oxygen O2 use and production of carbon dioxide (CO2|ATP) by body cells as part of the process of cellular respiration (Respiration|ATP) occurs on a macroscopic level -- (it|ATP) is a function of an organ system .",

Error: should be "respiration" not "the atmosphere and body cells"

No op

No op

Error: should be "respiration" not "ATP"

NOTE: this sentence probably has HTML parsing errors

"However , the reason that body cells must exchange gases -- take up oxygen and release carbon dioxide -- is apparent at the cellular and molecular levels .",
"Cellular respiration enables cells to harness energy held in the chemical bonds of nutrient molecules .",
"In aerobic reactions , cells liberate energy from (these molecules|the chemical bonds of nutrient molecules) by removing electrons and channeling them through a series of carriers called the electron transport chain , yielding (ATP|Respiration) .",

No op

"At the end of (this chain|the electron transport chain , yielding ATP) , electrons bind oxygen atoms and hydrogen ions to produce water molecules .",

Correct, if you can accept ", yielding ATP" at the end

"Without oxygen , these reactions cease .",
"The aerobic reactions also produce (CO2.CO2|ATP) .",

No op

"(CO2CO2|CO2.CO2) combines with water to form carbonic acid , helping to maintain blood pH. Too much (CO2,CO2|CO2.CO2) , however , lowers blood (pH|CO2.CO2) , compromising homeostasis .",

No op x 3

"(The respiratory system|The respiratory system) both provides oxygen for aerobic reactions and eliminates (CO2CO2|pH) at the appropriate rate to maintain the pH of the internal environment .",

No op x 2

"The organs of (the respiratory system|The respiratory system) can be divided into two groups , or tracts .",

No op

"The upper respiratory tract includes the nose , nasal cavity , sinuses , pharynx , and larynx .",
"The lower respiratory tract includes the trachea , bronchial tree , and lungs .",
"The respiratory structures that air passes through face the outside environment , and except for the parts where exchange of gases takes place , they are lined with mucous membrane .",
"The nose is covered with skin and is supported internally by muscle , bone , and cartilage .",
"(Its|The nose) two nostrils provide openings through which air can enter and leave (the nasal cavity|the trachea , bronchial tree) .",

Correct if you can accept lack of possessive case. That is something I could return to after concept maps, etc
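
One possible way to restore the possessive case, sketched under assumptions: `with_possessive` is a hypothetical helper, the pronoun set is illustrative, and the clitic is appended as a separate `'s` token to match the tokenized style of the sample (as in "the mucous membrane 's"). Plural referents are not handled.

```python
# Assumption: only these forms are treated as possessive pronouns.
POSSESSIVE_PRONOUNS = {"its", "their"}


def with_possessive(original: str, replacement: str) -> str:
    """Append a tokenized possessive clitic when the original was possessive."""
    if original.lower() in POSSESSIVE_PRONOUNS:
        return replacement + " 's"
    return replacement


print(with_possessive("Its", "The nose") + " two nostrils")
```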

"Many internal hairs in these openings prevent entry of large particles carried in the air .",
"The nasal cavity , a hollow space behind the nose , is divided medially into right and left portions by (the nasal septum|the nasal cavity) .",

No op

"(This cavity|the nasal septum) is separated from (the cranial cavity|the nasal septum) by the cribriform plate of (the ethmoid bone|the nasal septum) and from (the oral cavity|the nasal septum) by the hard palate .",

Correct

No op x 3

"(The nasal septum|the oral cavity) may bend during birth or shortly before adolescence .",

No op

"Such a deviated septum may obstruct (the nasal cavity|The nasal septum) , making breathing difficult .",

No op

"As figure 19.2 shows , nasal conchae curl out from the lateral walls of (the nasal cavity|the nasal cavity) on each side , forming passageways called the superior , middle , and inferior meatuses .",

No op

"(The nasal chonchae|the nasal cavity) support (the mucous membrane that lines the nasal cavity|the cribriform plate of the ethmoid bone) (the nasal cavity|the nasal cavity) .",

No op x 3

"(The conchae|the nasal cavity) also help increase (the mucous membrane 's|the nasal cavity) surface area .",

No op x 2

"The upper posterior portion of (the nasal cavity , below the cribriform plate|the mucous membrane that lines the nasal cavity) (the cribriform plate|the mucous membrane that lines the nasal cavity) , is slitlike , and (its|the mucous membrane 's) lining contains the olfactory receptors that provide the sense of smell .",

No op x 2

Error: should be "the nasal cavity" not "the mucous membrane 's"

"The remainder of (the cavity|the mucous membrane 's) conducts air to and from (the nasopharynx|the mucous membrane 's) .",

No op x 2

"(The mucous membrane lining the nasal cavity|the cribriform plate) (the nasal cavity|the nasopharynx) is composed of pseudostratified ciliated epithelium rich in mucus - secreting goblet cells .",

No op x 2

"(It|The mucous membrane lining the nasal cavity) also includes an extensive network of blood vessels and normally appears pinkish .",

Correct

"As air passes over (the membrane|the nasal cavity) , heat radiates from the blood and warms the air , adjusting its temperature to that of the body .",

No op

"At the same time , evaporation of water from the mucous membrane moistens the air .",
"The sticky mucus secreted by the mucous membrane entraps dust and other small particles entering with the air .",
"As the cilia of the epithelial cells move , they push a thin layer of mucus toward the pharynx , where the mucus and any entrapped particles are swallowed .",
"In the stomach , gastric juice destroys microorganisms in (the mucus|the membrane) , including pathogens .",

No op

"In this way , (the mucous membrane|the mucus) keeps particles from reaching the lower air passages , preventing respiratory infections .",

No op

"Clinical Application 19.1 discusses how cigarette smoking impairs the function of the respiratory system , beginning with the mucus and cilia .",
"Recall from chapter 7 that (the sinuses|the mucus and cilia) are air - filled spaces in the frontal , sphenoid , ethmoid , and maxillary bones of the skull .",

No op

"(The sinuses|the sinuses) reduce the weight of (the skull|the skull) .",

No op x 2

"(They|The sinuses) also serve as resonant chambers that affect the quality of the voice .",

Correct

"(The sinuses|The sinuses) open into (the nasal cavity|the mucous membrane) and are lined with mucous membranes that are continuous with the lining of (the nasal cavity|the mucous membrane) , allowing mucus secretions to drain from (the sinuses|The sinuses) into (the nasal cavity|the mucous membrane) .",

No op x 5

"Membranes that are inflamed and swollen because of nasal infections or allergic reactions may block this drainage , increasing pressure in a sinus and causing headache .",

"It is possible to illuminate a person 's frontal sinus in a darkened room by holding a small flashlight just beneath the eyebrow .",
"Similarly , directing (the flashlight beam|the nasal cavity) into the mouth illuminates (the maxillary sinuses|the nasal cavity) .",

No op x 2

"(The pharynx|the maxillary sinuses) is the space posterior to (the nasal cavity|the maxillary sinuses) , oral cavity , and larynx .",

No op x 2

"(It|the nasal cavity) is a passageway for food moving from (the oral cavity|the nasal cavity) to (the esophagus|the nasal cavity) and for air passing between (the nasal cavity|the nasal cavity) and (the larynx|the nasal cavity) .",

Error: should be "pharynx" not "the nasal cavity"

"(The pharynx|the larynx) also aids in producing the sounds of speech .",

No op x 2

"(It|The pharynx) can be divided into (the nasopharynx|The pharynx) , (the oropharynx|The pharynx) , and (the laryngopharynx|The pharynx) : (The larynx|The pharynx) is an enlargement in the airway superior to (the trachea|The pharynx) , anterior and somewhat inferior to (the laryngopharynx|The pharynx) .",

Correct

No op x 6

"(It|the laryngopharynx) is a passageway for air moving in and out of (the trachea|the laryngopharynx) and prevents foreign objects from entering (the trachea|the laryngopharynx) .",

Correct

No op x 2

"(The larynx|the trachea) also houses the vocal cords .",

No op

"(The larynx|The larynx) is composed of a framework of muscles and cartilages bound by elastic tissue .",

No op

"The larger of (the cartilages|The larynx) are the thyroid , cricoid , and epiglottic cartilages .",

No op

"These structures are single .",
"The other laryngeal cartilages -- the arytenoid , corniculate , and cuneiform cartilages -- are paired .",
"The thyroid cartilage was named for the thyroid gland that covers its lower area .",
"(This cartilage|The thyroid cartilage) is the shieldlike structure that protrudes in the front of the neck and is also called the \" Adam 's apple .",

Correct

imrryr commented 4 years ago

What does live with errors mean? We are making the stim files with this right? So it is not live exactly....

After seeing this I realize this is best deployed as a tool for the teachers. First, we can flag the sentences that have these referents for review by the teacher. This will make it vastly easier for the teacher.

Second, when the teacher goes in to edit these problem items, we can present your optional revisions for them to quickly approve.

How does that sound?

On Tue, Jul 28, 2020 at 3:15 PM Andrew M Olney notifications@github.com wrote:

@imrryr https://github.com/imrryr I've got something working, but it has errors. Here are our options:

  • Live with the errors
  • Replace the coref resolution with something better, which may reduce the errors (2 days effort, expect 5-10% improvement tops)

Below is some sample output with full context. A replacement is indicated with ( original | replacement ). Sample performance summary:

  • 60 sentences
  • 9 correct
  • 5 errors
  • 35 no ops (replacements we would not have made)
  • Replacement is 64% correct
  • 71% of the time, we no op => we no op more than we are correct


-- Philip I. Pavlik Jr. imrryr@gmail.com http://optimallearning.org/

imrryr commented 4 years ago

As to a better version, I think maybe it makes sense to prioritize the paraphrases. Does it seem they will have similar problems and need vetting?

imrryr commented 4 years ago

Maybe we should start small with one paraphrase per item? That way the paraphrases won't be impractical to vet.

aolney commented 4 years ago

> What does live with errors mean? We are making the stim files with this right? So it is not live exactly.... After seeing this I realize this is best deployed as a tool for the teachers. First, we can flag the sentences that have these referents for review by the teacher. This will make it vastly easier for the teacher. Second, when the teacher goes in to edit these problem items, we can present your optional revisions for them to quickly approve. How does that sound?

"live" as in "to live"

Sure, running this through the teachers makes sense if you think they will actually do it. Let's discuss at our meeting tomorrow, because if we pursue this option, Tackett and I will need to work closely.

imrryr commented 4 years ago

Yes, let's talk about it tomorrow. It does make more sense to expect better vetting since the sets of items will be smaller, as we discussed: 5, 10, and 15%.

Also, unless you think the correctness rate for the paraphrases will be much better, we already face the problem of making sure ungrammatical/incorrect sentences don't get delivered to students. I think we have this "class" of items that are "synthetic" and suspect as compared to the natural sentences from the text (which are actually in less need of vetting due to our substitution of the ones that are problematic with synthetic alternatives).

So, it seems like we need to flag these synthetics for teachers to expedite the process of review. I also think we need some way to limit the number of paraphrases, perhaps selecting the most likely to be correct, again to make sure the vetting is manageable.

Really happy to discuss any thoughts you have about this.

Ah, the joys of actual implementation....


aolney commented 4 years ago

UI integration:

{
    "sentences": [
        {
            "sentence": "These chemicals that the endocrine system produces have many and diverse effects on the body.",
            "itemId": 1040437808,
            "hasCloze": true
        }
    ],
    "clozes": [
        {
            "cloze": "The hormones that the __________ __________ produces have many and diverse effects on the body.",
            "itemId": 1040437808,
            "clozeId": -2008774504,
            "correctResponse": "endocrine system",
            "tags": {
                "weightGroup": 13,
                "orderGroup": 0,
                "corefClusters": 1,
                "corefClusterTotalWeight": 17,
                "corefClusterBackwardWeight": 0,
                "corefClusterForwardWeight": 16,
                "rootDistance": 4,
                "startDistance": 3,
                "originalItem": "These chemicals that the __________ __________ produces have many and diverse effects on the body.",
                "transformation": "coref-resolution"
            }
        },
        {
            "cloze": "The __________ __________ produces hormones that have many and diverse effects on the body.",
            "itemId": 1040437808,
            "clozeId": -2008774504,
            "correctResponse": "endocrine system",
            "tags": {
                "weightGroup": 13,
                "orderGroup": 0,
                "corefClusters": 1,
                "corefClusterTotalWeight": 17,
                "corefClusterBackwardWeight": 0,
                "corefClusterForwardWeight": 16,
                "rootDistance": 4,
                "startDistance": 3,
                "originalItem": "These chemicals that the __________ __________ produces have many and diverse effects on the body.",
                "transformation": "paraphrase"
            }
        }
    ]
}

@andrewtackett here is some example json with a coref item and a paraphrase item. FWIW we can cut the original item if you want to reconstruct it using the sentenceId and the correctResponse. Either way is fine with me.
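
If the original item were cut, rebuilding it might look like the sketch below. It assumes the sentence is looked up by `itemId` (the example JSON has no `sentenceId` field), that each word of `correctResponse` becomes one blank, and that the first occurrence of the response is the one that was blanked. `reconstruct_cloze` and `original_items` are hypothetical names.

```python
BLANK = "__________"  # blank token as it appears in the example clozes


def reconstruct_cloze(sentence: str, correct_response: str) -> str:
    """Blank out the correct response, one blank per word."""
    blanks = " ".join(BLANK for _ in correct_response.split())
    # Only the first occurrence is replaced; the JSON does not record
    # which occurrence was blanked, so this is an assumption.
    return sentence.replace(correct_response, blanks, 1)


def original_items(doc: dict) -> list[str]:
    """Rebuild the original item for every cloze in the document."""
    sentences = {s["itemId"]: s["sentence"] for s in doc["sentences"]}
    return [
        reconstruct_cloze(sentences[c["itemId"]], c["correctResponse"])
        for c in doc["clozes"]
    ]


doc = {
    "sentences": [{
        "sentence": "These chemicals that the endocrine system produces "
                    "have many and diverse effects on the body.",
        "itemId": 1040437808,
    }],
    "clozes": [{"itemId": 1040437808, "correctResponse": "endocrine system"}],
}
print(original_items(doc)[0])
```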

andrewtackett commented 4 years ago

@aolney Actually I think this format is fine. I will note that we just discussed changing transformation into a list instead of a single string though.

aolney commented 4 years ago

Thinking through this, it makes more sense to me to keep cloze as the original item, then add tags for the coref item and paraphrase item if they exist.

That would keep the semantics of cloze constant. Otherwise @andrewtackett will need to check the transformations tag to determine the semantics of cloze, i.e. whether cloze is an original item, a coref-resolved item, or a paraphrased item.

Instead I propose adding two tags, CorefResolution and Paraphrase, each of which (if they exist) will contain a cloze item with the applicable transformations.

If these tags do not exist, it means that they were identical to the original cloze.
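
On the consuming side, the proposal amounts to a lookup with a fallback. A minimal sketch, assuming each tag holds the transformed cloze text as a plain string (the proposal only says it "will contain a cloze item") and using a hypothetical helper name `variant`:

```python
def variant(cloze_item: dict, tag: str) -> str:
    """Return the transformed cloze for tag, or the original if absent.

    tag is "CorefResolution" or "Paraphrase". A missing tag means the
    transformation was identical to the original cloze.
    """
    return cloze_item.get("tags", {}).get(tag, cloze_item["cloze"])


item = {
    "cloze": "These chemicals that the __________ __________ produces "
             "have many and diverse effects on the body.",
    "tags": {
        "CorefResolution": "The hormones that the __________ __________ "
                           "produces have many and diverse effects on the body.",
    },
}
print(variant(item, "CorefResolution"))
print(variant(item, "Paraphrase"))  # no Paraphrase tag, so the original cloze
```

With this shape, `cloze` always means the original item and callers never need to inspect a transformation tag to interpret it.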

...edit...

Actually I just thought of another edge case: the cloze answer could have a resolved referent, in which case we need a new tag like corefCorrectResponse

aolney commented 4 years ago

@andrewtackett This is actual output that I have now. I think I should probably pause here to confirm the format before generating all the materials:

Sentence 1 has a paraphrase "Grab your fingers"

Sentence 2 has no paraphrase and no coreference resolution

Sentence 3 has coreference resolution

Currently the way things are running we're not layering paraphrase on top of coreference resolution. That's possible, but since I'm doing paraphrases through backtranslation, I'd rather wait to see how well my paraphrase neural network can handle it.

{
    "sentences": [
        {
            "sentence": "Snap your fingers!",
            "itemId": 736755539,
            "hasCloze": true
        },
        {
            "sentence": "John ate a burger.",
            "itemId": -765857134,
            "hasCloze": true
        },
        {
            "sentence": "It tasted good.",
            "itemId": 981979642,
            "hasCloze": true
        }
    ],
    "clozes": [
        {
            "cloze": "Snap your __________!",
            "itemId": 736755539,
            "clozeId": -906031783,
            "correctResponse": "fingers",
            "tags": {
                "weightGroup": 0,
                "clozeParaphraseTransformation": "Grab your __________!",
                "orderGroup": 0,
                "sentenceWeight": 2,
                "clozeProbability": 0.00000608192839360935,
                "syntacticRole": "dobj",
                "rootDistance": 1,
                "startDistance": 2
            }
        },
        {
            "cloze": "John ate a __________.",
            "itemId": -765857134,
            "clozeId": -640779175,
            "correctResponse": "burger",
            "tags": {
                "weightGroup": 0,
                "orderGroup": 0,
                "sentenceWeight": 4,
                "clozeProbability": 0.000006742179232986,
                "corefClusters": 2,
                "corefClusterTotalWeight": 4,
                "corefClusterBackwardWeight": 1,
                "corefClusterForwardWeight": 0,
                "rootDistance": 2,
                "startDistance": 2
            }
        },
        {
            "cloze": "It tasted __________.",
            "itemId": 981979642,
            "clozeId": -795897595,
            "correctResponse": "good",
            "tags": {
                "weightGroup": 0,
                "clozeCorefTransformation": "A burger tasted __________.",
                "correctResponseCorefTransformation": "good",
                "orderGroup": 0,
                "sentenceWeight": 2,
                "clozeProbability": 0.000218306594687214,
                "semanticRole": "ARG2",
                "rootDistance": 1,
                "startDistance": 2
            }
        }
    ]
}
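
For anyone checking the format, here is a minimal sketch (not part of the generator) of how a consumer might read this output: it joins each cloze to its sentence by `itemId` and reports which transformation tags are present. The tag names are copied from the sample above; `SAMPLE`, `available_transformations`, and `summarize` are hypothetical names used only for illustration, and the sample is trimmed to the fields the sketch reads.

```python
import json

# Trimmed copy of the sample output above (only the fields this sketch reads).
SAMPLE = json.loads("""
{
    "sentences": [
        {"sentence": "Snap your fingers!", "itemId": 736755539},
        {"sentence": "John ate a burger.", "itemId": -765857134},
        {"sentence": "It tasted good.", "itemId": 981979642}
    ],
    "clozes": [
        {"cloze": "Snap your __________!", "itemId": 736755539,
         "correctResponse": "fingers",
         "tags": {"clozeParaphraseTransformation": "Grab your __________!"}},
        {"cloze": "John ate a __________.", "itemId": -765857134,
         "correctResponse": "burger",
         "tags": {"corefClusters": 2}},
        {"cloze": "It tasted __________.", "itemId": 981979642,
         "correctResponse": "good",
         "tags": {"clozeCorefTransformation": "A burger tasted __________.",
                  "correctResponseCorefTransformation": "good"}}
    ]
}
""")


def available_transformations(cloze):
    """Return the transformed cloze texts present in a cloze's tags, keyed by kind."""
    tags = cloze.get("tags", {})
    found = {}
    if "clozeCorefTransformation" in tags:
        found["coref"] = tags["clozeCorefTransformation"]
    if "clozeParaphraseTransformation" in tags:
        found["paraphrase"] = tags["clozeParaphraseTransformation"]
    return found


def summarize(data):
    """Pair each cloze's source sentence with the kinds of transformation it carries."""
    sentences = {s["itemId"]: s["sentence"] for s in data["sentences"]}
    return [
        (sentences[c["itemId"]], sorted(available_transformations(c)))
        for c in data["clozes"]
    ]


if __name__ == "__main__":
    for sentence, kinds in summarize(SAMPLE):
        print(sentence, kinds)
```

Because the original cloze and each transformation live in separate fields, a consumer that ignores the new tags still sees the untransformed items.
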
imrryr commented 4 years ago

This requires code logic that checks, whenever an item is selected, whether an alternative is present. For coreference resolution, the alternative will be selected with 100% probability; for a paraphrase, the original and the alternative will each be selected with 50% probability.
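
A rough sketch of that selection rule, assuming the tag names from the sample output. This is not MoFaCTS scheduler code: `choose_cloze` and its `rng` parameter are hypothetical, the paraphrase is assumed to keep the original `correctResponse` (the sample has no separate paraphrase answer tag), and giving the coreference version priority when both tags are present is also an assumption.

```python
import random


def choose_cloze(cloze, rng=random):
    """Pick the (prompt, answer) pair to display for one cloze item.

    Coreference-resolved version: always used when present.
    Paraphrase: used half the time; otherwise the original cloze is shown.
    """
    tags = cloze.get("tags", {})

    # Coreference resolution replaces the original with 100% probability.
    if "clozeCorefTransformation" in tags:
        answer = tags.get("correctResponseCorefTransformation",
                          cloze["correctResponse"])
        return tags["clozeCorefTransformation"], answer

    # Paraphrase is a 50/50 choice against the original wording.
    paraphrase = tags.get("clozeParaphraseTransformation")
    if paraphrase is not None and rng.random() < 0.5:
        # Assumption: the paraphrase keeps the original correct response.
        return paraphrase, cloze["correctResponse"]

    return cloze["cloze"], cloze["correctResponse"]
```

Passing a seeded `random.Random` instance as `rng` makes the 50/50 choice reproducible for testing.
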

On Sun, Aug 9, 2020 at 10:39 PM Andrew M Olney notifications@github.com wrote:


-- Philip I. Pavlik Jr. imrryr@gmail.com http://optimallearning.org/

imrryr commented 4 years ago

Is that correct AO?

On Mon, Aug 10, 2020 at 10:19 AM Philip Pavlik imrryr@gmail.com wrote:


aolney commented 4 years ago

@imrryr Yes, on the scheduling side the logic will have to keep track of the options, but that is true in any event. By preserving the original item and its transformations in separate tags, the scheduler has maximum flexibility to make decisions on the fly IMHO.

@andrewtackett I've got the items ready to go in the format I proposed above, but I can change this to suit other formats as requested. Just let me know :wink:

andrewtackett commented 4 years ago

@aolney This format looks good. Send me the files and I'll make them into stims that Phil can upload.