Millymanz closed this issue 6 years ago
I have a test in F# that checks dependencies.
Annotation pipeline:
let props = Properties()
props.setProperty("annotators","tokenize, ssplit, pos, lemma, ner, parse, dcoref") |> ignore
props.setProperty("sutime.binders","0") |> ignore
props.setProperty("ner.useSUTime","0") |> ignore
Then you extract the sentences from the Annotation:
let sentences = annotation.get(CoreAnnotations.SentencesAnnotation().getClass()) :?> java.util.ArrayList
and then the dependencies between words:
printfn "\nDependencies:"
let deps = sentence.get(SemanticGraphCoreAnnotations.CollapsedDependenciesAnnotation().getClass()) :?> SemanticGraph
Expect.isNotNull deps "Semantic graph is null"
for edge in deps.edgeListSorted().toArray() |> Seq.cast<SemanticGraphEdge> do
let gov = edge.getGovernor()
Expect.isNotNull gov "Governor is null"
let dep = edge.getDependent()
Expect.isNotNull dep "Dependent is null"
printfn "%O(%s-%d,%s-%d)"
(edge.getRelation())
(gov.word()) (gov.index())
(dep.word()) (dep.index())
Is this what you are looking for?
Hi, this is not what I am looking for. I want to know how to use this:
Dictionary<int, CorefChain> coref = document.get(new CorefCoreAnnotations.CorefChainAnnotation().getClass()) as Dictionary<int, CorefChain>;
I am trying to reproduce the following coreference example, which is in Java, in C# or F#:
Map<Integer, CorefChain> coref = document.get(CorefChainAnnotation.class);
for(Map.Entry<Integer, CorefChain> entry : coref.entrySet()) {
CorefChain c = entry.getValue();
//this is because it prints out a lot of self references which aren't that useful
if(c.getCorefMentions().size() <= 1)
continue;
CorefMention cm = c.getRepresentativeMention();
String clust = "";
List<CoreLabel> tks = document.get(SentencesAnnotation.class).get(cm.sentNum-1).get(TokensAnnotation.class);
for(int i = cm.startIndex-1; i < cm.endIndex-1; i++)
clust += tks.get(i).get(TextAnnotation.class) + " ";
clust = clust.trim();
System.out.println("representative mention: \"" + clust + "\" is mentioned by:");
for(CorefMention m : c.getCorefMentions()){
String clust2 = "";
tks = document.get(SentencesAnnotation.class).get(m.sentNum-1).get(TokensAnnotation.class);
for(int i = m.startIndex-1; i < m.endIndex-1; i++)
clust2 += tks.get(i).get(TextAnnotation.class) + " ";
clust2 = clust2.trim();
//don't need the self mention
if(clust.equals(clust2))
continue;
System.out.println("\t" + clust2);
}
}
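The index arithmetic in the loop above is easy to get wrong when porting, so here is a minimal stand-alone sketch of it. It assumes what the snippet itself implies: `sentNum` and `startIndex` are 1-based, and `endIndex` is 1-based and exclusive. The `MentionSpan` class and its `text` helper are hypothetical names used for illustration only, and plain strings stand in for the `CoreLabel` tokens.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of CoreNLP: plain strings stand in for CoreLabel tokens.
final class MentionSpan {

    /**
     * Joins the tokens covered by a mention.
     * sentNum and startIndex are 1-based; endIndex is 1-based and exclusive,
     * mirroring how the loop above reads cm.sentNum, cm.startIndex and cm.endIndex.
     */
    static String text(List<List<String>> sentences, int sentNum, int startIndex, int endIndex) {
        List<String> tokens = sentences.get(sentNum - 1);
        // subList takes a 0-based inclusive start and a 0-based exclusive end.
        return String.join(" ", tokens.subList(startIndex - 1, endIndex - 1));
    }

    public static void main(String[] args) {
        // Hand-written token lists, used here only to exercise the index arithmetic.
        List<List<String>> sentences = Arrays.asList(
            Arrays.asList("Barack", "Obama", "was", "born", "in", "Hawaii", "."),
            Arrays.asList("He", "is", "the", "president", "."));

        System.out.println(text(sentences, 1, 1, 3)); // prints "Barack Obama"
        System.out.println(text(sentences, 2, 1, 2)); // prints "He"
    }
}
```

The same offsets carry over unchanged to a C# or F# port; only the collection access syntax differs.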
@Millymanz do you have the full Java sample (with the annotation pipeline definition)?
Unfortunately, I don't have the pipeline definition.
This is the best I got for the annotation:
Properties props = new Properties();
props.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
props.put("dcoref.score", true);
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
Annotation document = new Annotation("The atom is a basic unit of matter, it consists of a dense central nucleus surrounded by a cloud of negatively charged electrons.");
pipeline.annotate(document);
I've added a sample based on https://stanfordnlp.github.io/CoreNLP/coref.html. Here it is: https://github.com/sergey-tihon/Stanford.NLP.NET/commit/9881323b0fb8ce2198afc9907fbe414d75a8cfc7
It seems that the annotators list should be tokenize,ssplit,pos,lemma,ner,parse,mention,coref
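For comparison with the `dcoref` properties posted earlier, the suggested annotator list can be written as a Java `Properties` object. This is only a sketch: `CorefPipelineProps` and `build` are hypothetical names, the code uses `java.util.Properties` alone, and it does not load CoreNLP or any models.

```java
import java.util.Properties;

// Hypothetical holder class; it only assembles the configuration and does not load CoreNLP.
final class CorefPipelineProps {

    // Annotator list suggested in this thread; it replaces "dcoref" with "mention,coref".
    static final String COREF_ANNOTATORS = "tokenize,ssplit,pos,lemma,ner,parse,mention,coref";

    static Properties build() {
        Properties props = new Properties();
        props.setProperty("annotators", COREF_ANNOTATORS);
        // Same SUTime switch that the other snippets in this thread set.
        props.setProperty("ner.useSUTime", "0");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build();
        System.out.println(props.getProperty("annotators"));
    }
}
```

The resulting object would then be passed to `new StanfordCoreNLP(props)`, as in the earlier snippet.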
And for the sentence
Barack Obama was born in Hawaii. He is the president. Obama was elected in 2008.
you should see a result similar to the one from corenlp.run.
The full source code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using java.util;
using java.io;
using edu.stanford.nlp.coref;
using edu.stanford.nlp.ling;
using edu.stanford.nlp.pipeline;
using edu.stanford.nlp.util;
using Console = System.Console;
using edu.stanford.nlp.coref.data;
using java.lang;
using System.IO;
namespace standfordnlp
{
class CorefAnnotator
{
// Sample from https://stanfordnlp.github.io/CoreNLP/coref.html
static void Main()
{
var jarRoot = @"..\..\..\..\data\paket-files\nlp.stanford.edu\stanford-corenlp-full-2017-06-09\models";
Annotation document = new Annotation("Barack Obama was born in Hawaii. He is the president. Obama was elected in 2008.");
Properties props = new Properties();
props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,parse,mention,coref");
props.setProperty("ner.useSUTime", "0");
var curDir = Environment.CurrentDirectory;
Directory.SetCurrentDirectory(jarRoot);
var pipeline = new StanfordCoreNLP(props);
Directory.SetCurrentDirectory(curDir);
pipeline.annotate(document);
var corefChainAnnotation = new CorefCoreAnnotations.CorefChainAnnotation().getClass();
var sentencesAnnotation = new CoreAnnotations.SentencesAnnotation().getClass();
var corefMentionsAnnotation = new CorefCoreAnnotations.CorefMentionsAnnotation().getClass();
Console.WriteLine("---");
Console.WriteLine("coref chains");
var corefChain = document.get(corefChainAnnotation) as Map;
foreach (CorefChain cc in corefChain.values().toArray()) {
Console.WriteLine($"\t{cc}");
}
var sentences = document.get(sentencesAnnotation) as ArrayList;
foreach (CoreMap sentence in sentences.toArray()) {
Console.WriteLine("---");
Console.WriteLine("mentions");
var corefMentions = sentence.get(corefMentionsAnnotation) as ArrayList;
foreach (Mention m in corefMentions) {
Console.WriteLine("\t" + m);
}
}
}
}
}
Thank you so much for the example. You are a star.
I have been trying for a while to implement a method that performs coreference resolution using Stanford.NLP.NET. I have been testing sentences such as "Barack Obama was born in Hawaii. He is the president. Obama was elected in 2008" or "Which Apple supplier’s share price goes up the most when the company releases a new product?"
P.S. I keep getting a null value for coref. I have other code that sets up the pipeline and the libraries, which I am not showing in this example.
Can you walk through a complete working example of coreference resolution?
Thanks