sergey-tihon / Stanford.NLP.NET

Stanford NLP for .NET
http://sergey-tihon.github.io/Stanford.NLP.NET/
MIT License

How to connect Stanford Core NLP Server and view output? #83

Closed satishkumarkt closed 6 years ago

satishkumarkt commented 6 years ago

I am trying to use Stanford NLP for .NET. I am very new to this.

I am able to connect to the Stanford CoreNLP server using the code below. How do I print the NER results? Am I doing this the correct way? Can anyone provide some example code? Thanks.

```
var props = new Properties();
props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner");

StanfordCoreNLPClient pipeline = new StanfordCoreNLPClient(props, "http://localhost", 9000, 2);
var text = "Kosgi Santosh sent an email to Stanford University. He didn't get a reply.";
var annotation = new edu.stanford.nlp.pipeline.Annotation(text);
pipeline.annotate(annotation);
```

sergey-tihon commented 6 years ago

I have never tried the built-in client implementation, but I would guess that the annotation it returns is very similar to the one you get from in-process execution.

Here is a sample that shows the approach:

https://github.com/sergey-tihon/Stanford.NLP.NET/blob/master/tests/Stanford.NLP.CoreNLP.FSharp.Tests/CoreNLP.fs#L18-L32

satishkumarkt commented 6 years ago

Thanks @sergey-tihon. When I try the following code, it gives an error.

```
var tokens = sentence.get(new CoreAnnotations.TokensAnnotation().getClass()); // if I use "as ArrayList" here, it returns null
Console.WriteLine(tokens);

foreach (CoreLabel token in tokens) // ERROR - foreach statement cannot operate on variables of type 'object' because 'object' does not contain a public definition for 'GetEnumerator'
{
    var word = token.get(new CoreAnnotations.TextAnnotation().getClass());
    var pos = token.get(new CoreAnnotations.PartOfSpeechAnnotation().getClass());
    var ner = token.get(new CoreAnnotations.NamedEntityTagAnnotation().getClass());
    var normalizedner = token.get(new CoreAnnotations.NormalizedNamedEntityTagAnnotation().getClass());
    //var time = token.get(new TimeExpression.Annotation().getClass()) as TimeExpression;
    Console.WriteLine("{0} \t[pos={1}; \tner={2}; \tnormner={3}", word, pos, ner, normalizedner);
}
```

If I cast tokens to a .NET list of CoreLabel, it fails with: Unable to cast object of type 'SubList' to type 'System.Collections.Generic.List`1[edu.stanford.nlp.ling.CoreLabel]'.

Can you please help?

sergey-tihon commented 6 years ago

It should be java.util.ArrayList. You can check the actual type of the tokens object in the debugger, or call tokens.GetType() and print it to the console.

Annotations contain Java data structures, so you cannot cast them to .NET collections and cannot use .NET generic collections.

satishkumarkt commented 6 years ago

Sorry to interrupt you

If I cast to java.util.ArrayList, it throws the following error: System.NullReferenceException: 'Object reference not set to an instance of an object'

```
var tokens = sentence.get(new CoreAnnotations.TokensAnnotation().getClass());
Console.WriteLine(tokens);
```

The statement above prints:

[Kosgi-1, Santosh-2, sent-3, an-4, email-5, to-6, Stanford-7, University-8, .-9]
[He-1, did-2, n't-3, get-4, a-5, reply-6, .-7]

If I cast to java.util.ArrayList, it prints nothing.

My entire code is:

```
var props = new Properties();
props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner");

StanfordCoreNLPClient pipeline = new StanfordCoreNLPClient(props, "http://localhost", 9000, 2);
var text = "Kosgi Santosh sent an email to Stanford University. He didn't get a reply.";
var annotation = new edu.stanford.nlp.pipeline.Annotation(text);
pipeline.annotate(annotation);

var sentences = annotation.get(new CoreAnnotations.SentencesAnnotation().getClass()) as java.util.ArrayList;
Console.WriteLine(sentences);

foreach (CoreMap sentence in sentences)
{
    var tokens = sentence.get(new CoreAnnotations.TokensAnnotation().getClass()) as java.util.ArrayList;
    Console.WriteLine(tokens);
    Console.WriteLine("----");
    foreach (CoreLabel token in tokens)
    {
        var word = token.get(new CoreAnnotations.TextAnnotation().getClass());
        var pos = token.get(new CoreAnnotations.PartOfSpeechAnnotation().getClass());
        var ner = token.get(new CoreAnnotations.NamedEntityTagAnnotation().getClass());
        var normalizedner = token.get(new CoreAnnotations.NormalizedNamedEntityTagAnnotation().getClass());
        //var time = token.get(new TimeExpression.Annotation().getClass()) as TimeExpression;
        Console.WriteLine("{0} \t[pos={1}; \tner={2}; \tnormner={3}", word, pos, ner, normalizedner);
    }
}
```

sergey-tihon commented 6 years ago

> Unable to cast object of type 'SubList' to type

Could you try casting to SubList?

Or put a breakpoint on the line with Console.WriteLine(sentences); and check the full type name in the debugger?

satishkumarkt commented 6 years ago

The type of sentences is {Name = "ArrayList" FullName = "java.util.ArrayList"}.
The type of tokens is {Name = "SubList" FullName = "java.util.ArrayList+SubList"}.

How do I cast to SubList?

sergey-tihon commented 6 years ago

My first idea is that SubList should be a nested class, so you could try java.util.ArrayList.SubList. If that does not work, you could try to cast/coerce to the java.util.List interface.

satishkumarkt commented 6 years ago

Can you please provide some example code that casts java.util.ArrayList+SubList and iterates over it to get the NER tags? Is there any other way to connect to the NLP server from C#?

sergey-tihon commented 6 years ago

I am on vacation this week without access to my workstation, so I can only take a look at a sample next week.

> Is there any other way to connect to the NLP server from C#?

I do not know of any others; I always run it in-process. But you could always call the REST API directly (I guess).
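
A minimal sketch of the REST route, assuming the server is listening on http://localhost:9000 and that its JSON output nests tokens under `sentences[].tokens[]` with `word`, `pos`, and `ner` fields. The names `CoreNlpRest`, `AnnotateAsync`, and `ParseTokens` are hypothetical, chosen only for illustration. It uses `HttpClient` and `System.Text.Json` (built into .NET Core 3.0 and later, a NuGet package elsewhere) rather than any IKVM types, and has not been checked against every server version.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

static class CoreNlpRest
{
    // Assumption: the server was started on port 9000, as elsewhere in this thread.
    const string ServerUrl = "http://localhost:9000/";

    // POSTs the raw text to the server and returns the JSON response body.
    public static async Task<string> AnnotateAsync(HttpClient http, string text)
    {
        // Annotators and output format are passed in the "properties" query parameter.
        var props = "{\"annotators\":\"tokenize,ssplit,pos,lemma,ner\",\"outputFormat\":\"json\"}";
        var url = ServerUrl + "?properties=" + Uri.EscapeDataString(props);
        using var body = new StringContent(text, Encoding.UTF8, "text/plain");
        using var response = await http.PostAsync(url, body);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }

    // Extracts (word, pos, ner) for every token of every sentence in the JSON.
    public static List<(string Word, string Pos, string Ner)> ParseTokens(string json)
    {
        var result = new List<(string Word, string Pos, string Ner)>();
        using var doc = JsonDocument.Parse(json);
        foreach (var sentence in doc.RootElement.GetProperty("sentences").EnumerateArray())
        {
            foreach (var token in sentence.GetProperty("tokens").EnumerateArray())
            {
                result.Add((token.GetProperty("word").GetString(),
                            token.GetProperty("pos").GetString(),
                            token.GetProperty("ner").GetString()));
            }
        }
        return result;
    }

    static async Task Main()
    {
        using var http = new HttpClient();
        var json = await AnnotateAsync(http, "Kosgi Santosh sent an email to Stanford University.");
        foreach (var (word, pos, ner) in ParseTokens(json))
        {
            Console.WriteLine("{0}\t[pos={1};\tner={2}]", word, pos, ner);
        }
    }
}
```

Keeping `ParseTokens` separate from the HTTP call means the parsing can be exercised against a saved response without a running server.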

sergey-tihon commented 6 years ago

To work around the issue with java.util.ArrayList+SubList, you could cast to the common base class, java.util.AbstractList, and then iterate over the list.

The following sample works fine for me with the server started as:

```
java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000
```

```
using edu.stanford.nlp.ling;
using edu.stanford.nlp.pipeline;
using edu.stanford.nlp.util;
using System;

namespace standfordnlp
{
    class StanfordCoreNlpClient
    {
        readonly static java.lang.Class sentencesAnnotationClass = new CoreAnnotations.SentencesAnnotation().getClass();
        readonly static java.lang.Class tokensAnnotationClass = new CoreAnnotations.TokensAnnotation().getClass();
        readonly static java.lang.Class textAnnotationClass = new CoreAnnotations.TextAnnotation().getClass();
        readonly static java.lang.Class partOfSpeechAnnotationClass = new CoreAnnotations.PartOfSpeechAnnotation().getClass();
        readonly static java.lang.Class namedEntityTagAnnotationClass = new CoreAnnotations.NamedEntityTagAnnotation().getClass();
        readonly static java.lang.Class normalizedNamedEntityTagAnnotation = new CoreAnnotations.NormalizedNamedEntityTagAnnotation().getClass();

        // Sample from https://stanfordnlp.github.io/CoreNLP/corenlp-server.html
        static void Main()
        {
            // creates a StanfordCoreNLP object with POS tagging, lemmatization, NER, parsing, and coreference resolution
            var props = new java.util.Properties();
            props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
            StanfordCoreNLPClient pipeline = new StanfordCoreNLPClient(props, "http://localhost", 9000, 2);
            // read some text in the text variable
            var text = "Kosgi Santosh sent an email to Stanford University.";
            // create an empty Annotation just with the given text
            Annotation document = new Annotation(text);
            // run all Annotators on this text
            pipeline.annotate(document);

            var sentences = document.get(sentencesAnnotationClass) as java.util.AbstractList;
            foreach (CoreMap sentence in sentences)
            {
                var tokens = sentence.get(tokensAnnotationClass) as java.util.AbstractList;
                Console.WriteLine("----");
                foreach (CoreLabel token in tokens)
                {
                    var word = token.get(textAnnotationClass);
                    var pos = token.get(partOfSpeechAnnotationClass);
                    var ner = token.get(namedEntityTagAnnotationClass);
                    Console.WriteLine("{0}\t[pos={1};\tner={2};", word, pos, ner);
                }
            }
        }
    }
}
```

sergey-tihon commented 6 years ago

I added this example to the site: https://sergey-tihon.github.io/Stanford.NLP.NET//samples/CoreNLP.Server.html

I hope that my sample works for you. Please reopen this issue if you have more questions.

satishkumarkt commented 6 years ago

Ya, it works. Thanks for your support @sergey-tihon

jgazelle commented 5 years ago

Hi, I am getting sentences as null in the line below. Can someone help?

var sentences = document.get(sentencesAnnotationClass) as java.util.AbstractList;