Wiki: Extending MyFirstDKProProject

GoogleCodeExporter commented 9 years ago

In preparation for the lecture "Educational NLP" (2013), we want to extend the 
Wiki tutorial "MyFirstDKProProject" with an NER component and the Stanford 
Parser. The tutorial currently covers neither.

Original issue reported on code.google.com by richard....@googlemail.com on 8 Apr 2013 at 3:04

GoogleCodeExporter commented 9 years ago

More tutorials are great!

IMHO, the "first steps" tutorial is already way too long. How about adding a 
separate tutorial page for setting up a pipeline with parser and NER? That new 
tutorial could assume that people already know how to set up a DKPro/Maven 
project and could focus on the actual pipeline implementation.

Would it be reasonable to use the OpenNLP parser and NER instead of Stanford, 
in an effort to limit tutorials mostly to ASL (Apache-licensed) components?

Original comment by richard.eckart on 8 Apr 2013 at 6:29

Added labels: Type-Enhancement
Removed labels: Type-Defect

GoogleCodeExporter commented 9 years ago

Original comment by richard.eckart on 8 Apr 2013 at 7:10

GoogleCodeExporter commented 9 years ago

I like the idea of adding a separate tutorial page for setting up a pipeline 
with parser and NER and would actually prefer that over extending the existing 
First Steps tutorial.

Regarding the parser and NER, I would prefer Stanford, because that is what we 
use in practice. Presumably there are reasons why we typically use Stanford 
components rather than OpenNLP components?

Original comment by eckle.kohler on 8 Apr 2013 at 7:35

GoogleCodeExporter commented 9 years ago

The Stanford components probably produce better output, but we don't have any 
tests that actually show that.

Original comment by richard.eckart on 8 Apr 2013 at 7:40

GoogleCodeExporter commented 9 years ago

The First-Steps tutorial uses runPipeline to do the analysis, which means that 
a reader and a writer are required. In conversations with some users, I noticed 
that there is a desire to use the analysis results directly in the code after 
the pipeline has run, without first writing them to files and then reading them 
back in.

What is supposed to happen with the analysis results in the new tutorial?

Original comment by richard.eckart on 9 Apr 2013 at 10:05

GoogleCodeExporter commented 9 years ago

What about the ClearNLP dependency parser? Do we have any experience with it?

Original comment by eckle.kohler on 9 Apr 2013 at 7:26

GoogleCodeExporter commented 9 years ago

Chris had a student compare different dependency parsers. I believe the MATE 
parser came out best, with MaltParser in second place. I see no strong reasons 
against using GPL components if it serves your purposes. Since the DKPro 
components are easily exchangeable, an example using a GPL component should 
work with very little modification for a non-GPL component of the same kind.

Original comment by richard.eckart on 9 Apr 2013 at 7:44

GoogleCodeExporter commented 9 years ago

Having considered the discussion and input so far, I suggest writing a tutorial 
that describes:
- processing a small paragraph with Stanford CoreNLP components: 
StanfordSegmenter, StanfordNamedEntityRecognizer, StanfordParser
- and writing out the noun phrases (NP) and the Named Entities (NE) occurring 
in those NPs to a file (CSV format), e.g.
NP   the new product of UKP Lab   NE UKP Lab
NP   the latest announcements in the news   NE -

This implies describing how to handle the phrase structure parse tree, rather 
than the dependency representation.
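
The output format proposed above can be sketched without any DKPro or UIMA 
dependency. The following is only an illustration under stated assumptions, not 
the tutorial's code: `NpNeTable` and `Span` are hypothetical names, `Span` 
stands in for an annotation with character offsets, and the tab delimiter is 
assumed because the thread says "CSV" without fixing a separator. In a real 
pipeline the spans would be the NP and named entity annotations produced by the 
parser and the NER component, and the containment check could be left to 
something like uimaFIT's `JCasUtil.selectCovered`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Stand-alone sketch of the proposed NP/NE rows (all names are hypothetical). */
final class NpNeTable {

    /** Hypothetical stand-in for an annotation: offsets [begin, end) into the text. */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }

        /** Locates the first occurrence of phrase in document and wraps it as a span. */
        static Span find(String document, String phrase) {
            int start = document.indexOf(phrase);
            if (start < 0) {
                throw new IllegalArgumentException("Phrase not found: " + phrase);
            }
            return new Span(start, start + phrase.length());
        }

        /** True if other lies completely inside this span. */
        boolean covers(Span other) {
            return begin <= other.begin && other.end <= end;
        }

        String textIn(String document) {
            return document.substring(begin, end);
        }
    }

    /** One row per NE inside an NP, or a single row with "-" if the NP contains none. */
    static List<String> rows(String text, List<Span> nounPhrases, List<Span> namedEntities) {
        List<String> rows = new ArrayList<>();
        for (Span np : nounPhrases) {
            boolean found = false;
            for (Span ne : namedEntities) {
                if (np.covers(ne)) {
                    rows.add(row(np.textIn(text), ne.textIn(text)));
                    found = true;
                }
            }
            if (!found) {
                rows.add(row(np.textIn(text), "-"));
            }
        }
        return rows;
    }

    /** Assumed layout: NP <tab> phrase <tab> NE <tab> entity. */
    private static String row(String np, String ne) {
        return String.join("\t", "NP", np, "NE", ne);
    }

    public static void main(String[] args) {
        String text = "the new product of UKP Lab and the latest announcements in the news";
        List<Span> nps = Arrays.asList(
                Span.find(text, "the new product of UKP Lab"),
                Span.find(text, "the latest announcements in the news"));
        List<Span> nes = Arrays.asList(Span.find(text, "UKP Lab"));
        for (String line : rows(text, nps, nes)) {
            System.out.println(line);
        }
    }
}
```

An NP that contains several named entities yields one row per entity here; 
whether the tutorial should instead join them into a single cell is an open 
design choice.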

If nobody objects by 10 am tomorrow, we will proceed this way.

Original comment by eckle.kohler on 10 Apr 2013 at 5:58

GoogleCodeExporter commented 9 years ago

Ok, sounds like a "writer" component will be written to produce the output, and 
the processing happens, as usual, with SimplePipeline. I'll try, at some point, 
to write some documentation on different modes of running components (e.g. with 
JCasIterable) in the uimaFIT docs.

Original comment by richard.eckart on 10 Apr 2013 at 6:34

GoogleCodeExporter commented 9 years ago

Examples need to be upgraded to DKPro Core 1.5.0/uimaFIT 2.0.0.

Original comment by richard.eckart on 14 Aug 2013 at 9:20

GoogleCodeExporter commented 9 years ago

@Nico: anything left to do here?

Original comment by richard.eckart on 29 Sep 2013 at 2:58

Changed state: Accepted
Added labels: Milestone-1.5.0

GoogleCodeExporter commented 9 years ago

@Richard: I don't think it has been upgraded to DKPro Core 1.5.0 and uimaFIT 
2.0.0 yet. Apart from that, there is nothing left to do.

Original comment by nico.erbs@gmail.com on 29 Sep 2013 at 7:03

GoogleCodeExporter commented 9 years ago

Updated to DKPro Core 1.5.0.

Original comment by richard.eckart on 1 Oct 2013 at 3:39

Changed state: Fixed

xiaoyangren / dkpro-core-asl

Wiki: Extending MyFirstDKProProject #128