ClearTK / cleartk

Machine learning components for Apache UIMA
http://cleartk.github.io/cleartk/

Introduction documentation feedback #37

Closed by bethard 9 years ago

bethard commented 9 years ago

Original issue 38 created by ClearTK on 2009-01-30T22:06:22.000Z:

I received the following feedback from a developer new to UIMA, NLP, and machine learning - though otherwise very sharp. I think it would be worthwhile to address all of the points that he makes.

<begin-message> I looked at the wiki earlier, was a bit overwhelmed, and abandoned it as my first source of information. After a short conversation with Kevin, it makes a bit more sense, but I think it would benefit from a paragraph or two of introduction. I may be on the edge of the expected audience, but the following are some questions I have after looking at the main page and the main wiki page. A better introduction on Google Code may not answer them all.

BTW, do we have a book in the lab library that introduces machine learning in the context of NLP? I've read Jackson and Moulinier.

http://code.google.com/p/cleartk/

"...feature extraction library" ...like what? POS, named entity, misc relationships?

"...wrappers..." UIMA wrappers? I'd like to learn more about maximum entropy, support vector machines and conditional random fields, but wouldn't expect that from a ClearTK intro.

...also sequential taggers, chunkers, role labelling and temporal resolution.

http://code.google.com/p/cleartk/w/list

-What's the Maxent classifier and how is it different than the POS tagger?

-What's a chunk tokenizer and how is that different from other kinds of tokenizers?


bethard commented 9 years ago

Comment #2 originally posted by ClearTK on 2009-02-07T03:46:04.000Z:

I have not gone through this message carefully to make sure each point is thoroughly addressed. However, I think the ConceptualOverview wiki page does a reasonably good job of addressing the main issues raised here. Please see the wiki.