taku910 / crfpp

CRF++: Yet Another CRF toolkit

template file: unigram #28

Open csrgxtu opened 8 years ago

csrgxtu commented 8 years ago

How would you explain the meaning of the following template?

U15:%x[-2,1]/%x[-1,1]

garfieldnate commented 8 years ago

U15 is the feature id/name. The %x[...] sections are macros specifying which parts of the data vectors to use in the feature. The first number is the token position relative to the current token. The second number is the index of the feature (the column). Both numbers start at 0, not 1: 0 in the first slot means the current token, and 0 in the second slot means the first column.

Let's say your features look like this:

Word POS
I pronoun
am verb
very adverb
tired adjective

This is a 2-d data vector extracted from the sentence "I am very tired." When CRF++ is trying to classify the word "very", %x[-2,1] refers to the token two positions back ("I") and its second feature (POS), which is pronoun. %x[-1,1] refers to the previous token ("am") and its second feature (POS), which is verb. So in this example the value given by your template would be pronoun/verb.
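To make the lookup concrete, here is a minimal Python sketch of how the macros could be expanded for the example above. It is not CRF++'s own code: `expand_template` and the `<OUT_OF_RANGE>` placeholder are hypothetical names used only for illustration, and I am assuming the whole template line (including the U15: prefix) is kept as the feature string.

```python
import re

# Matches macros of the form %x[row,col], e.g. %x[-2,1]
MACRO = re.compile(r"%x\[(-?\d+),(\d+)\]")


def expand_template(template, rows, pos):
    """Expand every %x[row,col] macro for the token at index `pos`.

    `rows` is a list of tuples, one per token, one entry per column.
    """
    def lookup(match):
        row = pos + int(match.group(1))  # first number: offset relative to current token
        col = int(match.group(2))        # second number: absolute column index
        if 0 <= row < len(rows):
            return rows[row][col]
        # Hypothetical placeholder for positions outside the sentence;
        # not claimed to be the string CRF++ itself uses.
        return "<OUT_OF_RANGE>"

    return MACRO.sub(lookup, template)


rows = [
    ("I", "pronoun"),
    ("am", "verb"),
    ("very", "adverb"),
    ("tired", "adjective"),
]

# Current token is "very" (index 2).
print(expand_template("U15:%x[-2,1]/%x[-1,1]", rows, 2))  # U15:pronoun/verb
```

Changing the second number to 0 would pick the word column instead of the POS column, so `%x[-2,0]/%x[-1,0]` at the same position would give I/am.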

zhugw commented 7 years ago

@garfieldnate Thanks. I have another question: are the two templates below the same?

U15:%x[-2,1]/%x[-1,1]
U15:%x[-2,1]%x[-1,1] # without '/'

Also, besides '/', can any other characters (e.g. +, -, |) appear in a template, and what would they mean?

garfieldnate commented 7 years ago

My guess is that the first would generate tokens like "a/cat" and the second would generate tokens like "acat". I would leave the slash there, because without it the bigram for "a way" will look the same as the unigram for "away". I don't know if you can use other characters instead. You should try it and tell us if it works. I haven't been using this project for a while (I moved to a new job).
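The collision concern can be illustrated with a short Python sketch. This only imitates string concatenation of two expanded values; `join_values` is a hypothetical helper, not part of CRF++, and it assumes the expanded values are simply placed side by side as the template is written.

```python
def join_values(values, sep):
    """Join expanded macro values with the literal separator from the template."""
    return sep.join(values)


# Two different token pairs seen by the same two-macro template.
pair_a = ("a", "way")
pair_b = ("aw", "ay")

# Without a separator both pairs collapse into the same feature string.
no_sep_a = join_values(pair_a, "")
no_sep_b = join_values(pair_b, "")
print(no_sep_a, no_sep_b, no_sep_a == no_sep_b)

# With '/' between the macros the two pairs stay distinct.
with_sep_a = join_values(pair_a, "/")
with_sep_b = join_values(pair_b, "/")
print(with_sep_a, with_sep_b, with_sep_a == with_sep_b)
```

Under that assumption, any separator character that never occurs inside the column values would serve the same purpose as '/'.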

fjwu commented 6 years ago

I am new to CRF++ and I want to perform training and testing. But when I type vim crf_learn, the terminal opens a blank screen waiting for input. I do not know what to type here. Can you help me resolve this issue? Thanks.

ankur220693 commented 6 years ago

@garfieldnate I have converted my tagged data into vector form. Can I use vectors as input to CRF++? My vector file is in .txt format.

garfieldnate commented 6 years ago

@ankur220693 I'm sorry, I haven't used this project for over 2 years now. I do not know the answer!

How did you create this "Vector" format? What program did you use?