knmnyn / ParsCit

An open-source CRF Reference String Parsing Package
http://wing.comp.nus.edu.sg/parsCit
GNU Lesser General Public License v3.0

Order of fields for learning process #1

Closed: adibaba closed this issue 13 years ago

adibaba commented 13 years ago

Hello ParsCit team,

Is the order of fields in the "Chunk tagged data" [1] important for the learning process?

E.g., are these two equivalent for the learning process?

[author] Kevin Smith [/author] [title] My thesis. [/title]

[title] My thesis. [/title] [author] Kevin Smith [/author]

Best regards, Adrian

[1] http://aye.comp.nus.edu.sg/parsCit/#gsiso

knmnyn commented 13 years ago

Hi Adrian,

Yes, the order of the fields in the data is actually very important. The CRF learns patterns from the tagged sequences, and since reference strings obey some ordering constraints, results can differ a lot when the fields are presented in different orders. Hope that answers your question.
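(Editorial illustration, not part of the original reply.) One way to see why order matters: a linear-chain CRF scores transitions between adjacent labels, so the two orderings from the question expose different label transitions to the learner. The sketch below is only a toy in Python, not ParsCit's actual pipeline; the helper names `to_labeled_tokens` and `label_transitions` are hypothetical, and the bracketed tag syntax is simply copied from the question above.

```python
import re
from collections import Counter

# Matches "[label] some text [/label]" as written in the question above.
TAG_RE = re.compile(r"\[(\w+)\]\s*(.*?)\s*\[/\1\]")

def to_labeled_tokens(tagged):
    """Split a tagged reference string into (token, label) pairs."""
    pairs = []
    for label, text in TAG_RE.findall(tagged):
        pairs.extend((token, label) for token in text.split())
    return pairs

def label_transitions(tagged):
    """Count adjacent label pairs, the kind of evidence a
    linear-chain model uses for its transition weights."""
    labels = [label for _, label in to_labeled_tokens(tagged)]
    return Counter(zip(labels, labels[1:]))

author_first = "[author] Kevin Smith [/author] [title] My thesis. [/title]"
title_first = "[title] My thesis. [/title] [author] Kevin Smith [/author]"

a = label_transitions(author_first)
b = label_transitions(title_first)

# Only the first ordering contains an author -> title transition;
# only the second contains title -> author.
print(a)
print(b)
```

Both strings contain the same tokens and labels, but the counted transitions differ, which is the sense in which the training data are not equivalent.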

Cheers,

Min

Min-Yen KAN (Dr) :: Associate Professor :: National University of Singapore :: NUS School of Computing, AS6 05-12, 13 Computing Drive Singapore 117417 :: 65-6516 1885(DID) :: 65-6779 4580 (Fax) :: kanmy@comp.nus.edu.sg (E) :: www.comp.nus.edu.sg/~kanmy (W)

adibaba commented 13 years ago

Hello Min,

Thank you for your quick reply. That helps a lot.

Best regards, Adrian