banditelol / stog

AMR Parsing as Sequence-to-Graph Transduction, a fork for implementation using Stanza instead of CoreNLP
MIT License

Stanza vs CoreNLP #2

Open banditelol opened 3 years ago

banditelol commented 3 years ago

Differences between Stanza and CoreNLP, noted because I have not yet managed to host CoreNLP.

Stanza's NER does not produce the "Money" tag


```
# ::snt take a £20 note on the bus, they just tell you to get on cos theyre lazy as hell
# ::tokens ["take", "a", "\u00a3", "10", "note", "on", "the", "bus", ",", "they", "just", "tell", "you", "to", "get-on", "cos", "theyre", "lazy", "as", "hell"]
# ::lemmas ["take", "a", "\u00a3", "10", "note", "on", "the", "bus", ",", "they", "just", "tell", "you", "to", "get-on", "cos", "theyre", "lazy", "as", "hell"]
# ::pos_tags ["VB", "DT", "$", "CD", "NN", "IN", "DT", "NN", ",", "PRP", "RB", "VBP", "PRP", "TO", "COMP", "NN", "NNS", "JJ", "IN", "NN"]
# ::ner_tags ["O", "O", "O", "NUMBER", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "0", "O", "O", "O", "O", "O"]
```
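To check output like the above for a given tag without reading the lists by eye, a small script can parse the `# ::` annotation lines and pair each token with its NER tag. This is only a sketch: `parse_annotations`, `tagged_tokens`, and `has_tag` are hypothetical helper names, not part of stog or Stanza, and it assumes every list-valued field is valid JSON, as in the sample.

```python
import json


def parse_annotations(block: str) -> dict:
    """Parse '# ::key value' lines into a dict (hypothetical helper)."""
    fields = {}
    for line in block.splitlines():
        if not line.startswith("# ::"):
            continue  # skip anything that is not an annotation line
        key, _, value = line[len("# ::"):].partition(" ")
        value = value.strip()
        # List-valued fields (tokens, lemmas, pos_tags, ner_tags) are JSON arrays;
        # everything else (e.g. snt) is kept as a plain string.
        fields[key] = json.loads(value) if value.startswith("[") else value
    return fields


def tagged_tokens(fields: dict, ignore=("O",)) -> list:
    """Return (token, tag) pairs whose NER tag is not in `ignore`."""
    return [
        (token, tag)
        for token, tag in zip(fields["tokens"], fields["ner_tags"])
        if tag not in ignore
    ]


def has_tag(fields: dict, tag: str) -> bool:
    """True if any token carries the given NER tag."""
    return tag in fields["ner_tags"]
```

Calling `has_tag(fields, "MONEY")` on the parsed sample makes the missing tag explicit, and `tagged_tokens(fields)` lists every token that received a tag other than `"O"`, which also surfaces any unexpected tag values.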

How to handle Unicode characters?

Some sentences contain Unicode characters that the charmap somehow cannot recognize. This is still being explored.
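One possible cause, stated here as an assumption because the issue includes no traceback: when Python opens a file without an explicit `encoding`, it uses the locale default, which on Windows is often a charmap-based codec such as cp1252. Characters outside that codec then fail to encode or decode. A minimal sketch of the usual workaround is to pass `encoding="utf-8"` on every read and write. The helper names below are hypothetical and not taken from stog.

```python
def can_encode(text: str, codec: str) -> bool:
    """Return True if `text` can be encoded with `codec` without errors."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def write_text_utf8(path: str, text: str) -> None:
    """Write text as UTF-8 regardless of the platform's default encoding."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)


def read_text_utf8(path: str) -> str:
    """Read text as UTF-8 regardless of the platform's default encoding."""
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read()
```

Running `can_encode(sentence, "cp1252")` over the corpus is one way to find which sentences contain characters that a Windows default codec would reject, assuming that codec is in fact the one involved.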