sloria / textblob-aptagger

*Deprecated* A fast and accurate part-of-speech tagger for TextBlob.
MIT License

Unable to detect quotes #2

Open ghost opened 10 years ago

ghost commented 10 years ago

While using your tagger, I am getting good results.

However, when it comes to quotes, such as inch-symbols and quoted text, the tagger is completely ignoring the quotation marks, making it difficult for me to work with these cases.

Is there a way to make sure that the quotes are tagged as quotes, as the Stanford Parser does?

NickShahML commented 8 years ago

To add to this, is there a way for the perceptron tagger to tag all punctuation? For example, can it tag periods, question marks, quotes, and all text it comes across? That would be a really big help. Currently the second-best tagger that handles punctuation is the NLTK tagger, but yours is much better.

syllog1sm commented 8 years ago

It should tag all punctuation, but it'll have trouble with unicode entities.

Unfortunately, I'm no longer supporting this code. I'm working full time on spaCy, which is now under the MIT license too ( http://spacy.io ). spaCy handles non-ASCII characters appropriately, and is both faster and more accurate.

NLTK have recently agreed to use this tagger. However, I don't know how well they support Unicode punctuation.
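
For anyone hitting the Unicode problem in the meantime, one possible workaround is to map typographic quote characters to their ASCII equivalents before passing text to a tagger. The following is only a minimal sketch: the helper name `normalize_quotes` and the mapping table are illustrative assumptions, not part of textblob-aptagger, TextBlob, or NLTK. Whether the tagger then assigns a quote tag still depends on its model and tokenizer, which this sketch does not address.

```python
# Hypothetical pre-processing helper (an assumption for illustration,
# not an API of textblob-aptagger, TextBlob, or NLTK).
# It replaces common typographic quote characters with ASCII ones.
QUOTE_MAP = str.maketrans({
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2032": "'",  # prime, often used for feet
    "\u2033": '"',  # double prime, often used for inches
})


def normalize_quotes(text: str) -> str:
    """Return text with typographic quotes mapped to ASCII quotes."""
    return text.translate(QUOTE_MAP)


if __name__ == "__main__":
    sample = "\u201cHello,\u201d she said, holding a 5\u2033 ruler."
    print(normalize_quotes(sample))
```

The normalized string can then be handed to whichever tagger is in use; characters outside the table are left unchanged.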
