learntextvis / textkit

Command line tool for manipulating and analyzing text
MIT License

[In Progress] POS #20

iros closed 8 years ago

iros commented 8 years ago

Working on POS tagging.

So far I've added an install_dependencies command for the nltk packages. Running

textkit install_dependencies

will do the trick.

vlandham commented 8 years ago

Would it be possible to call the public-facing command download or downloaddata?

The name "download" would make it a bit more explicit that this command is getting something from a remote location. "install" might be more appropriate when compiling or configuring a piece of software.

I've also been trying to avoid dashes or underscores in the names. Perhaps this is misguided - but when they are used in other tools, I always forget which one (dash or underscore) is used.

vlandham commented 8 years ago

The other thing we could do is just copy the necessary files into our data directory. This has already been done for the stop word lists. If the files are small, this would simplify the user's interaction with textkit and ensure we always have a consistent location for the necessary data files.

iros commented 8 years ago

@vlandham, I pushed a commit that renames the command to "download" and replaces the prints with echo.

As for putting things in the data dir, I will investigate.
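For reference, a download step of this kind could be sketched roughly as below. This is not the code from the commit: the package names, the `download` helper, and its `fetch` and `echo` parameters are all assumptions made for illustration. The downloader is passed in as a callable so the logic can be tried without nltk or network access.

```python
# Hypothetical package list: the thread does not say which nltk packages
# the command actually fetches.
REQUIRED_PACKAGES = ["punkt", "averaged_perceptron_tagger"]


def download(packages, fetch, echo=print):
    """Fetch each package, report progress, and return the names that failed.

    fetch: callable taking a package name and returning True on success.
    echo:  callable used for output (print by default).
    """
    failed = []
    for name in packages:
        echo("Downloading %s ..." % name)
        if not fetch(name):
            failed.append(name)
            echo("Could not download %s" % name)
    return failed


# Real use (requires nltk and network access); nltk.download returns
# True when the package was fetched or is already up to date:
#   import nltk
#   download(REQUIRED_PACKAGES, nltk.download)
```

Routing all output through a single `echo` callable keeps the reporting in one place, so swapping plain prints for a CLI framework's echo function is a one-argument change.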

iros commented 8 years ago

@vlandham I pushed a fix that added POS tagging. I tried your suggestion of just adding the specific tagger to the data dir, and that works. If you're cool with this approach, I will remove the download task. The only challenge might be a version mismatch between our tagger and the nltk version someone might already have on their machine. Not sure if that's actually something to be concerned about.

Let me know what you think.
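One way the bundled-tagger approach could be wired up is to put the package's own data directory at the front of nltk's search path. The sketch below is an assumption about how that might look, not the code in this PR: `bundled_data_dir` and `prefer_bundled_data` are hypothetical helpers, and the directory name "data" is taken from the discussion above.

```python
import os


def bundled_data_dir(module_file):
    """Return the 'data' directory that sits next to the given module file."""
    return os.path.join(os.path.dirname(os.path.abspath(module_file)), "data")


def prefer_bundled_data(search_path, directory):
    """Move or insert directory at the front of a search-path list.

    Entries are consulted in order, so the first one wins.
    """
    if directory in search_path:
        search_path.remove(directory)
    search_path.insert(0, directory)
    return search_path


# Real use (requires nltk); nltk.data.path is the list of directories
# nltk searches for resources such as taggers:
#   import nltk
#   prefer_bundled_data(nltk.data.path, bundled_data_dir(__file__))
```

For this to work, the bundled directory would need to mirror the layout nltk expects (for example a `taggers/` subdirectory). Searching the bundled copy first means a tagger already installed on the user's machine is not picked up instead, though it does not by itself guarantee that the bundled file loads under every nltk version.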

iros commented 8 years ago

Well, I screwed this up majorly by trying to remove that one file... UGH. Sorry y'all. I'll move this into another PR later. Otherwise the code is ready for review, so we can do that here if that makes sense, @vlandham.

vlandham commented 8 years ago

Looks great, and I'm excited to have it!

Do we still need textkit/install_dependencies.py? It is still in the modified files - but I think you replaced it with download, right?

vlandham commented 8 years ago

Closing this one, as it has already been done in #30.