louismullie / treat

Natural language processing framework for Ruby.

Support for better dependency management. #44

Open gouthamvel opened 11 years ago

gouthamvel commented 11 years ago

Why are gem dependencies resolved with "Treat::Core::Installer.install 'english'"? Couldn't this be part of the gemspec? If support for multiple languages requires additional gems, we can make it modular by creating a new gem called 'treat-german', which users can install if they need German support. I think installing dependencies manually this way does not take full advantage of Bundler.

I would like to know whether there is a specific reason why dependencies are resolved manually.

proposal:

This would be much easier to maintain and would fully support Bundler.
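As a minimal sketch of what such a modular setup might look like, here is a gemspec for a separate 'treat-german' gem that declares its own dependencies so that Bundler can resolve them. The version numbers, the author, the file path, and the 'some-german-stemmer' dependency are hypothetical placeholders, not the library's actual requirements.

```ruby
require 'rubygems'

# Hypothetical gemspec for a language-specific add-on gem.
# All names and versions below are placeholders for illustration.
TREAT_GERMAN_SPEC = Gem::Specification.new do |s|
  s.name    = 'treat-german'
  s.version = '0.0.1'
  s.summary = 'German language support for Treat (sketch).'
  s.authors = ['Example Author']
  s.files   = ['lib/treat/german.rb']

  # The core framework is an ordinary runtime dependency.
  s.add_runtime_dependency 'treat', '>= 0'
  # Placeholder for whatever gems German support would actually need.
  s.add_runtime_dependency 'some-german-stemmer', '>= 0'
end
```

With a gem like this published, a user's Gemfile would only need a line such as gem 'treat-german', and Bundler would pull in the rest.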

louismullie commented 11 years ago

Hey gouthamvel, thanks for your input. I've thought about this possibility, and it does seem like a better way of managing dependencies. One problem it doesn't solve is the downloading of external packages (such as the Stanford JARs), which are too large to put in a gem. Perhaps it would be worthwhile to use the installer just for this, though.

The other problem is compatibility between different Ruby interpreters. For example, the agnostic dependencies include some C-based gems that won't compile on JRuby. These users would have a harder time trying to install Treat on their system.
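To illustrate the interpreter problem, here is a rough sketch of filtering a dependency list by Ruby engine. The gem names and the table itself are hypothetical; only the RUBY_ENGINE constant is standard Ruby.

```ruby
# Hypothetical dependency table: gem name => whether it needs a C extension.
# These names are placeholders, not Treat's real dependency list.
AGNOSTIC_GEMS = {
  'pure-ruby-tokenizer' => false,
  'c-based-parser'      => true
}.freeze

# Returns the gems that can be installed on the given Ruby engine.
# Gems flagged as needing a C extension are skipped on JRuby.
def installable_gems(engine = RUBY_ENGINE, gems = AGNOSTIC_GEMS)
  return gems.keys if engine != 'jruby'
  gems.reject { |_name, needs_c| needs_c }.keys
end
```

Bundler's Gemfile also accepts a platforms option on individual gems, which may cover part of this without custom code.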

At this point I'm going to ask for the input of those who have contributed to this library (@bdigital @automatedtendencies @LeFnord @darkphantum @whistlerbrk) on how it could be done better. I would really like to improve the dependency management part of Treat.

gouthamvel commented 11 years ago

Regarding the "downloading of external packages": this could be a rake task, or in fact it should be configurable (I may want to download the packages separately). We will certainly have to think about supporting both MRI and JRuby. Any ideas on this?

Is there an IRC channel where we can discuss this?

kshahkshah commented 11 years ago

I agree with the proposal. I think external packages should be managed at the user's discretion, but perhaps contributors can make sure there is guidance on how to install them, whether via Homebrew, apt, dpkg, yum, etc.

tibbon commented 11 years ago

+1 on a Rake task. It is more reproducible that way for a server install, where I may or may not be able to log in and run IRB easily (using something like Rubber, Chef, etc., where I want gem installation to be automated across several servers).

gouthamvel commented 11 years ago

I'm working on this, but it might take some time before a pull request is ready. Is anyone else working on this?

tibbon commented 11 years ago

Actually, looking at the Rake tasks, there appears to be one for this already?

https://github.com/louismullie/treat/blob/master/Rakefile

(In reply to goutham's comment of Thu, Feb 21, 2013, 7:30 PM.)

tibbon commented 11 years ago

Never mind, that Rake task doesn't help unless you check out the repo.

louismullie commented 11 years ago

To my understanding, we can't have a system-wide rake task without additional dependencies? Are you thinking of creating a symlink of some sort?

tibbon commented 11 years ago

Hmm.... how does Rails work to allow something like 'rails new foo'? I know you can call that from any directory once the Rails gem is installed. I don't know off the top of my head what allows this... but maybe something like that?

(In reply to Louis Mullie's comment of Thu, Feb 21, 2013, 7:43 PM.)

gouthamvel commented 11 years ago

'rails new' is handled by an executable (a Ruby script, to be exact) in https://github.com/rails/rails/tree/master/railties/bin. Something similar should be possible here.
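For context, RubyGems puts the files listed under a gemspec's executables on the user's PATH, which is what makes a command callable from any directory. Below is a minimal sketch of what such a script might contain for Treat. The 'treat' command name, the 'install' subcommand, and the messages are assumptions for illustration, not an existing interface.

```ruby
# Sketch of what a hypothetical bin/treat script could contain.
# The command name, subcommand, and messages are illustrative assumptions.

# Maps a subcommand and its arguments to a result string.
def run_treat_command(argv)
  command, *args = argv
  case command
  when 'install'
    language = args.first || 'english'
    # A real script would invoke the library's installer here.
    "Would install packages for: #{language}"
  else
    'Usage: treat install [language]'
  end
end

# Only dispatch when executed directly, not when loaded by another file.
puts run_treat_command(ARGV) if __FILE__ == $PROGRAM_NAME
```

Keeping the dispatch logic in a plain method makes it easy to call from a provisioning tool or a spec without going through the shell.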

louismullie commented 11 years ago

Ok, we are going to start migrating the dependency management to gems. We will add the agnostic dependencies to the treat gemspec in a group called :agnostic.

We will build a gem for each supported language, starting with English, German and French. When I get time to add OpenNLP support (see my git repo), we'll add a few more in there. The gems will be named treat-english, treat-german, treat-french. Each gem will contain a directory with the name of the language inside /lib, as in rack for example: "treat/english". This won't be useful in the library, but it will make things clearer when we get to spec testing individual languages.

We'll leave the code in installer.rb as is until we can get everything working and stable.

For package management (i.e. downloads), I think our easiest bet is to leave it to Ruby. I have already built a pretty robust system to download large files, and it's working well. We'll make a rake task that allows us to call it globally.

One problem with the current dependency management is that users have to copy or re-download packages each time they install. To improve on this, we should download the packages by default to ~/.treat or something like that.
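A rough sketch of the caching idea follows: packages go to ~/.treat by default and are downloaded only when they are not already there. The method names and the TREAT_HOME override are hypothetical, and the block stands in for the existing download code.

```ruby
require 'fileutils'

# Returns the directory where packages are cached, creating it if needed.
# Defaults to ~/.treat; the TREAT_HOME override is a hypothetical addition.
def treat_package_dir(base = ENV['TREAT_HOME'] || File.join(Dir.home, '.treat'))
  FileUtils.mkdir_p(base)
  base
end

# Fetches a package only if it is not already cached.
# The block stands in for the real download logic and receives the target path.
def fetch_package(name, dir = treat_package_dir)
  path = File.join(dir, name)
  return path if File.exist?(path)
  yield path
  path
end
```

Called twice with the same package name, the second call returns the cached path without running the download block again.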

I won't be able to work on this very much in the coming weeks, but I trust some of you are interested enough in this library to help us make it better. I think this has tremendous potential if the community starts contributing to it regularly.