bioperl / bioperl-live

Core BioPerl 1.x code
http://bioperl.org

onerous installation #314

Open tjparnell opened 5 years ago

tjparnell commented 5 years ago

BioPerl used to be relatively easy to install with v1.6.9, as the vast majority of external dependencies were merely recommended in the Build.PL script (38 recommended versus 9 required). This was super convenient as a vast majority of those may never actually be needed. If they were needed, the user would quickly discover that and could rectify it.

However, with v1.7 and the move to Dist::Zilla, EVERYTHING (81 modules listed!!! although many are standard) is now REQUIRED, imposing excessive installations of often unnecessary modules. Further complicating this is that some of these (GD, XML) require external C libraries, something that users may not have installed or may not be capable of installing. Using force to install is not really a solution, as that can just create more issues.

This seems to be a huge degradation in usability if users cannot easily install the package. I'm not too familiar with Dist::Zilla, but glancing over the tutorial, I don't see mention of recommended vs required dependencies. If it has no such distinction, this may require creative thinking.

cjfields commented 5 years ago

@tjparnell is there any reason you are installing directly from the Github repository and not from CPAN? You only need Dist::Zilla if you plan on installing from this repository directly, which we do not recommend except for development code.

carandraug commented 5 years ago

I think @tjparnell means the listed dependencies on CPAN, even if you are not installing from a git clone.

Anyway, the whole idea of the 1.7 version was to split bioperl into smaller distributions. Instead of installing hundreds of bioperl modules that don't work because of missing dependencies, you install the bioperl distributions that you need.

Currently, the list of prerequisites is filled in automatically because we are using the AutoPrereqs plugin. This can be configured in the dist.ini file. Indeed, we already skip some modules and set some to suggested dependencies. So what you want can be done. However, I would argue that it is cleaner to have them in separate distributions to avoid reaching a state where a module is "installed" but does not work because a dependency is not installed.
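
For readers unfamiliar with Dist::Zilla, the kind of dist.ini configuration described here might look roughly like the fragment below. This is only a sketch and is not copied from BioPerl's actual dist.ini: it assumes the stock AutoPrereqs plugin with its `skip` option and a stock `[Prereqs / RuntimeSuggests]` section, and it uses GD purely as an illustrative module.

```ini
; Sketch only: not taken from BioPerl's real dist.ini.
[AutoPrereqs]
; Regex of module names that the automatic scan should ignore.
skip = ^GD$

; Declare the skipped module as an optional runtime suggestion instead.
[Prereqs / RuntimeSuggests]
GD = 0
```

Without the `skip` line, the scanned module would still be listed as required alongside the suggestion, so the two sections work as a pair.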

cjfields commented 5 years ago

@carandraug missed your reply, sorry; you are correct. I misread this as needing to install Dist::Zilla for using code from the master branch. Just deleted my last reply, no longer applicable.

@tjparnell as @carandraug mentioned, the modules requiring outside deps are being split into separate repos to remove these 'recommended' dependencies, as they do actually cause issues with installation. I do think we need to evaluate whether any additional dependencies arise solely from scripts (I think GD is in this category), then evaluate whether they are really needed or belong elsewhere.

tjparnell commented 5 years ago

@carandraug and @cjfields, thanks for the replies. I am well aware of the intention to break up the monolithic package into smaller, more manageable packages, which I applaud. I'm also just trying to remind everyone that the current state is painfully awkward. Imagine new users who simply want to install BioPerl for a quick tool or as a prerequisite (Bio::DB::HTS for example) and then face an interminable cpanm installation with 81 distributions installed (!), only to fail because some external dependency isn't available (like GD or XML). That's easily solvable by the likes of us, but not all.

I don't see any easy solution other than to continue splitting. At some point we will reach a point where it's easy to install again.

cjfields commented 5 years ago

@tjparnell let us know if you want to help out, the more people involved the better. Particularly when we have two working on this whenever we can spare free time (which is increasingly constrained, at least for me)

carandraug commented 5 years ago

[...] then face an interminable cpanm installation with 81 distributions installed[...]

It's 81 modules and pragmas, not distributions. And the list includes things like strict, utf8, and File::Spec, which hardly count.

Also, this split has removed dependency on ~60 modules which are far more difficult to install. See https://metacpan.org/source/CDRAUG/BioPerl-1.7.5/Changes#L190 . And the split also weakened the dependency on some of the remaining dependencies. You're right the current situation is not ideal, but there's good that came out of it and shows the way to solve the original problem.

cjfields commented 5 years ago

Correct. The list includes quite a few modules that are in the perl core distribution. And it's not a good idea to leave those out: the perl core is also evolving (albeit much more slowly), e.g. the recent removal of CGI.

On May 13, 2019, at 5:14 PM, Carnë Draug wrote:

[...] then face an interminable cpanm installation with 81 distributions installed[...]

It's 81 modules and pragmas, not distributions. And the list includes things like strict, utf8, and File::Spec, which hardly count.

Also, this split has removed dependency on ~60 modules which are far more difficult to install. See https://metacpan.org/source/CDRAUG/BioPerl-1.7.5/Changes#L190 . And the split also weakened the dependency on some of the remaining dependencies. You're right the current situation is not ideal, but there's good that came out of it and shows the way to solve the original problem.


tjparnell commented 5 years ago

Not to belabor the point, but just for clarification, I only pulled out "81 distributions" as an example from a recent test installation of BioPerl-1.7.5 on a fresh, clean install of perl-5.28.1.

tim$ perlbrew use perl-5.28.1@bptest
tim$ cpanm BioPerl
...
! Installing the dependencies failed: Module 'XML::LibXML::Reader' is not installed, Module 'XML::LibXML' is not installed, Module 'XML::DOM::XPath' is not installed, Module 'GD' is not installed
! Bailing out the installation for BioPerl-1.7.5.
81 distributions installed

cjfields commented 5 years ago

Ah, that clarifies quite a bit: these are from the installation itself. It might be worth seeing how many additional downstream dependencies get pulled in from the current dependency list, just to identify the worst offenders. I think one instance was pulling in Moose.

carandraug commented 5 years ago

I made GD a suggestion instead of a dependency. It was only used in bp_chaos_plot. The list of prereqs that are only used in the scripts can be checked with:

$ comm -23 <(scan-perl-prereqs  bin/* | sort) <(scan-perl-prereqs lib/ | sort)
Bio::Align::Utilities
Bio::DB::Ace
Bio::DB::EMBL
Bio::DB::Fasta
Bio::DB::GenBank
Bio::DB::GenPept
Bio::DB::Registry
Bio::Index::EMBL
Bio::Index::Fasta
Bio::Index::GenBank
Bio::Index::SwissPfam
Bio::Index::Swissprot
Bio::SearchIO::FastHitEventBuilder
Bio::SeqFeature::Tools::TypeMapper
Bio::Tree::Compatible
GD
Getopt::Long
Math::BigFloat
Pod::Usage
XML::Twig
YAML
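
As a small illustration of why `comm -23` gives the "scripts only" list: it takes two sorted inputs and prints only the lines unique to the first. The sketch below uses a made-up handful of module names rather than real `scan-perl-prereqs` output, and writes the lists to temporary files instead of using process substitution so that it also runs in a plain POSIX shell.

```shell
# Toy stand-ins for the two prerequisite lists (illustrative names only,
# not real scan-perl-prereqs output).
tmpdir=$(mktemp -d)

# comm requires both inputs sorted in the same collation order,
# so sort and comm are run under the same locale.
printf '%s\n' GD Getopt::Long Bio::Seq strict | LC_ALL=C sort > "$tmpdir/scripts.txt"
printf '%s\n' Bio::Seq strict File::Spec      | LC_ALL=C sort > "$tmpdir/lib.txt"

# -2 drops lines found only in the second file, -3 drops lines found in both,
# leaving the lines that appear only in the first file.
only_in_scripts=$(LC_ALL=C comm -23 "$tmpdir/scripts.txt" "$tmpdir/lib.txt")
printf '%s\n' "$only_in_scripts"
# prints:
# GD
# Getopt::Long

rm -rf "$tmpdir"
```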

I'm not a big fan of this solution (listing requirements as suggestions) but it's a bit better if we limit it to scripts. I'd still prefer another solution. Here's an excerpt from the IRC conversation:

pyrimidine: I think we move that one to examples. I'd go a bit further and recommend any of the bin 'scripts' introducing additional non-BioPerl dependencies (apart from those in the perl core) be moved to examples as well. GD is a real pain to install. XS bindings and all (well, pain for newbies)

carandraug: I don't think that addition of extra dependencies should be the reason to demote a script to example. The problem you're trying to solve is bioperl being difficult to install. We didn't demote the modules with tricky dependencies to examples so we shouldn't do it to scripts either

carandraug: make a separate distribution with scripts only, that makes sense to me. Or we simply stop including scripts in bioperl altogether and instead have some other platform for people to contribute their scripts. To be honest, I wonder how many people actually make use of the scripts

ETaSky commented 1 year ago

@carandraug and @cjfields, thanks for the replies. I am well aware of the intention to break up the monolithic package into smaller, more manageable packages, which I applaud. I'm also just trying to remind everyone that the current state is painfully awkward. Imagine new users who simply want to install BioPerl for a quick tool or as a prerequisite (Bio::DB::HTS for example) and then face an interminable cpanm installation with 81 distributions installed (!), only to fail because some external dependency isn't available (like GD or XML). That's easily solvable by the likes of us, but not all.

I don't see any easy solution other than to continue splitting. At some point we will reach a point where it's easy to install again.

Have to upvote this! I am one of those "new users who simply want to install BioPerl for a quick tool or as a prerequisite". I am struggling to install BioPerl for PSORTb, which is a prerequisite for another program, MetaWibele.

The problem is in cpan: when I type install BioPerl, after many screens of output, the error messages are something like:

4 dependencies missing (XML::DOM,DB_File,XML::Twig,XML::Parser::PerlSAX); additionally test harness failed

And, I don't know where to go from here because install XML::Twig and install XML::Parser and install XML::DOM also failed.

zmughal commented 1 year ago

And, I don't know where to go from here because install XML::Twig and install XML::Parser and install XML::DOM also failed.

@ETaSky, which system are you on? You may be able to install BioPerl via your package manager. If you still want to install via CPAN, then you will at least need to have the Expat library development headers for the above XML libraries.

On Debian, we see:

$ apt-cache show libxml-twig-perl libxml-parser-perl libxml-dom-perl  | grep Depends | grep -oP 'lib[^\s,]+'  | grep -vP -- '-perl$'
libc6
libexpat1

which means that you would need to install libexpat1-dev to get the development headers.
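
A quick way to check whether the headers are already present before retrying the CPAN install might look like the sketch below. The helper name `has_expat_headers` and the two include directories are assumptions made for illustration; other systems may keep `expat.h` elsewhere.

```shell
# Hypothetical helper: succeeds if expat.h exists in any of the given directories.
has_expat_headers() {
    for dir in "$@"; do
        [ -f "$dir/expat.h" ] && return 0
    done
    return 1
}

# The directories below are common defaults, not an exhaustive list.
if has_expat_headers /usr/include /usr/local/include; then
    echo "expat headers found; retry the XML::Parser / XML::Twig / XML::DOM install"
else
    echo "expat headers missing; on Debian/Ubuntu try: sudo apt-get install libexpat1-dev"
fi
```

Once the headers are in place, rerunning the failed module installs (and then BioPerl itself) should at least get past the Expat-related build errors.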