CompEvol / beast2

Bayesian Evolutionary Analysis by Sampling Trees
www.beast2.org
GNU Lesser General Public License v2.1

Automatic detection of package directories gives false positives #95

Closed tgvaughan closed 9 years ago

tgvaughan commented 10 years ago

The method AddOnManager.getBeastDirectories() returns a list of directories that "may contain packages". All jar files within the lib/ subdirectory of these directories are loaded by AddOnManager.loadExternalJars(). If a jar file contains a class that is already in the class path, that jar file will not be loaded.
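To make the consequence of that last rule concrete, here is a minimal sketch of an all-or-nothing jar loading policy. It is not the AddOnManager code: the class name JarShadowingSketch, the method names and the example entry names are all hypothetical, and jars are modelled as plain lists of entry names rather than opened from disk.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not taken from the BEAST 2 source.
public class JarShadowingSketch {

    // Converts a jar entry name such as "a/b/C.class" to the class name "a.b.C".
    static String toClassName(String entryName) {
        return entryName
                .substring(0, entryName.length() - ".class".length())
                .replace('/', '.');
    }

    // Registers every class of the jar and returns true, unless any one of
    // its classes is already known, in which case the whole jar is skipped.
    static boolean loadIfNoCollision(List<String> jarEntries, Set<String> knownClasses) {
        Set<String> classes = new HashSet<>();
        for (String entry : jarEntries) {
            if (entry.endsWith(".class")) {
                classes.add(toClassName(entry));
            }
        }
        for (String className : classes) {
            if (knownClasses.contains(className)) {
                return false; // one clash is enough to reject the jar
            }
        }
        knownClasses.addAll(classes);
        return true;
    }

    public static void main(String[] args) {
        Set<String> known = new HashSet<>();
        // A stray jar found first, e.g. under some unrelated lib/ directory.
        List<String> strayJar = Arrays.asList("org/example/math/Matrix.class");
        // An installed package jar that bundles the same class plus its own.
        List<String> packageJar = Arrays.asList(
                "org/example/math/Matrix.class", "mypackage/Model.class");

        System.out.println(loadIfNoCollision(strayJar, known));   // prints "true"
        System.out.println(loadIfNoCollision(packageJar, known)); // prints "false"
    }
}
```

Under this policy the order in which directories are scanned decides which jar wins, so a single shared class is enough to keep an entire package jar from loading.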

One of the criteria for a directory to be included in the list returned by getBeastDirectories() is that the directory is a subdirectory of the current directory (or whatever System.getProperty("user.dir") returns on your system) and that it contains a "/lib" or "/templates" directory.

I contend that this is not a strict enough criterion, as /lib in particular is a very common directory name on *nix systems. As the "user.dir" subdirectories are included before the official package directories (e.g. $HOME/.beast/2.1/), jars detected here may prevent installed package jars that contain at least one identical class from being loaded. This could cause beast to behave in surprising ways if, for instance, an installed package used a different version of a library than the one picked up from a directory such as "user.dir"/BLAH/lib.
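The criterion described above can be sketched roughly as follows. This is an approximation written for illustration, not the actual getBeastDirectories() code; the class PackageDirHeuristic and the method candidatePackageDirs are hypothetical names.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the BEAST 2 source.
public class PackageDirHeuristic {

    // Returns every direct subdirectory of baseDir that contains a
    // "lib" or "templates" directory, whatever else it may contain.
    static List<File> candidatePackageDirs(File baseDir) {
        List<File> candidates = new ArrayList<>();
        File[] children = baseDir.listFiles(File::isDirectory);
        if (children == null) {
            return candidates; // baseDir is missing or unreadable
        }
        for (File child : children) {
            if (new File(child, "lib").isDirectory()
                    || new File(child, "templates").isDirectory()) {
                candidates.add(child);
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        File base = new File(System.getProperty("user.dir"));
        for (File dir : candidatePackageDirs(base)) {
            System.out.println("Would be treated as a package: " + dir);
        }
    }
}
```

Running this from a home directory would probably list any project or application directory that happens to have a lib or templates folder, which is the false-positive problem this issue is about.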

An alternative might be to look for (and ideally parse) "version.xml".
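A minimal sketch of such a check is given below, assuming only that version.xml sits at the top level of a package directory and is well-formed XML. The layout of the file is not described here, so the sketch does not inspect any element or attribute names; the class VersionXmlCheck and the method hasParsableVersionXml are hypothetical.

```java
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical illustration only; not taken from the BEAST 2 source.
public class VersionXmlCheck {

    // Returns true only if dir contains a file named version.xml
    // that can be parsed as well-formed XML.
    static boolean hasParsableVersionXml(File dir) {
        File versionFile = new File(dir, "version.xml");
        if (!versionFile.isFile()) {
            return false;
        }
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(versionFile);
            return true;
        } catch (Exception e) {
            // Unreadable or malformed XML: do not treat dir as a package.
            return false;
        }
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println(dir + " looks like a package: " + hasParsableVersionXml(dir));
    }
}
```

A stricter check could also validate the content of the file, but that would require knowing its expected structure.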

The related question of how to handle real packages using different versions of the same library would still exist though.

rbouckaert commented 10 years ago

The aim is to pick up all jars from locations that are convenient both in production and for developers.

Checking for the existence of version.xml would ensure it only picks up things from packages that have a version.xml file, although there is no version.xml for BEAST itself (yet).

Perhaps getting rid of the user.dir directory would be sufficiently convenient.

tgvaughan commented 9 years ago

This is still a problem. If I run a beast tool (e.g. appstore, beauti, beast) from my home directory, the tool reads in all libraries it finds in the ~/code/lib directory. This can cause (has caused!) hard-to-diagnose problems when one of these libraries is a different version to a library already part of a beast package.

Another example is the new "Set working dir" submenu in BEAUti's file menu. This new menu is awesome, but the first "package" it lists for me is ".lyx". This shows up because I have a program installed, LyX, which stores user-specific configuration/caching stuff in ~/.lyx. I don't know exactly which is the culprit, but ~/.lyx contains a number of subdirectories, including ~/.lyx/templates and ~/.lyx/examples.

I really don't think this will just be a Linux issue btw.

Surely there's a more robust way of picking up packages? If acquiring the BEAST core libraries is the only reason we don't just look for version.xml, isn't there another way to find where the core libraries are? Don't we "just know"?

If picking up libraries during package development is an issue, it's probably better solved through the use of user-specified properties rather than by making beast crawl all over the file system picking up libraries from anything that looks like it could be a beast package.

tgvaughan commented 9 years ago

Not that looking for version.xml is all that robust either. I'd prefer explicitly telling beast that packages are only to be found in the user or system package directories, and using a property to specify alternate locations. (One such property already exists: beast.user.package.dir, which can be used to provide an alternative user package directory.)
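As a rough sketch of that policy: the search list contains exactly one user directory and one system directory, and only a property can change the user one. The property name beast.user.package.dir and the $HOME/.beast/2.1 default come from this thread; the class ExplicitPackageDirs, the method packageDirs and the system path used in main are hypothetical placeholders.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the BEAST 2 source.
public class ExplicitPackageDirs {

    // Builds the complete list of directories to search for packages.
    // Nothing is discovered by scanning; the override replaces the default.
    static List<File> packageDirs(String userDirOverride, File defaultUserDir, File systemDir) {
        List<File> dirs = new ArrayList<>();
        if (userDirOverride != null && !userDirOverride.isEmpty()) {
            dirs.add(new File(userDirOverride));
        } else {
            dirs.add(defaultUserDir);
        }
        dirs.add(systemDir);
        return dirs;
    }

    public static void main(String[] args) {
        // Property name as mentioned in this thread.
        String override = System.getProperty("beast.user.package.dir");
        // Default user location as mentioned in this thread.
        File defaultUserDir = new File(System.getProperty("user.home"), ".beast/2.1");
        // Placeholder only: the real system package location is an assumption here.
        File systemDir = new File("/path/to/system/packages");

        for (File dir : packageDirs(override, defaultUserDir, systemDir)) {
            System.out.println("Searching for packages in: " + dir);
        }
    }
}
```

With this design, a developer who wants packages picked up from a working copy has to say so explicitly, and a directory that merely looks like a package is never consulted.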

alexeid commented 9 years ago

I agree. Packages should be in one or two well-defined locations. Fossicking around for packages all over the file system is a bad idea.

alexeid commented 9 years ago

agree.

rbouckaert commented 9 years ago

Sounds good to me.

mmatschiner commented 7 years ago

Just to let you know, I actually liked the detection of package directories before v.2.3.1. The reason is that I tend to run BEAST2 on the command line in "autonomous" directories in which I place the XML, the beast.jar file, the respective package directories, and a start script. That way I can simply copy the directory to any server and I don't have to ensure that the packages or even BEAST2 are properly installed on that server. The IT staff running the server would probably get annoyed quickly if I asked them to install updates whenever I want to try a new version, and I would get annoyed waiting for them to do it. I solved this for myself by compiling BEAST2 on my machine after adding the line dirs.add(System.getProperty("user.dir")); to beast2/src/beast/util/AddOnManager.java, where it used to be.

tgvaughan commented 7 years ago

Hi Michael, sorry to hear that it's proven difficult. The problem with the old system was that it resulted in some very unexpected behaviour that depended on local directory layouts.

You actually can specify a custom location for packages by setting the beast.user.package.dir property. This should solve your problem without you needing to compile a custom version. We definitely need to publicize this option more (or make it an actual beast command-line argument).

alexeid commented 7 years ago

Shall we log an issue to create this as a documented command line option?

mmatschiner commented 7 years ago

Great that the option exists, but honestly I'm still not sure how to specify it. In the XML? Or on the command line? In any case it would be great to have it as a command-line argument that is explained when using "java -jar beast.jar -help".

mmatschiner commented 7 years ago

Ok, I found out that it's "java -jar -Dbeast.user.package.dir=DIRECTORY beast.jar". Great! And thanks for the quick response. But if other users run into the same issue, the explained command-line argument would surely be helpful for them.

tgvaughan commented 7 years ago

Hi Michael and Alexei, yeah - having this as a command line option makes sense.

As an aside, it would be really helpful to have an actual official BEAST 2 manual. I can't believe I only just thought of this. It summarizes the vague feeling of discontent I've always had with the documentation: there's heaps of it, and it's nicely written, but it's scattered across multiple blog posts and web pages so it's very difficult to navigate. Having a single source for the documentation of the core aspects of BEAST would be awesome. (The book is brilliant, but IMO the value there is all of the biological and mathematical context it contains. What I'm talking about is a much shorter technical document/webpage organized according to subject matter.)

alexeid commented 7 years ago

Why don't you start a GitHub repository for the manual, so that Remco, I and others can contribute? :)

SimonGreenhill commented 7 years ago

Jumping in here. Very much like the manual idea. I'm always sending people to one of the beast2 website pages, but these seem to disappear sometimes, or the formatting goes wonky.

It would be great to have a central place for the various tutorials, FAQs and things centered around simple 'how do I...' or 'what is the best way to do...' and make that user-contributable via github.

A good guide to "Fix an analysis which won't start because of -Inf" would by itself save a hundred emails a month on the beast-users mailing list.

alexeid commented 7 years ago

We should probably use a Markdown workflow so that we can publish to web or PDF just as easily?

SimonGreenhill commented 7 years ago

Makes sense to use markdown as it's directly viewable on github so you don't even need to publish it, just point people to the repository. And markdown is easy to write and transform into other things (e.g. with Pandoc).

tgvaughan commented 7 years ago

Agree markdown is nice. @SimonGreenhill have you seen @laduplessis' site http://taming-the-beast.github.io? The goal there is to have a set of curated core tutorials and a system allowing for easy third-party contributions. The majority of the existing tutorials are written using MD.

tgvaughan commented 7 years ago

Please continue this discussion over at issue #668.