galaxyproject / tools-devteam

Contains a set of Galaxy Tools mostly written by the Galaxy Team.

Consistency with Rpy 1 and NumPy packages and tools, e.g. linear_regression #415

Open blankenberg opened 7 years ago

blankenberg commented 7 years ago

These tools and their dependency chains are broken, but they will work if you have NumPy 1.7 available locally. A detailed analysis follows.

Rpy requires NumPy, but the current version of rpy 1.0.3 in the toolshed (https://toolshed.g2.bx.psu.edu/repos/devteam/package_rpy_1_0_3/file/82170c94ca7c/tool_dependencies.xml https://toolshed.g2.bx.psu.edu/view/devteam/package_rpy_1_0_3/82170c94ca7c) does not define a repository dependency on NumPy.

However, the current version in github, https://github.com/galaxyproject/tools-devteam/blob/fbc219afb27b57b7a7aabbf2936439273753664e/packages/package_rpy_1_0_3/tool_dependencies.xml, has a dependency on numpy 1.9, added in fbc219afb27b57b7a7aabbf2936439273753664e.

But I don't think we can or should 'just update' this package in the toolshed, because of the inconsistencies that show up when digging into the tools that use it.

The linear_regression tool (https://github.com/galaxyproject/tools-devteam/tree/master/tools/linear_regression https://toolshed.g2.bx.psu.edu/view/devteam/linear_regression/cf431604ec3e) has dependencies on rpy 1.0.3, R 2.11.0, and numpy 1.7.

This causes an error with linear_regression (and possibly other tools) when Rpy is built against NumPy 1.9 but the tool is given NumPy 1.7 at runtime:

RuntimeError: module compiled against API version 9 but this version of numpy is 7

I was able to hack the Rpy 1.0.3 package locally to get these tools to work, with this:

shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/package_rpy_1_0_3/82170c94ca7c/package_rpy_1_0_3$ hg diff
diff -r 82170c94ca7c tool_dependencies.xml
--- a/tool_dependencies.xml Tue Apr 01 10:47:51 2014 -0400
+++ b/tool_dependencies.xml Tue Nov 22 10:35:19 2016 -0500
@@ -3,6 +3,9 @@
   <package name="R" version="2.11.0">
       <repository changeset_revision="6e1b17857732" name="package_r_2_11_0" owner="devteam" prior_installation_required="True" toolshed="http://toolshed.g2.bx.psu.edu" />
     </package>
+  <package name="numpy" version="1.7.1">
+      <repository changeset_revision="0c288abd2a1e" name="package_numpy_1_7" owner="devteam" prior_installation_required="True" toolshed="http://toolshed.g2.bx.psu.edu" />
+    </package>
     <package name="rpy" version="1.0.3">
       <install version="1.0">
           <actions>
@@ -15,6 +18,9 @@
                   <repository changeset_revision="6e1b17857732" name="package_r_2_11_0" owner="devteam" prior_installation_required="True" toolshed="http://toolshed.g2.bx.psu.edu">
                       <package name="R" version="2.11.0" />
                     </repository>
+                  <repository changeset_revision="0c288abd2a1e" name="package_numpy_1_7" owner="devteam" prior_installation_required="True" toolshed="http://toolshed.g2.bx.psu.edu">
+                      <package name="numpy" version="1.7.1" />
+                    </repository>
                 </action>
                 <action type="make_directory">$INSTALL_DIR/lib/python</action>
                 <action type="shell_command">

and then uninstalling/reinstalling rpy 1.0.3.

This was the path of least resistance given the current state of the packages in the toolshed (rpy not requiring numpy).

However, the correct fix is most likely to have rpy require numpy (as in the current github-only copy; which numpy version is up for debate), and simultaneously to update any tools using rpy so that they no longer manually require their own version of numpy. Those tools would then get the numpy version inherited through the rpy dependency chain.
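
As a rough way to see which tools would be touched by that change, here is a minimal sketch in Python that walks a local checkout and reports every tool_dependencies.xml declaring both an rpy and a numpy package. The function names and the checkout path are hypothetical, and the sketch assumes that each file lists its packages as direct `<package name="..." version="...">` children of the root element, as in the diff above. It is not an existing Galaxy or planemo command.

```python
import os
import xml.etree.ElementTree as ET


def declared_packages(xml_path):
    """Return (name, version) pairs for the <package> children of the root."""
    root = ET.parse(xml_path).getroot()
    return {(p.get("name"), p.get("version") or "") for p in root.findall("package")}


def find_rpy_numpy_declarations(checkout_dir):
    """Yield (path, numpy_versions) for files that declare both rpy and numpy.

    checkout_dir is a hypothetical path to a local clone of the repository.
    """
    for dirpath, _dirnames, filenames in os.walk(checkout_dir):
        if "tool_dependencies.xml" not in filenames:
            continue
        path = os.path.join(dirpath, "tool_dependencies.xml")
        try:
            packages = declared_packages(path)
        except ET.ParseError:
            # Skip files that are not well-formed XML rather than aborting.
            continue
        names = {name for name, _version in packages}
        if "rpy" in names and "numpy" in names:
            numpy_versions = sorted(v for n, v in packages if n == "numpy")
            yield path, numpy_versions
```

Calling `list(find_rpy_numpy_declarations("tools-devteam"))` on a local clone would list the candidate files together with the numpy version each one pins, which gives a first estimate of how many tools pin numpy separately from rpy.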

blankenberg commented 7 years ago

During a chat, @martenson also pointed out that another good option is to move to conda only for any affected tools.

pvanheus commented 7 years ago

Is there an (easy) way to discover how many tools would need to be changed? Also how would @martenson's suggestion resolve this? Is the proposal to make linear_regression and other similar tools use conda for dependency resolution?

martenson commented 7 years ago

@pvanheus there is planemo shed_diff. Also, IUC has better maintainers and auto-uploads to the MTS once a PR's tests pass (which works great with conda-enabled tools).

lparsons commented 7 years ago

Any updates on this? I'm running into this issue.