scijava / pom-scijava

Friendly base POM for all SciJava-based software
https://imagej.net/BOM
The Unlicense
25 stars 34 forks source link

Update JHDF5 library to 19.04.0 #181

Closed ctrueden closed 2 years ago

ctrueden commented 3 years ago

Right now we are managing the CISD JHDF5 library at 14.12.6. We would like to upgrade to 19.04.0, to take advantage of new capabilities in HDF5. However, there are two obstacles:

  1. The H5D class was retired in favor of the H5 class from upstream HDF Group's HDF5 Java bindings. Usages of H5D in consumer libraries will need to be updated to their H5-based equivalents.

  2. The JHDF5 library has been deployed with two different groupIds, cisd and ch.systems.cisd, with the SciJava community later agreeing to unify around cisd and stop using the ch.systems.cisd groupId (#87). But some components are still using ch.systems.cisd; fortunately, updating them to 19.04.0 will naturally fix the groupId because we have not deployed 19.04.0 to maven.scijava.org under the old ch.systems.cisd groupId. But we still need to be careful to upgrade everything across the board at once, and/or exclude the coordinate with old groupId from dependency trees in cases where it leaks in.

Here is a table of which components on my radar use JHDF5 in its various incarnations:

ch.systems.cisd:jhdf5

cisd:jhdf5

org.scala-saddle:jhdf5_2.10

The checkboxes here can be checked when a particular component has migrated to JHDF5 19.04.0. And once all checkboxes are checked (except vcell-bioformats, which is probably out of scope here), this issue can be closed.

See also this and that and now also that chat thread.

CC @mkitti

tischi commented 3 years ago

@ctrueden thanks a lot for announcing this! I also have a couple of repos that are using hdf5:

Sorry, I did not get exactly what I need to do? Update/add a dependency in the pom and then change the code accordingly?

mkitti commented 3 years ago

The summary is that as of HDF5 1.10, HDF5 Group now have the Java bindings (JHI5) in the main repository. JHDF5 now builds off those bindings.

HDF5 Group Java HDF5 Interface (JHI5): https://portal.hdfgroup.org/display/HDF5/HDF5+Java+Documentation JavaDoc for hdf.hdf5lib package: (note the lack of ncsa, etc.) https://bitbucket.hdfgroup.org/pages/HDFFV/hdf5doc/master/browse/html/javadoc/index.html?overview-summary.html

Current JHDF5 javadocs and source from ETHZ SIS: https://svnsis.ethz.ch/doc/hdf5/hdf5-19.04/ https://sissource.ethz.ch/sispub/jhdf5/-/tree/19.04.0/source/java/ch/systemsx/cisd/hdf5

Note that the hdf.hdf5lib.exceptions subpackage is documented with JHDF5 and not JHI5 despite the code originally being part of JHI5.

The previous API contained in ch.systemsx.cisd.hdf5.hdf5lib no longer exists. Code referring to classes such as ch.systemsx.cisd.hdf5.hdf5lib.H5D should use hdf.hdf5lib.H5 for low-level calls or higher level interfaces in ch.systemsx.cisd.hdf5.

Edited to clarify hdf.hdf5lib.exceptions origin.

imagesc-bot commented 3 years ago

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/hdf5-common-format-for-ilastik-bdv-and-imaris/57860/13

ctrueden commented 2 years ago

@tischi wrote:

Sorry, I did not get exactly what I need to do? Update/add a dependency in the pom and then change the code accordingly?

The first thing to do would be to make sure you are depending on:

<properties>
  <cisd.jhdf5.version>19.04.0</cisd.jhdf5.version>
</properties>
<dependency>
  <groupId>cisd</groupId>
  <artifactId>jhdf5</artifactId>
  <version>${cisd.jhdf5.version}</version>
</dependency>

and not any other version of JHDF5. Once you've made that switch, you are likely to have compile errors around the no-longer-present H5D utility class. I defer to @mkitti's advice above on how best to fix these errors. Perhaps people can comment here with specific broken things and how they fix them, to help each other?

tischi commented 2 years ago

Some experiences:

Here are the changes that I made to imaris-writer.

tischi commented 2 years ago

Probably as expected I am not yet able to actually run the tests in imaris-writer, because it depends on bigdataviewer-core, which is not yet compatible with the new jdhf5.

Screenshot of the HDF5AccessHack class in bigdataviewer-core-10.2.0 from within IntelliJ:

image

ping @tpietzsch

ctrueden commented 2 years ago

CCing folks involved with a repository linked above: @axtimwalde @bogovicj @StephanPreibisch @maarzt @rejsmont @k-dominik @m-novikov @sbesson @dgault @joshmoore. We're trying to migrate the SciJava component collection to a newer HDF5 library. Help, comments, discussion, etc., very welcome.

sbesson commented 2 years ago

Thanks for starting the discussion @ctrueden. Several of us are currently on leave but we'll discuss this next week and get back to you.

As a preliminary outcome, a naive bump of the cisd:jhdf5:14.12.6 library used in Bio-Formats to cisd:jhdf5:19.04.0 results in compilation issues of type

[ERROR] /opt/ome/bioformats/components/formats-bsd/src/loci/formats/in/BDVReader.java:[34,37] package ch.systemsx.cisd.base.mdarray does not exist

Looking briefly at the Javadoc and the source code links pasted by @mkitti above, it looks like this package is indeed no longer shipped as part of the library. Like above, this API might have been retired in favor of the equivalent API in the upstream Java bindings but I have not investigated further.

mkitti commented 2 years ago

Like above, this API might have been retired in favor of the equivalent API in the upstream Java bindings but I have not investigated further.

ch.systemsx.cisd.base.mdarray was never part of JHDF5 as far as I can tell, but may have been a dependency.

The current source for that package is here: https://sissource.ethz.ch/sispub/base/-/tree/master/source/java/ch/systemsx/cisd/base/mdarray

ctrueden commented 2 years ago

ch.systemsx.cisd.base.mdarray was never part of JHDF5 as far as I can tell, but may have been a dependency.

According to a classname search, it was embedded in cisd:jhdf5 through 14.12.6, and is no longer present in cisd:jhdf5:19.04.0, as @sbesson notes.

That same search shows that cisd:base:18.09.0 also contains that package. And the POM of cisd:jhdf5:19.04.0 depends on it.

What surprises me is that @sbesson's "naive bump" to cisd:jhdf5:19.04.0 would yield such compile errors, because cisd:base:18.09.0 should have been brought in also as a transitive dependency.

ctrueden commented 2 years ago

@sbesson I tried updating the JHDF5 version in ome/bioformats as well, and everything builds fine. Here is the patch I used:

diff --git a/components/formats-bsd/pom.xml b/components/formats-bsd/pom.xml
index f8be993e9b..7d10ac168f 100644
--- a/components/formats-bsd/pom.xml
+++ b/components/formats-bsd/pom.xml
@@ -73,6 +73,11 @@
      <artifactId>kryo</artifactId>
      <version>${kryo.version}</version>
     </dependency>
+    <dependency>
+      <groupId>commons-lang</groupId>
+      <artifactId>commons-lang</artifactId>
+      <version>2.6</version>
+    </dependency>
     <dependency>
       <groupId>org.perf4j</groupId>
       <artifactId>perf4j</artifactId>
@@ -100,7 +105,7 @@
     <dependency>
       <groupId>cisd</groupId>
       <artifactId>jhdf5</artifactId>
-      <version>14.12.6</version>
+      <version>19.04.0</version>
     </dependency>
     <dependency>
            <groupId>com.drewnoakes</groupId>

All the formats-bsd tests also still pass (though they take 90 seconds to run).

The commons-lang dependency was being brought in transitively by the old jhdf5, but the new jhdf5 no longer depends on it... but it is an actual dependency of formats-bsd, so should be declared here regardless.

So early signs point to formats-bsd not struggling too much with this update! :+1:

sbesson commented 2 years ago

Thanks @ctrueden, great to see that you managed to compile. The fact that cisd:base:18.09.0 would show up as a dependency was also my expectation.

Looking again, I think there is issue come from the cisd:jhdf5 POM that got downloaded locally. A few years ago, we added a repository in the OME Artifactory mirroring the subset of https://maven.scijava.org/content/groups/public/ for the cisd group and this is the primary repository used when building Bio-Formats.

Looking at the content of my local POM, which was downloaded from this cache rather than upstream Maven Scijava:

(base) sbesson@ls30630:~ $ cat ~/.m2/repository/cisd/jhdf5/19.04.0/jhdf5-19.04.0.pom
<?xml version="1.0" encoding="UTF-8"?>
<project xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd" xmlns="http://maven.apache.org/POM/4.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <modelVersion>4.0.0</modelVersion>
  <groupId>cisd</groupId>
  <artifactId>jhdf5</artifactId>
  <version>19.04.0</version>
  <description>POM was created by Sonatype Nexus</description>
</project>

differs from the https://maven.scijava.org/content/groups/public/cisd/jhdf5/19.04.0/jhdf5-19.04.0.pom which declares the dependency.

Comparing the timestamps of https://maven.scijava.org/content/groups/public/cisd/jhdf5/19.04.0/ and adds to my confusion. While most of the components were indeed replicated the next day, the POM and others seemed to have been uploaded on Sep 2020. I suspect we will clear and regenerate this cache on our side but I'll look into the logs to figure out what happens.

ctrueden commented 2 years ago

I suspect we will clear and regenerate this cache on our side

:+1:

I'll look into the logs to figure out what happens.

If I had to hazard a guess: perhaps when I first uploaded 19.04.0 I did it directly without a custom POM, then realized it would need to have cisd:base a dependency and deleted and reuploaded it, and your cache was generated during the intervening time. I do not specifically recall that happening, but it seems like the sort of thing I would do.

If you are concerned about more general discrepancies beyond only these components, you could write a script to compare the md5sums across all of your cached components, just to be safe.

sbesson commented 2 years ago

Quick update on this front: the PR mentioned in https://github.com/scijava/pom-scijava/issues/181#ref-pullrequest-1052059999 and bumping the jhdf5 dependency has been merged in the Bio-Formats development line and should be included in the upcoming 6.8.0 release.

mkitti commented 2 years ago

Because jhdf5 is excluded here, I'm not sure if there is anything to do in multiview-reconstruction https://github.com/PreibischLab/multiview-reconstruction/blob/6c6aac841ccab8932810463e87817690e64f89bd/pom.xml#L192-L200

mkitti commented 2 years ago

Packages which do not use JHDF5 directly, ch.systems.cisd:jhdf5

The following repositories do not appear to use jhdf5 directly. Rather their pom.xml files have exclusions:

        <dependency>
            <groupId>ome</groupId>
            <artifactId>formats-gpl</artifactId>
            <scope>runtime</scope>
            <exclusions>
                <exclusion>
                    <groupId>ch.systems.cisd</groupId>
                    <artifactId>jhdf5</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

register_virtual_stack_slices may need some attention though:

Pending pull requests, cisd:jhdf5

mkitti commented 2 years ago

@ctrueden @sbesson are the exclusions for ch.systems.cisd: jhdf5 obsolete? Should we remove all of them? xref: https://github.com/scijava/pom-scijava/issues/87

sbesson commented 2 years ago

@mkitti yes same comment as https://github.com/fiji/register_virtual_stack_slices/issues/13#issuecomment-1027317090, I believe this exclusion is obsolete and can be safely removed. Thanks for driving this!

tpietzsch commented 2 years ago

@ctrueden which version property should we use? You mentioned in a snippet above cisd.jhdf5.version, but pom-scijava manages

    <dependency>
        <groupId>cisd</groupId>
        <artifactId>jhdf5</artifactId>
        <version>${jhdf5.version}</version>
    </dependency>

Should we just override the jhdf5.version property, or is there some reason to use the

<properties>
  <cisd.jhdf5.version>19.04.0</cisd.jhdf5.version>
</properties>
<dependency>
  <groupId>cisd</groupId>
  <artifactId>jhdf5</artifactId>
  <version>${cisd.jhdf5.version}</version>
</dependency>

construct?

ctrueden commented 2 years ago

which version property should we use?

TL;DR: Either one should be fine! Overriding the short-name property will keep the two values aligned, though.

Background: To avoid future name clashes, pom-scijava has switched over to fully qualified version properties for all version management. However, it also defines a short style property as well. For example:

<ui-behaviour.version>2.0.5</ui-behaviour.version>
<org.scijava.ui-behaviour.version>${ui-behaviour.version}</org.scijava.ui-behaviour.version>

And then the actual dependency block uses the fully qualified property.

JHDF5 is no different; this is how it's set up in pom-scijava now.

Caveat emptor: As far as overriding it in downstream POMs goes, it does not matter which one you use. But either way, it is important that all version property overrides in specific POMs be transient. Ideally, a component release should not contain any version property overrides, although I appreciate this this is not always possible. (If a release has a component override to a newer version than its associated BOM, then that newer version will NOT be brought in when depending on that component from another pom-scijava-inheriting project, unfortunately, because the dependency's properties are not inherited—properties only inherit from your parent POM.)

mkitti commented 2 years ago

Probably as expected I am not yet able to actually run the tests in imaris-writer, because it depends on bigdataviewer-core, which is not yet compatible with the new jdhf5.

@tischi BDV-core has now been updated: https://github.com/bigdataviewer/bigdataviewer-core/pull/131

Perhaps your tests on imaris-writer may work now?

tischi commented 2 years ago

Perhaps your tests on imaris-writer may work now?

Thanks, yes, using <bigdataviewer-core.version>10.3.1</bigdataviewer-core.version> works. I guess I will remove this however again from the pom and wait for the corresponding scijava parent pom release?

mkitti commented 2 years ago

@tischi , if your code works either way, then I think that makes sense. If it only works with the newer JHDF5 then, perhaps it is good to keep it for now until the parent pom is updated?

@ctrueden , any thoughts on what to do in the transition?

ctrueden commented 2 years ago

As of 2f7d665290f938d43320c37be411247877a4268a, the entire SciJava component collection is free of ch.systems.cisd:jhdf5 dependencies, and the version of cisd:jhdf5 is now 19.04.0. This gets us much of the way toward the goal of this issue. In particular, it means a new pom-scijava can be released that manages jhdf5 at 19.04.0. Downstream non-SciJava-managed components can then update to the new parent, and fix any JHDF5-related issues as described above, in order to complete their upgrade to 19.04.0.

imagesc-bot commented 2 years ago

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/new-bigdataviewer-version-10-4-1/68818/4

imagesc-bot commented 2 years ago

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/length-is-too-large-error-in-imagej/62952/5

imagesc-bot commented 2 years ago

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/timeline-for-the-next-fiji-update/69640/3

ctrueden commented 2 years ago

See also fiji/HDF5_Vibez#18

ctrueden commented 2 years ago

Today I released pom-scijava 32.0.0, with a harmonious BOM across all JHDF5-related components. We are now on jhdf5 19.04.1.

Any components above which are not managed by pom-scijava are on their own to follow suit. If you want to be part of the update process in future, you can file a PR adding version management of your components to pom-scijava, and your components will become part of the "mega-melt" BOM harmony testing.