sneumann / mzR

This is the git repository matching the Bioconductor package mzR: parser for netCDF, mzXML, mzData and mzML files (mass spectrometry data)
40 stars 26 forks source link

mzR installation error with Rtools 4.0 + R-testing #216

Closed hpages closed 4 years ago

hpages commented 4 years ago

Hi Steffen,

In addition to the usual devel builds, we've started to build a small subset of Bioconductor packages using Rtools40 and R-testing:

https://bioconductor.org/checkResults/3.11/bioc-testing-LATEST/

We're hoping that having these experimental builds will help us get ready for when R core decides to officially make the switch to Rtools40.

This is a heads-up that mzR fails to install with Rtools40 + R-testing.

Best, H.

sneumann commented 4 years ago

Hi, ok, let's try to keep track of that here. First superficial assessment is that linking fails with

undefined reference toH5::H5File::H5File()`

Often this was the case in the past if several HDF implementations were around, and headers picked from one, and libhdf.so from the other. We might be lucky and "only" need to fix order of include and lib paths, and not the C++ code itself. I'll have to find a Windows installation where to reproduce. Any help there appreciated. Yours, Steffen

hpages commented 4 years ago

It's interesting to compare with what rhdf5 does at the linking step (rhdf5 also contains C++ code).

rhdf5:

/mingw32/bin/g++ -std=gnu++11 -shared -s -static-libgcc -o rhdf5.dll tmp.def \
    H5.o H5A.o H5D.o H5E.o H5F.o H5G.o H5I.o H5L.o H5O.o H5P.o H5S.o H5T.o H5Z.o H5constants.o HandleList.o HandleListWrap.o h5dump.o h5ls.o h5testLock.o h5writeDataFrame.o printdatatype.o utils.o wrap.o \
    -LC:/Users/BIOCTE~1/BBS-3~1.11-/R/library/Rhdf5lib/lib/i386 -lhdf5_cpp -lhdf5 -lszip -lz -lpsapi \
    -LC:/extsoft/lib/i386 -LC:/extsoft/lib -LC:/Users/BIOCTE~1/BBS-3~1.11-/R/bin/i386 -lR

mzR:

/mingw32/bin/g++ -shared -s -static-libgcc -o mzR.dll tmp.def \
    cramp.o ramp_base64.o ramp.o RcppRamp.o RcppRampModule.o RcppPwiz.o RcppPwizModule.o RcppIdent.o RcppIdentModule.o ./boost/libs/system/src/error_code.o ./boost/libs/regex/src/posix_api.o ./boost/libs/regex/src/fileiter.o ./boost/libs/regex/src/regex_raw_buffer.o ./boost/libs/regex/src/cregex.o ./boost/libs/regex/src/regex_debug.o ./boost/libs/regex/src/instances.o ./boost/libs/regex/src/icu.o ./boost/libs/regex/src/usinstances.o ./boost/libs/regex/src/regex.o ./boost/libs/regex/src/wide_posix_api.o ./boost/libs/regex/src/regex_traits_defaults.o ./boost/libs/regex/src/winstances.o ./boost/libs/regex/src/wc_regex_traits.o ./boost/libs/regex/src/c_regex_traits.o ./boost/libs/regex/src/cpp_regex_traits.o ./boost/libs/regex/src/static_mutex.o ./boost/libs/regex/src/w32_regex_traits.o ./boost/libs/iostreams/src/zlib.o ./boost/libs/iostreams/src/file_descriptor.o ./boost/libs/filesystem/src/operations.o ./boost/libs/filesystem/src/path.o ./boost/libs/filesystem/src/utf8_codecvt_facet.o ./boost/libs/chrono/src/chrono.o ./boost/libs/chrono/src/process_cpu_clocks.o ./boost/libs/chrono/src/thread_clock.o ./pwiz/data/msdata/Version.o ./pwiz/data/identdata/Version.o ./pwiz/data/common/MemoryIndex.o ./pwiz/data/common/CVTranslator.o ./pwiz/data/common/cv.o ./pwiz/data/common/ParamTypes.o ./pwiz/data/common/BinaryIndexStream.o ./pwiz/data/common/diff_std.o ./pwiz/data/common/Unimod.o ./pwiz/data/msdata/mz5/Configuration_mz5.o ./pwiz/data/msdata/mz5/Connection_mz5.o ./pwiz/data/msdata/mz5/Datastructures_mz5.o ./pwiz/data/msdata/mz5/ReferenceRead_mz5.o ./pwiz/data/msdata/mz5/ReferenceWrite_mz5.o ./pwiz/data/msdata/mz5/Translator_mz5.o ./pwiz/data/msdata/SpectrumList_MGF.o ./pwiz/data/msdata/DefaultReaderList.o ./pwiz/data/msdata/ChromatogramList_mzML.o ./pwiz/data/msdata/ChromatogramList_mz5.o ./pwiz/data/msdata/examples.o ./pwiz/data/msdata/Serializer_mzML.o ./pwiz/data/msdata/Serializer_MSn.o ./pwiz/data/msdata/Reader.o ./pwiz/data/msdata/Serializer_mz5.o ./pwiz/data/msdata/Serializer_MGF.o ./pwiz/data/msdata/Serializer_mzXML.o ./pwiz/data/msdata/SpectrumList_mzML.o ./pwiz/data/msdata/SpectrumList_MSn.o ./pwiz/data/msdata/SpectrumList_mz5.o ./pwiz/data/msdata/BinaryDataEncoder.o ./pwiz/data/msdata/Diff.o ./pwiz/data/msdata/MSData.o ./pwiz/data/msdata/References.o ./pwiz/data/msdata/SpectrumList_mzXML.o ./pwiz/data/msdata/IO.o ./pwiz/data/msdata/SpectrumList_BTDX.o ./pwiz/data/msdata/SpectrumInfo.o ./pwiz/data/msdata/RAMPAdapter.o ./pwiz/data/msdata/LegacyAdapter.o ./pwiz/data/msdata/SpectrumIterator.o ./pwiz/data/msdata/MSDataFile.o ./pwiz/data/msdata/MSNumpress.o ./pwiz/data/msdata/SpectrumListCache.o ./pwiz/data/msdata/Index_mzML.o ./pwiz/data/msdata/SpectrumWorkerThreads.o ./pwiz/data/identdata/IdentDataFile.o ./pwiz/data/identdata/IdentData.o ./pwiz/data/identdata/DefaultReaderList.o ./pwiz/data/identdata/Reader.o ./pwiz/data/identdata/Serializer_protXML.o ./pwiz/data/identdata/Serializer_pepXML.o ./pwiz/data/identdata/Serializer_mzid.o ./pwiz/data/identdata/IO.o ./pwiz/data/identdata/References.o ./pwiz/data/identdata/MascotReader.o ./pwiz/data/proteome/Modification.o ./pwiz/data/proteome/Digestion.o ./pwiz/data/proteome/Peptide.o ./pwiz/data/proteome/AminoAcid.o ./pwiz/utility/minimxml/XMLWriter.o ./pwiz/utility/minimxml/SAXParser.o ./pwiz/utility/chemistry/Chemistry.o ./pwiz/utility/chemistry/ChemistryData.o ./pwiz/utility/chemistry/MZTolerance.o ./pwiz/utility/misc/IntegerSet.o ./pwiz/utility/misc/Base64.o ./pwiz/utility/misc/IterationListener.o ./pwiz/utility/misc/MSIHandler.o ./pwiz/utility/misc/Filesystem.o ./pwiz/utility/misc/TabReader.o ./pwiz/utility/misc/random_access_compressed_ifstream.o ./pwiz/utility/misc/SHA1.o ./pwiz/utility/misc/SHA1Calculator.o ./pwiz/utility/misc/sha1calc.o ./random_access_gzFile.o ./RcppExports.o ./boost/libs/filesystem/src/path_traits.o ./boost/libs/filesystem/src/windows_file_codecvt.o ./boost/libs/filesystem/src/codecvt_error_category.o ./boost/libs/thread/src/win32/tss_pe.o ./boost/libs/thread/src/win32/tss_dll.o ./boost/libs/thread/src/win32/thread.o ./pwiz/data/msdata/ramp/wglob.o ./boost_aux/boost/nowide/iostream.o rampR.o \
    -lpthread \
    -LC:/Users/BIOCTE~1/BBS-3~1.11-/R/library/Rhdf5lib/lib/i386 -lhdf5_cpp -lhdf5 -lszip -lz -lpsapi -lws2_32 -lz \
    -LC:/extsoft/lib/i386 -LC:/extsoft/lib -LC:/Users/BIOCTE~1/BBS-3~1.11-/R/bin/i386 -lR

There are a few interesting differences. One of them is that rhdf5 uses -std=gnu++11. Could this be related to the problem? I don't think that rhdf5 explicitly sets this flag somewhere so I wonder why mzR doesn't get that flag. My experience with C++ is very limited.

sneumann commented 4 years ago

Well spotted!

I couldn't find a mention of gnu++11 in https://github.com/grimbough/Rhdf5lib/search?q=gnu%2B%2B11 .

Another observation is that for current Rtools Rhdf5lib uses for linking gcc: gcc -shared ... (n.b.: no gnu++11!) http://bioconductor.org/checkResults/devel/bioc-LATEST/Rhdf5lib/tokay2-install.html while mzR on current Rtools uses for linking g++: g++ -shared http://bioconductor.org/checkResults/devel/bioc-LATEST/mzR/tokay2-install.html

Maybe @grimbough has a thought here ? Cytolib has a similar issue: https://bioconductor.org/checkResults/3.11/bioc-testing-LATEST/cytolib/tokay2-install.html Also using /mingw32/bin/g++ -shared ...

While other Rhdf5lib reverse dependencies are fine: https://bioconductor.org/checkResults/3.11/bioc-testing-LATEST/HDF5Array/tokay2-install.html https://bioconductor.org/checkResults/3.11/bioc-testing-LATEST/rhdf5/tokay2-install.html Indeed using g++ (like mzR) nut not gnu++11 (unlike mzR) /mingw64/bin/g++ -std=gnu++11 -shared ...

It is unclear to me as well how that -std=gnu++11 comes about, whether it is part of the Linker definition ($LD) or $PKG_LIBS, but I am sure it is part of the solution. Yours, Steffen

grimbough commented 4 years ago

Another observation is that for current Rtools Rhdf5lib uses for linking gcc: gcc -shared ... (n.b.: no gnu++11!) http://bioconductor.org/checkResults/devel/bioc-LATEST/Rhdf5lib/tokay2-install.html while mzR on current Rtools uses for linking g++: g++ -shared http://bioconductor.org/checkResults/devel/bioc-LATEST/mzR/tokay2-install.html

Rhdf5lib doesn't include any C++ code, so I wouldn't expect it to link with g++, and that flag only seems to appear when g++ is being used.

It is unclear to me as well how that -std=gnu++11 comes about, whether it is part of the Linker definition ($LD) or $PKG_LIBS, but I am sure it is part of the solution. Yours, Steffen

I'm not sure where the -std=gnu++11 is coming from either. I'll boot up a Windows machine and see if I can alter some settings and find out.

One thing I note is that both mzR and cytolib use the boost libraries, could it be possible that this is somehow suppressing the -std=gnu++11 flag? I've no idea, just thinking about differences between those two packages and rhdf5.

grimbough commented 4 years ago

I've read the Using C++11 section of Writing R Extensions a few times now, and I'm not really any clearer as to what's going on, but it feel like the answer this should be found there.

Both mzR and cytolib have SystemRequirements: C++11 in their DESCRIPTION files. If I add this to rhdf5 it removes the -std=gnu++11 during linking:

/mingw64/bin/g++ -shared -s -static-libgcc -o rhdf5.dll tmp.def H5.o H5A.o H5D.o H5E.o H5F.o H5G.o H5I.o H5L.o H5O.o H5P.o H5S.o H5T.o H5Z.o H5constants.o HandleList.o HandleListWrap.o h5dump.o h5ls.o h5testLock.o h5writeDataFrame.o printdatatype.o utils.o wrap.o -LC:/Users/MIKESM~1/DOCUME~1/R/WIN-LI~1/testing/Rhdf5lib/lib/x64 -lhdf5_cpp -lhdf5 -lszip -lz -lpsapi -LC:/PROGRA~1/R/R-TEST~1/bin/x64 -lR

However the compilation still seems to finish without issue.

If I remove the C++11 requirement from mzR then it fails in new and exciting ways (presumably because this really is a requirement):

In file included from pwiz/data/msdata/mz5/Configuration_mz5.hpp:27,
                 from pwiz/data/msdata/mz5/Configuration_mz5.cpp:24:
pwiz/data/msdata/mz5/Datastructures_mz5.hpp:30:10: fatal error: H5Cpp.h: No such file or directory
 #include "H5Cpp.h"
          ^~~~~~~~~
compilation terminated.

Edit: The error above was due to a mess up on my part with an old version of mzR, it fails to compile in exactly the same way regardless of the C++11 requirement

hpages commented 4 years ago

Thx @grimbough for looking into this.

What about the extern "C" thing used in rhdf5/src/HandleList.cpp but not in any of the .cpp files in mzR/src/pwiz/data/msdata/mz5/. Could that be related to the problem?

sneumann commented 4 years ago

I wouldn't delve too far into the code of the affected packages without knowing what to look for. Any chance to ask the R and especially Rtools 4.0 people for opinions ? Our main question is, what is the Make rule selecting the linker and linker flags. I tried in the past to get decent debugging on this make-jungle in R, and never figured that out. Yours, Steffen

grimbough commented 4 years ago

What about the extern "C" thing used in rhdf5/src/HandleList.cpp but not in any of the .cpp files in mzR/src/pwiz/data/msdata/mz5/. Could that be related to the problem?

In all honesty I have no idea why those bits of code are written in C++ or use the style they do. They were written before my time as maintainer and I've touched that code only very briefly.

However, I note that they don't at any point seem to use the HDF5 C++ API - there's no requirement for H5Cpp.h or anything like that. Despite what appears in the Makevars, rhdf5 actually compiles fine if it doesn't link to libhdf5_cpp.dll at all.

I wonder if this is actually a manifestation of the problem I was hunting and didn't find prior to the last devel call. Maybe I need to provide libhdf5_cpp.dll compiled with the new toolchain, but that's not necessary for the plain old libhdf5.dll, so my testing with rhdf5 didn't find it.

hpages commented 4 years ago

interesting... so maybe one way to find out is to add some toy .cpp file to rhdf5 that actually uses the HDF5 C++ API?

grimbough commented 4 years ago

Yep, that breaks it with the same undefined reference to H5::.... style errors. I'll build libhdf5_cpp.dll locally and see if that fixes it. Assuming it does I'll think about whether to handle this myself or try switching to the R-winlib distribution of HDF5.

hpages commented 4 years ago

Sounds good. Thx Mike!

grimbough commented 4 years ago

I've pushed a modified version of Rhdf5lib to BioC devel, which should detect which tool chain has been used to build R, and copy the appropriate versions of the HDF5 libraries. It could probably do with some tidying up, but lets see if it fixed the build issue for mzR. It seems to work locally for me.

sneumann commented 4 years ago

Hi, I can confirm that https://bioconductor.org/checkResults/3.11/bioc-testing-LATEST/mzR/tokay2-install.html build is now green (plus purple timeout) in CHECK, while cytolib still has linking issues. Yours, Steffen

grimbough commented 4 years ago

Great. It looks like the cytolib error is now only for linking to libprotobuf-lite.a, I guess that also needs to be rebuilt with the new toolchain.

It also seems to have introduced a new and unexpected error into rhdf5, but I'll save that for another issue.

hpages commented 4 years ago

Problem solved. Thx Mike!

(FWIW Mike Jiang knows about cytolib failing to install with Rtools40 + R-testing and is working on it.)