torognes / vsearch

Versatile open-source tool for microbiome analysis

Windows binaries: working with compressed files #520

Closed frederic-mahe closed 2 months ago

frederic-mahe commented 1 year ago

@GuilhemSempere reported that the vsearch Windows binaries do not accept compressed files by default and require additional dlls.

Indeed, on a fresh and up-to-date Windows 10 install, trying to read a gzip- or bzip2-compressed fasta file with any vsearch release (v2.3.4 to present) returns an error message:

Fatal error: Files compressed with gzip are not supported

or

Fatal error: Files compressed with bzip2 are not supported

Installing git (x64), searching C:\Program Files\Git for zlib1.dll, and copying the library to C:\Windows\System32 allows vsearch to process gzip'ed files.

However, downloading libbz2.dll (x64 or x86) and putting libbz2.dll either in C:\Windows\System32 or C:\Windows\SysWOW64 does not work. When putting libbz2.dll (x64) in C:\Windows\System32, vsearch does not complain but the output file is empty (silent failure). All other combinations trigger vsearch's bzip2 error message.

Note: renaming libbz2.dll to bz2.dll and running vsearch also triggers an error message:

Fatal error: Files compressed with bzip2 are not supported

The README file should be modified to reflect that name change (libbz2.dll rather than bz2.dll).

torognes commented 1 year ago

See also issue #412.

Please note that information about the supported compressed file types is shown with vsearch --version.
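
To check this from a script, a minimal sketch could filter the relevant lines out of the version banner. It assumes the banner wording matches the output quoted later in this thread ("... support for gzip-compressed files ..."); the function name `compression_support` is hypothetical and not part of vsearch.

```shell
# Hypothetical helper: keep only the banner lines that mention
# compressed-file support. Assumes the "...-compressed files" wording
# seen in the vsearch 2.22.1 output quoted in this thread.
compression_support() {
    grep -E '(gzip|bzip2)-compressed files'
}

# Usage (stderr is merged in, in case the banner is not written to stdout):
#   vsearch --version 2>&1 | compression_support
```

If the function prints nothing, either the banner wording differs from the assumption or the binary reports no compression support at all.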

frederic-mahe commented 1 year ago

For the record, in my last test, after trying to install both libraries on a Windows system, vsearch --version returns the following output:

vsearch v2.22.1_win_x86_64, 15.8GB RAM, 2 cores
https://github.com/torognes/vsearch

Rognes T, Flouri T, Nichols B, Quince C, Mahe F (2016)
VSEARCH: a versatile open source tool for metagenomics
PeerJ 4:e2584 doi: 10.7717/peerj.2584 https://doi.org/10.7717/peerj.2584

Compiled with support for gzip-compressed files, and the library is loaded.
zlib version 1.2.13, compile flags 65
Compiled with support for bzip2-compressed files, and the library is loaded.

Reminder: reading gzip files is ok, but bz2 files fail silently.
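
Because the bz2 failure is silent, a script may not be able to rely on an error message. A minimal sketch, assuming an empty output file is the only symptom (as reported above): check that the output exists and is non-empty after the run. The helper name `require_nonempty` and the file name are hypothetical.

```shell
# Hypothetical guard: fail loudly when an output file is missing or empty.
require_nonempty() {
    if [ ! -s "$1" ]; then
        echo "error: '$1' is missing or empty" >&2
        return 1
    fi
}

# Usage: run vsearch with the usual options, writing to out.fasta
# (placeholder name), then:
#   require_nonempty out.fasta
```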

I'll try to update the documentation. @torognes Maybe we should include known good libraries in our vsearch release for Windows?

astanabe commented 1 year ago

I extracted libbz2-1.dll from mingw-w64-x86_64-bzip2-1.0.8-2-any.pkg.tar.zst, which I downloaded from the following URL: https://packages.msys2.org/package/mingw-w64-x86_64-bzip2?repo=mingw64

After extraction, I renamed libbz2-1.dll to libbz2.dll and moved it to the same folder as vsearch.exe. With that in place, vsearch reads fastq.bz2 files correctly.

torognes commented 6 months ago

The vsearch executable for Windows in the current release (2.26.1) is not compiled with support for bz2-compressed files, only for gz-compressed files. I will include support for that in the next release. I will also include the two required DLLs in the next release.

The two DLLs can be downloaded using the following commands:

wget -O - "https://mirror.msys2.org/mingw/mingw64/mingw-w64-x86_64-zlib-1.3-1-any.pkg.tar.zst" | tar xvf - --zstd -O mingw64/bin/zlib1.dll > ./zlib1.dll

wget -O - "https://mirror.msys2.org/mingw/mingw64/mingw-w64-x86_64-bzip2-1.0.8-3-any.pkg.tar.zst" | tar xvf - --zstd -O mingw64/bin/libbz2-1.dll > ./libbz2.dll
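
If the download fails or the archive member path does not match, the redirection above still creates an empty file. A quick sanity check, sketched under the assumption that a usable DLL is at least a non-empty file beginning with the two-byte "MZ" signature of Windows executables; the function name `looks_like_dll` is hypothetical.

```shell
# Hypothetical check: the file is non-empty and starts with the ASCII
# bytes "MZ". This does not prove the DLL is the right build or version.
looks_like_dll() {
    [ -s "$1" ] && [ "$(head -c 2 "$1")" = "MZ" ]
}

# Usage:
#   looks_like_dll ./zlib1.dll  || echo "zlib1.dll looks wrong" >&2
#   looks_like_dll ./libbz2.dll || echo "libbz2.dll looks wrong" >&2
```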

torognes commented 5 months ago

In the release of vsearch version 2.27.0, the DLLs for the two compression libraries are included in the distribution.