ucscGenomeBrowser / kent

UCSC Genome Browser source tree. Stable branch: "beta".
http://genome.ucsc.edu/

Provide Linux aarch64 binary for genePredToGtf #75

Closed martin-g closed 1 year ago

martin-g commented 1 year ago

Describe the bug

Bioconductor metaseqR2 package needs a Linux aarch64 binary of genePredToGtf

To Reproduce

Steps to reproduce the problem:

  1. git clone https://github.com/pmoulos/metaseqR2.git
  2. R CMD build ./metaseqR2

If the above steps are executed on Linux aarch64 the build will fail with the following error:

...
sh: 1: /tmp/Rtmp5jSbBD/test_custom/genePredToGtf: Exec format error
Quitting from lines 190-248 (metaseqr2-annotation.Rmd) 
Error: processing vignette 'metaseqr2-annotation.Rmd' failed with diagnostics:
cannot open the connection
--- failed re-building ‘metaseqr2-annotation.Rmd’

This is because metaseqR2 downloads the x86_64 binary from http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/genePredToGtf
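The mismatch can be confirmed before running a downloaded binary. The sketch below reads the machine field from the ELF header and compares it with `uname -m`. The helper names `elf_machine` and `runs_natively` are made up for this illustration, only the two architectures discussed in this issue are mapped, and a little-endian ELF file is assumed.

```shell
# Print the CPU architecture an ELF binary was built for.
# Hypothetical helper; assumes a little-endian ELF file.
elf_machine() {
    # The first four bytes of every ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    [ "$magic" = "7f454c46" ] || { echo "not-elf"; return 1; }
    # e_machine is a 2-byte field at offset 18 of the ELF header.
    set -- $(od -An -tu1 -j18 -N2 "$1")
    [ $# -eq 2 ] || { echo "unknown"; return 1; }
    case $(( $1 + 256 * $2 )) in
        62)  echo "x86_64" ;;   # EM_X86_64
        183) echo "aarch64" ;;  # EM_AARCH64
        *)   echo "unknown" ;;
    esac
}

# Succeed only if the binary matches the architecture of this machine.
runs_natively() {
    [ "$(elf_machine "$1")" = "$(uname -m)" ]
}

# Example:
#   runs_natively ./genePredToGtf || echo "binary was built for $(elf_machine ./genePredToGtf)"
```

On an aarch64 host, the x86_64 download would fail this check, which is the condition behind the "Exec format error" above.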

Expected behavior

Provide http://hgdownload.soe.ucsc.edu/admin/exe/linux.aarch64 so that the users can download an aarch64 flavor of the tool.

I contacted the mailing list a week ago, but so far there has been no reply.

Thank you!

genome-www commented 1 year ago

Thanks Martin, we did see your email but haven't gotten back to you yet, as we're discussing how best to do this. This is an entirely new request for us. I have never heard of anyone who is running linux on aarch64, are you doing that yourself? I guess that means that you're running it on an M1/M2 Apple laptop, natively? Is this something that more people are doing?

We are not running linux on our macbooks but OSX. I imagine we could cross-compile on linux for aarch64 using GCC, but I really wonder if there is any research institute running servers with linux on the M1s, so whether this is worth the time.

There is a conda package for this particular tool, couldn't the conda package be used on your special Apple/Linux combination to build the tool? https://anaconda.org/bioconda/ucsc-genepredtogtf

martin-g commented 1 year ago

Hello! Thank you for the quick reply!

I am talking about Linux on ARM64, not specifically Apple's hardware. Indeed, Apple made ARM64/aarch64 better known to users through its M1/M2 laptops and desktops (Mac Studio)! But there are many other ARM64 hardware providers!

I work for one of the cloud providers and I can say that the demand for Linux ARM64 deployments has increased steadily over the last few years! All of the major cloud providers have started offering Linux ARM64 compute instances in addition to the "standard" Linux x86_64 ([1], [2], [3], [4], [5], [6]). Only IBM offers its own s390x [7] in addition to x86_64.

  1. https://aws.amazon.com/ec2/instance-types/ (A1)
  2. https://cloud.google.com/compute/docs/instances/arm-on-compute (Tau T2A)
  3. https://azure.microsoft.com/en-us/blog/azure-virtual-machines-with-ampere-altra-arm-based-processors-generally-available/
  4. https://www.oracle.com/cloud/compute/arm/
  5. https://www.alibabacloud.com/product/ecs?spm=a3c0i.7938564.8215766810.5.20ff441eQxRIsn
  6. https://support.huaweicloud.com/intl/en-us/productdesc-ecs/en-us_topic_0035470096.html (Kunpeng)
  7. https://cloud.ibm.com/vpc-ext/provision/vs

Unfortunately, Bioconda does not support linux-aarch64 either. Some other members of the community and I have started working on it: https://github.com/bioconda/bioconda-utils/issues/706. But it will take time!

About the build: if you don't have access to ARM64 hardware (with Linux), you can do it either via cross-compilation or via emulation. E.g. in this article (https://martin-grigorov.medium.com/building-linux-packages-for-different-cpu-architectures-with-docker-and-qemu-d29e4ebc9fa5) I explain how to do it with QEMU. Or you could use a free VM from a cloud provider, e.g. Oracle Cloud (https://martin-grigorov.medium.com/github-actions-arm64-runner-on-oracle-cloud-a77cdf7a325a). Or you could use a CI (Continuous Integration) service to build it as part of the release process, e.g. here (https://github.com/ninja-build/ninja/blob/master/.github/workflows/linux.yml#L151-L209) is how we did it for the Ninja build tool. Since GitHub Actions does not provide Linux ARM64 cloud-hosted builders, it uses QEMU via the uraimo/run-on-arch-action action. But other CI providers offer Linux ARM64 natively, e.g. CircleCI and CirrusCI.

Please let me know if you are interested in any of the above options and I will try to help as much as I can!

genome-www commented 1 year ago

Thanks! OK.

Are you sure that one needs Qemu? I thought the definition of cross-compiling was that you don't need the CPU architecture. Here is a tutorial that just runs gcc on Intel to build ARM binaries, unless I misunderstood something: https://jensd.be/800/linux/cross-compiling-for-arm-with-ubuntu-16-04-lts

martin-g commented 1 year ago

I said "... either via cross compilation or via emulation" :-) Following the article should be fine! You just need to use gcc-aarch64-linux-gnu instead of gcc-arm-linux-gnueabi. The same goes for binutils: use binutils-aarch64-linux-gnu.
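As a rough sketch of what that substitution means in practice: Debian/Ubuntu cross toolchains install their tools under a target-specific prefix, so the package gcc-aarch64-linux-gnu provides `aarch64-linux-gnu-gcc`. The helpers `cross_prefix` and `cross_build` below are hypothetical names for this illustration, and the example only covers a single C file with no external library dependencies.

```shell
# Map a target CPU name to the GNU cross-toolchain prefix.
# Hypothetical helper; an unrecognised target falls back to the native compiler.
cross_prefix() {
    case "$1" in
        aarch64) echo "aarch64-linux-gnu-" ;;   # from gcc-aarch64-linux-gnu
        arm)     echo "arm-linux-gnueabi-" ;;   # from gcc-arm-linux-gnueabi
        *)       echo "" ;;
    esac
}

# Compile one C source file for the given target.
# Usage: cross_build <target> <source.c> <output>
cross_build() {
    "$(cross_prefix "$1")gcc" -o "$3" "$2"
}

# Example (requires the cross toolchain packages to be installed):
#   sudo apt install gcc-aarch64-linux-gnu binutils-aarch64-linux-gnu
#   cross_build aarch64 hello.c hello-aarch64
```

The resulting binary cannot be run on the x86_64 build host without emulation, which is where QEMU comes in.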

maximilianh commented 1 year ago

Sorry, yes, you did say either cross-compilation or emulation. I misread.

For cross-compilation we need all our libraries built in that other format, too. We would need to build libpng, zlib, hdf5, mysql, freetype and many others, plus their dependencies, ourselves on CentOS. For QEMU, we'd have to set up this entire QEMU system as part of our build. It would probably take an engineer a few days.

We have noted the request, but this is the first time that someone has requested this, and of the few people at other institutes that I asked, no one is using ARM CPUs, except on laptops and on OSX. So I think we'll postpone this for now, see if we get more requests like this, and then revisit the issue.

Currently, our binaries don't run on all linux versions because they reference dynamic libraries that don't exist everywhere. Providing binaries always means not supporting a few platforms.

That being said, I've prepared last year a git repo with just the minimal libraries for our command line tools, and without the dependencies. The aim is to make building the command line tools simpler. Do you think that could help here?

martin-g commented 1 year ago

Thanks for the answer, @maximilianh ! I understand your point of view and I think it is fair!

> That being said, I've prepared last year a git repo with just the minimal libraries for our command line tools, and without the dependencies. The aim is to make building the command line tools simpler. Do you think that could help here?

Could you please share a link to this repo? I will try to build genePredToGtf locally for the needs of Bioconductor's build reports.

maximilianh commented 1 year ago

The repo is here: https://github.com/ucscGenomeBrowser/kent-core

(The code is two years old, but that should not do any harm: the command line tools change very little, and I can update the repo later by merging in the current code.)

My aim is that "make" in there builds the tools into kent-core/bin. This works on my own linux box, but may not work on yours (I may have forgotten a dependency... or how to handle them). If you send me an error message, I'll try to fix it asap. Alternatively, I can also run this on a fresh Ubuntu (?) VM or whatever other linux you suggest using as a good base system...

maximilianh commented 1 year ago

Tried on a fresh Vagrant Ubuntu 20.02 VM. Only had to run this command:

sudo apt install make gcc g++ libpng-dev uuid-dev libmariadbclient-dev

It worked and built the binaries into bin/

Also tested the kent-core "make" on my OSX and it worked there (with all the installed packages that I have there).

Shall I test something else?

martin-g commented 1 year ago

No need to test anything else for now! I am finishing another task and then I will switch to this one. I will report any issues I find! I will close this issue for now! Thank you for your help, @maximilianh !

maximilianh commented 1 year ago

I've updated the kent-core to the current release and added notes to the README.

julien-faye commented 1 year ago

Hi! Apologies for commenting on a closed issue!

I also need a genePredToGtf binary for Linux ARM64, and I was able to build the project locally with this minor workaround:

diff --git a/src/lib/htmshell.c b/src/lib/htmshell.c
index bf39ebf..ebeb9e0 100644
--- a/src/lib/htmshell.c
+++ b/src/lib/htmshell.c
@@ -713,11 +713,13 @@ void htmlVaBadRequestAbort(char *format, va_list args)
 puts("Status: 400\r");
 puts("Content-Type: text/plain; charset=UTF-8\r");
 puts("\r");
+/*
 if (format != NULL && args != NULL)
     {
     vfprintf(stdout, format, args);
     fprintf(stdout, "\n");
     }
+*/
 exit(-1);
 }

because otherwise it fails with:

make[2]: Entering directory '/home/biocbuild/git/kent-core/src/lib'
cc -O -g -std=c99 -Wall -Wformat -Wimplicit -Wreturn-type -Wuninitialized -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_GNU_SOURCE -DMACHTYPE_aarch64   -Wall -Wformat -Wimplicit -Wreturn-type -Wuninitialized -I../inc -I../../inc -I../../../inc -I../../../../inc -I../../../../../inc -I../htslib -I/usr/include/freetype2 -I/usr/include/libpng16 -DUSE_FREETYPE -I/include -I/usr/include/libpng16  -o htmshell.o -c htmshell.c
htmshell.c: In function ‘htmlVaBadRequestAbort’:
htmshell.c:716:28: error: invalid operands to binary != (have ‘va_list’ and ‘void *’)
  716 | if (format != NULL && args != NULL)
      |                            ^~
make[2]: *** [../inc/common.mk:536: htmshell.o] Error 1

The project builds fine on Rocky Linux 9 aarch64! But there is no genePredToGtf in ./bin/. Am I missing a step?

gerardoPerez1 commented 1 year ago

Hello @julien-faye,

By default, our executables are put in ~/bin/$(uname -m)/ not ./bin/. This can be changed by setting the environment variable BINDIR to something else. If you run 'make' in kent/src/hg/genePredToGtf then the output should show where the executable is placed.
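A small sketch of that rule, for locating the built tool afterwards. `kent_bindir` is a hypothetical helper written for this illustration; it only mirrors the default described above (BINDIR if set, otherwise ~/bin/$(uname -m)), and the real makefiles may compute the path differently.

```shell
# Print the directory where the tools are expected to be installed,
# following the default described above. Hypothetical helper.
kent_bindir() {
    echo "${BINDIR:-$HOME/bin/$(uname -m)}"
}

# Example:
#   cd kent/src/hg/genePredToGtf && make
#   ls -l "$(kent_bindir)/genePredToGtf"
#
# To install somewhere else, set BINDIR before running make:
#   BINDIR="$HOME/kent-tools" make
```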

Also, instead of commenting out the whole if ... clause, it should be sufficient to change this:

if (format != NULL && args != NULL)

to this:

if (format != NULL)

julien-faye commented 1 year ago

Thank you for the help, @gerardoPerez1 !

Actually, I was building https://github.com/ucscGenomeBrowser/kent-core, as recommended by @maximilianh ! There is no bin/aarch64/ in it, just a flat bin/, but without genePredToGtf in it.

Let me try to build the kent project!

julien-faye commented 1 year ago

I was able to build it:

/home/julien/bin/aarch64/genePredToGtf: ELF 64-bit LSB executable, ARM aarch64, version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=17659cad2d51cf694bb82248a18b719232a73364, for GNU/Linux 3.7.0, with debug_info, not stripped

Thank you, @gerardoPerez1 !

maximilianh commented 1 year ago

Hi Julien, right, you're using kent-core, sorry, I forgot. You are right, there was a bug in the makefile that prevented it from building genePredToGtf by default, sorry for that! I've now fixed the makefile, and the tool should be built with kent-core as of the next release of the source code.

julien-faye commented 1 year ago

Thanks for the update, @maximilianh !

matthewspeir commented 11 months ago

Hello!

Can you share the specific errors you're seeing for twoBitToFa, faToTwoBit, and twoBitInfo during compilation? And share your gcc version?

You can send your question/response to genome-www@soe.ucsc.edu if you don't want to share the details in this Github issue.

(Seems that there was a question here regarding those tools. Although it has since been deleted, so hopefully they see this.)