Closed jiaan-yu closed 2 months ago
I also had to recompile abpoa to get it running on my institution's compute cluster. It might be nice if the default CPU flags were compatible with a wider range of CPUs (or even if the README just mentioned this pitfall and listed which CPU features are enabled in the default binary builds). The bioconda/pypi binaries should probably not be built with `-march=native`, because then the binaries depend entirely on the CPUs that happen to be used by the GitHub Actions runners doing the build. Explicitly specifying a reasonable list of common cpu features in the bioconda/pypi build scripts should address this.
I see the issues here.
@shenker
> Explicitly specifying a reasonable list of common cpu features in the bioconda/pypi build scripts

Do you have any specific suggestions? I am not an expert in this. If you can provide some examples, I can try to make the change in the scripts.
I found some relevant information about the use of `-march=native`, if you would find it helpful.
https://stackoverflow.com/questions/52653025/why-is-march-native-used-so-rarely
https://lemire.me/blog/2018/07/25/it-is-more-complicated-than-i-thought-mtune-march-in-gcc/
You could use `-march=sandybridge`. It is a reasonably old architecture. In PGGB this has worked well for us so far: https://github.com/pangenome/pggb/blob/81efdfe0c2accf02f899947d179fef41ce20f8b9/Dockerfile#L74.
There are 2 other ways, leading to much better abPOA performance:
`-msse4.1`, `-mavx`, ...). And `-march=generic`. Then you can combine the binaries using a bash script that detects the host architecture on the fly.

If I recall correctly, conda packages are typically built with `-march=haswell`, the generation after sandybridge. This gives you access to SSE4.2 and AVX2, but not AVX512. Haswell is more than 10 years old at this point, so if you don't have SSE4.2 by now you should probably get a newer computer!
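The multi-binary idea mentioned above (several builds with different instruction-set flags, plus a wrapper that picks one on the host) could look roughly like the sketch below. It is written in Python rather than bash so the selection logic is easy to test. The binary names (`abpoa-avx2`, `abpoa-generic`, ...) and the flag sets in `VARIANTS` are hypothetical and are not files that abPOA ships. It assumes an x86 Linux host where `/proc/cpuinfo` exposes a `flags` line. Note also that GCC on x86 accepts `generic` only for `-mtune`, so a baseline build would likely use something like `-march=x86-64` instead.

```python
import platform
from pathlib import Path

# Hypothetical variant table, most capable first. The binary names and the
# required-flag sets are illustrative assumptions, not files shipped by abPOA.
VARIANTS = [
    ("abpoa-avx2", {"avx2", "sse4_1"}),
    ("abpoa-avx", {"avx", "sse4_1"}),
    ("abpoa-sse4.1", {"sse4_1"}),
    ("abpoa-generic", set()),  # baseline build, no extra requirements
]


def parse_cpu_flags(cpuinfo_text):
    """Return the flag set from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def pick_variant(flags, variants=VARIANTS):
    """Return the first variant whose required flags are all present."""
    for name, required in variants:
        if required <= flags:
            return name
    raise RuntimeError("no compatible build found")


def host_flags(path="/proc/cpuinfo"):
    """Read CPU flags on Linux; fall back to an empty set elsewhere."""
    p = Path(path)
    if platform.system() == "Linux" and p.exists():
        return parse_cpu_flags(p.read_text())
    return set()


if __name__ == "__main__":
    # Print the name of the build this host should run.
    print(pick_variant(host_flags()))
```

A wrapper script would then exec the selected binary with the original arguments. Because the table is ordered from most to least demanding, a host without any of the listed features falls through to the baseline build instead of crashing with an illegal-instruction error.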
Hi Yan,
We are trying to run abPOA (with the pre-built binary) in the cloud, but we found that the binary doesn't work across some versions of Intel CPUs, or between AMD and Intel CPUs. Is there a way to bypass this issue? We are considering compiling the binary every single time before we launch jobs in the cloud, but that's not very elegant :(
Cheers, Jiaan