Open martin-g opened 6 months ago
mgrigorov in 🌐 euler-arm-22 in hifiasm on aarch64-support [?] via C v10.3.1-gcc
❯ file hifiasm
hifiasm: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=35b1f9d739f865d1953b27a4bcd53a200ac663b3, for GNU/Linux 3.7.0, with debug_info, not stripped
mgrigorov in 🌐 euler-arm-22 in hifiasm on aarch64-support [?] via C v10.3.1-gcc
❯ ./hifiasm
Usage: hifiasm [options] <in_1.fq> <in_2.fq> <...>
Options:
Input/Output:
-o STR prefix of output files [hifiasm.asm]
-t INT number of threads [1]
-h show help information
--version show version number
Overlap/Error correction:
-k INT k-mer length (must be <64) [51]
-w INT minimizer window size [51]
-f INT number of bits for bloom filter; 0 to disable [37]
-D FLOAT drop k-mers occurring >FLOAT*coverage times [5.0]
-N INT consider up to max(-D*coverage,-N) overlaps for each oriented read [100]
-r INT round of correction [3]
-z INT length of adapters that should be removed [0]
--max-kocc INT
employ k-mers occurring <INT times to rescue repetitive overlaps [2000]
--hg-size INT(k, m or g)
estimated haploid genome size used for inferring read coverage [auto]
Assembly:
-a INT round of assembly cleaning [4]
-m INT pop bubbles of <INT in size in contig graphs [10000000]
-p INT pop bubbles of <INT in size in unitig graphs [0]
-n INT remove tip unitigs composed of <=INT reads [3]
-x FLOAT max overlap drop ratio [0.8]
-y FLOAT min overlap drop ratio [0.2]
-i ignore saved read correction and overlaps
-u post-join step for contigs which may improve N50; 0 to disable; 1 to enable
[1] and [1] in default for the UL+HiFi assembly and the HiFi assembly, respectively
--hom-cov INT
homozygous read coverage [auto]
--lowQ INT
output contig regions with >=INT% inconsistency in BED format; 0 to disable [70]
--b-cov INT
break contigs at positions with <INT-fold coverage; work with '--m-rate'; 0 to disable [0]
--h-cov INT
break contigs at positions with >INT-fold coverage; work with '--m-rate'; -1 to disable [-1]
--m-rate FLOAT
break contigs at positions with <=FLOAT*coverage exact overlaps;
only work with '--b-cov' or '--h-cov'[0.75]
--primary output a primary assembly and an alternate assembly
Trio-partition:
-1 FILE hap1/paternal k-mer dump generated by "yak count" []
-2 FILE hap2/maternal k-mer dump generated by "yak count" []
-3 FILE list of hap1/paternal read names []
-4 FILE list of hap2/maternal read names []
-c INT lower bound of the binned k-mer's frequency [2]
-d INT upper bound of the binned k-mer's frequency [5]
--t-occ INT
forcedly remove unitigs with >INT unexpected haplotype-specific reads;
ignore graph topology; [60]
--trio-dual utilize homology information to correct trio phasing errors
Purge-dups:
-l INT purge level. 0: no purging; 1: light; 2/3: aggressive [0 for trio; 3 for unzip]
-s FLOAT similarity threshold for duplicate haplotigs in read-level [0.75 for -l1/-l2, 0.55 for -l3]
-O INT min number of overlapped reads for duplicate haplotigs [1]
--purge-max INT
coverage upper bound of Purge-dups [auto]
--n-hap INT
number of haplotypes [2]
Hi-C-partition:
--h1 FILEs file names of Hi-C R1 [r1_1.fq,r1_2.fq,...]
--h2 FILEs file names of Hi-C R2 [r2_1.fq,r2_2.fq,...]
--seed INT RNG seed [11]
--s-base FLOAT
similarity threshold for homology detection in base-level;
-1 to disable [0.5]; -s for read-level (see <Purge-dups>)
--n-weight INT
rounds of reweighting Hi-C links [3]
--n-perturb INT
rounds of perturbation [10000]
--f-perturb FLOAT
fraction to flip for perturbation [0.1]
--l-msjoin INT
detect misjoined unitigs of >=INT in size; 0 to disable [500000]
Ultra-Long-integration:
--ul FILEs file names of Ultra-Long reads [r1.fq,r2.fq,...]
--ul-rate FLOAT
error rate of Ultra-Long reads [0.2]
--ul-tip INT
remove tip unitigs composed of <=INT reads for the UL assembly [6]
--path-max FLOAT
max path drop ratio [0.6]; higher number may make the assembly cleaner
but may lead to more misassemblies
--path-min FLOAT
min path drop ratio [0.2]; higher number may make the assembly cleaner
but may lead to more misassemblies
--ul-cut INT
filter out <INT UL reads during the UL assembly [0]
Dual-Scaffolding:
--dual-scaf output scaffolding
--scaf-gap INT
max gap size for scaffolding [3000000]
Example: ./hifiasm -o NA12878.asm -t 32 NA12878.fq.gz
See `https://hifiasm.readthedocs.io/en/latest/' or `man ./hifiasm.1' for complete documentation.
I am also working on adding GitHub Actions-based CI for Linux ARM64.
Successful CI run at my fork: https://github.com/martin-g/hifiasm/actions/runs/8859185274
Many thanks! I have successfully compiled it on my M1 MacBook Pro.
Jianshu
I can also confirm that this branch builds on a Linux NVIDIA Grace CPU system using `make ARCH_FLAGS="-mcpu=neoverse-v2"`. Any chance this will be merged in the near future?
Fixes https://github.com/chhylp123/hifiasm/issues/288
Uses https://github.com/DLTcollab/sse2neon to translate SSE intrinsics to their NEON equivalents.
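As a rough illustration of the kind of mapping sse2neon performs, the sketch below writes the same four-lane 32-bit addition once with SSE2 intrinsics and once with NEON intrinsics. The helper name `add4_i32` is hypothetical and is not taken from hifiasm. With sse2neon the SSE2 branch would not need rewriting on AArch64, because the header supplies `_mm_loadu_si128`, `_mm_add_epi32` and `_mm_storeu_si128` on top of NEON. The NEON branch here shows roughly what such a translation amounts to, not how hifiasm or this branch is actually organized.

```c
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* native SSE2 intrinsics on x86-64 */
#elif defined(__ARM_NEON)
#include <arm_neon.h>    /* NEON intrinsics on AArch64 */
#endif

/* Hypothetical helper (not from hifiasm): out[i] = a[i] + b[i], 4 lanes. */
void add4_i32(const int32_t a[4], const int32_t b[4], int32_t out[4])
{
#if defined(__SSE2__)
    /* x86 path: unaligned 128-bit loads, lane-wise add, unaligned store. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
#elif defined(__ARM_NEON)
    /* Arm path: the NEON counterparts of the three SSE2 calls above. */
    int32x4_t va = vld1q_s32(a);
    int32x4_t vb = vld1q_s32(b);
    vst1q_s32(out, vaddq_s32(va, vb));
#else
    /* Portable scalar fallback for targets with neither extension. */
    for (int i = 0; i < 4; ++i)
        out[i] = a[i] + b[i];
#endif
}
```

Calling `add4_i32` with `{1, 2, 3, 4}` and `{10, 20, 30, 40}` gives the same result on every path. That equivalence is what lets a translation header stand in for the SSE headers without changes at the call sites.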