ishare
ishare is a Rust crate designed to facilitate the analysis of rare-variant sharing and identity-by-descent (IBD) sharing. Currently, it is in the pre-alpha stage.
This crate introduces a data structure and its corresponding algorithm for the tabular encoding of rare variant genotype data. The tabular encoding leverages the sparsity of minor alleles at sites with low minor allele frequencies, effectively transforming a large site-oriented genotype matrix into a more compact, manipulable, and accessible data structure. As a result, this design enhances disk IO speeds, facilitates in-memory access to rare variants, and permits large-scale rare variant sharing analysis.
Command Line Tool: gtencode
Python Package (work in progress): isharepy
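To make the idea of tabular encoding concrete, the following is a minimal sketch, not the crate's actual data structure. It assumes a dense site-by-haplotype matrix in which `1` marks the minor allele, and it stores one row per minor-allele carrier at sites whose minor allele frequency does not exceed a threshold. The names `MinorAlleleRecord`, `encode_rare_variants`, and `carriers` are hypothetical and chosen for illustration only.

```rust
/// One row of the sparse table: a minor allele carried by one haplotype
/// at one site. (Hypothetical layout, for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MinorAlleleRecord {
    pub site: u32,      // index of the variant site
    pub haplotype: u32, // index of the haplotype carrying the minor allele
}

/// Convert a dense site-by-haplotype matrix into sparse rows.
/// Assumption: each inner Vec is one site, and 1 marks the minor allele.
/// Only sites with 0 < MAF <= max_maf are kept.
pub fn encode_rare_variants(dense: &[Vec<u8>], max_maf: f64) -> Vec<MinorAlleleRecord> {
    let mut records = Vec::new();
    for (site, row) in dense.iter().enumerate() {
        if row.is_empty() {
            continue;
        }
        let minor_count = row.iter().filter(|&&a| a == 1).count();
        let maf = minor_count as f64 / row.len() as f64;
        if minor_count == 0 || maf > max_maf {
            continue; // monomorphic or too common: not stored
        }
        for (hap, &allele) in row.iter().enumerate() {
            if allele == 1 {
                records.push(MinorAlleleRecord {
                    site: site as u32,
                    haplotype: hap as u32,
                });
            }
        }
    }
    records // sorted by site because sites are visited in order
}

/// Look up all haplotypes that carry the minor allele at `site`.
/// Relies on `records` being sorted by site.
pub fn carriers(records: &[MinorAlleleRecord], site: u32) -> Vec<u32> {
    let start = records.partition_point(|r| r.site < site);
    records[start..]
        .iter()
        .take_while(|r| r.site == site)
        .map(|r| r.haplotype)
        .collect()
}
```

Because only carriers are stored, the table grows with the number of minor alleles rather than with the number of sites times the number of haplotypes, which is where the compactness described above comes from.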
Included in this crate are the data structure and algorithms associated with IBD sharing analysis. It serves as a reimplementation of functionalities present in the Python package ibdutils and the C++ library ibdtools. Key updates from the previous implementations include:
Use of genome-wide base-pair coordinates for IBD segment start and end fields, as opposed to chromosomal positions. This facilitates easier IBD segment manipulation across chromosomes.
The itervaltree crate has been integrated to improve various IBD processing and analysis functionalities, enhancing code maintainability and readability.
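The genome-wide coordinate idea from the list above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the crate's API: chromosomes are identified by a 0-based index, positions are 0-based, and the type `GenomeCoords` and its methods are hypothetical names.

```rust
/// Maps (chromosome index, position) pairs to a single genome-wide
/// coordinate by adding the cumulative length of preceding chromosomes.
/// (Hypothetical helper; assumes at least one chromosome.)
pub struct GenomeCoords {
    offsets: Vec<u64>, // offsets[i] = total length of chromosomes before i
    total: u64,        // total genome length in base pairs
}

impl GenomeCoords {
    pub fn new(chrom_lengths: &[u64]) -> Self {
        let mut offsets = Vec::with_capacity(chrom_lengths.len());
        let mut total = 0u64;
        for &len in chrom_lengths {
            offsets.push(total);
            total += len;
        }
        Self { offsets, total }
    }

    /// Total genome length covered by the map.
    pub fn total_len(&self) -> u64 {
        self.total
    }

    /// Chromosomal position -> genome-wide position.
    pub fn to_genome_wide(&self, chrom: usize, pos: u64) -> u64 {
        self.offsets[chrom] + pos
    }

    /// Genome-wide position -> (chromosome index, chromosomal position).
    pub fn to_chrom_pos(&self, gw: u64) -> (usize, u64) {
        // Index of the last chromosome whose offset is <= gw.
        let idx = self.offsets.partition_point(|&off| off <= gw) - 1;
        (idx, gw - self.offsets[idx])
    }
}
```

With a single coordinate axis, segments from different chromosomes can be sorted, compared, and stored with one pair of start and end fields, and converted back to chromosomal positions only when needed for output.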
Command Line Tool: ibdutils
Python Package (work in progress): isharepy, for IBD-related classes.
We've introduced a command line tool that integrates local ancestry information from rfmix v2 with IBD segments to derive ancestry-specific IBD. Refer to the binary asibd for this feature.
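As a rough illustration of what combining local ancestry with IBD segments can involve, the sketch below splits one IBD segment into pieces labeled by the ancestry tract each piece overlaps. It is not the algorithm used by asibd and does not model the rfmix output format. The `Tract` type, the `split_by_ancestry` function, and the use of half-open intervals with integer ancestry labels are assumptions made for this example.

```rust
/// A half-open interval [start, end) with an ancestry label.
/// (Hypothetical type; ancestry is an arbitrary integer code.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Tract {
    pub start: u64,
    pub end: u64,
    pub ancestry: u8,
}

/// Split the IBD segment [ibd_start, ibd_end) into pieces, one per
/// overlapping local-ancestry tract, each carrying that tract's label.
/// Assumes `tracts` are non-overlapping and sorted by start.
pub fn split_by_ancestry(ibd_start: u64, ibd_end: u64, tracts: &[Tract]) -> Vec<Tract> {
    tracts
        .iter()
        .filter_map(|t| {
            // Intersection of the IBD segment with this tract.
            let start = ibd_start.max(t.start);
            let end = ibd_end.min(t.end);
            if start < end {
                Some(Tract { start, end, ancestry: t.ancestry })
            } else {
                None // no overlap with this tract
            }
        })
        .collect()
}
```

For example, an IBD segment spanning 50 to 300 over tracts with labels 0, 1, 0 on 0 to 100, 100 to 250, and 250 to 400 is split into three labeled pieces. A linear scan is used here for brevity; an interval tree would be the natural choice for many tracts.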
Ensure the requirements of the rust-htslib dependency are met.
Ensure cargo and the necessary toolchain are installed to compile this crate and its associated binary executables.
cargo build --release --bin gtencode
cargo build --release --bin ibdutils
cargo build --release --bin asibd
Post-compilation, binaries are located in the target/release/ directory.
Ensure cargo, the necessary toolchain, a C compiler, and Python 3 are installed.
Use pip to install maturin, numpy and pyarrow:
pip install maturin numpy pyarrow
Build and install isharepy:
cd ishare_py
maturin develop --release
cd ..
Instructions for each command line tool are as follows:
toolname --help: Lists available subcommands. E.g., gtencode --help.
toolname subcommand --help: Displays usage for a specific subcommand. E.g., gtencode encode --help.
Running tests requires cloning test data from https://github.com/bguo068/testdata, which has been added as a submodule of ishare and thus can be downloaded by running:
git submodule update --init --recursive