InPACT is a computational method that identifies and quantifies intronic polyadenylation (IPA) sites by examining contextual sequence patterns and RNA-seq read alignments. InPACT includes the following parts:
InPACT consists of both Python and Bash scripts. A conda virtual environment can be created using the provided environment.yml file.
Clone the repository:
git clone https://github.com/YY-TMU/InPACT.git
Create the environment:
conda env create -f environment.yml
conda activate InPACT
The installation takes about 5 to 8 minutes. If the installation was successful, the InPACT command is available:
InPACT -h
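Beyond `InPACT -h`, it can be useful to confirm that every command used in this README is reachable from the activated environment. The following is a minimal sketch, assuming that `InPACT_transcript` and `InPACT_quantify` are installed on the PATH alongside `InPACT`; the helper name `missing_commands` is hypothetical and not part of InPACT.

```python
import shutil

# Command names as they appear in this README.
INPACT_COMMANDS = ("InPACT", "InPACT_transcript", "InPACT_quantify")


def missing_commands(commands=INPACT_COMMANDS, which=shutil.which):
    """Return the commands that cannot be found on the current PATH."""
    return [name for name in commands if which(name) is None]


if __name__ == "__main__":
    missing = missing_commands()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All InPACT commands are available.")
```

Run it inside the activated `InPACT` environment; an empty result means all three commands were found.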
Based on the human reference genome (GRCh38), we provide an annotation of potential IPA sites predicted by the sequence module, which can be used directly.
The RefSeq annotation file for GRCh38 can be downloaded from the following link.
A test file can be downloaded from the following link.
The following options are available in this part:
Command
InPACT -i sample.bam -a RefSeq.gtf -s InPACT_polyAsites.hg38.saf -P 5
To assemble novel transcripts, a reference genome in FASTA format and a reference gene annotation in GTF format are required.
Command
InPACT_transcript --predict_terminal predict.result.txt --annotated_gtf RefSeq.gtf --fa_path genome.fa --save_gtf merged.gtf
Salmon is used to index and quantify the transcriptome, and IPA usage is then calculated from the resulting transcript abundances.
Command
InPACT_quantify --transcript_tpm quant.sf --annotation_file merged.gtf --ipa_info predict.result.txt --save_file ipa_usage.txt
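The Salmon step that produces `quant.sf` is not spelled out above. The sketch below shows one plausible way to generate it, and is not necessarily how the authors ran it. It assumes paired-end FASTQ input, assumes `gffread` is used to extract transcript sequences from `merged.gtf`, and uses hypothetical file and directory names (`merged.transcripts.fa`, `salmon_index`, `salmon_out`). The function only builds the command lines so they can be inspected before running.

```python
import shlex


def salmon_commands(merged_gtf, genome_fa, reads_1, reads_2, threads=5,
                    transcripts_fa="merged.transcripts.fa",
                    index_dir="salmon_index", out_dir="salmon_out"):
    """Build (but do not run) the commands for a Salmon quantification.

    All output names are illustrative assumptions, not InPACT defaults.
    """
    return [
        # Extract spliced transcript sequences from the merged annotation.
        ["gffread", "-w", transcripts_fa, "-g", genome_fa, merged_gtf],
        # Build the Salmon index over those transcripts.
        ["salmon", "index", "-t", transcripts_fa, "-i", index_dir,
         "-p", str(threads)],
        # Quantify; "-l A" lets Salmon infer the library type.
        ["salmon", "quant", "-i", index_dir, "-l", "A",
         "-1", reads_1, "-2", reads_2, "-p", str(threads),
         "--validateMappings", "-o", out_dir],
    ]


if __name__ == "__main__":
    # Hypothetical FASTQ names, for illustration only.
    for cmd in salmon_commands("merged.gtf", "genome.fa",
                               "sample_R1.fastq.gz", "sample_R2.fastq.gz"):
        print(shlex.join(cmd))
```

Salmon writes its abundance table to `quant.sf` inside the output directory, so under these assumptions the `--transcript_tpm` argument would point to `salmon_out/quant.sf`.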
InPACT takes about an hour to process the test file using five cores. The final output has the following columns:
Column | Description |
---|---|
Terminal exon | Intronic terminal exons for IPA sites |
IPA type | Type of IPA site (skipped or composite) |
Gene | Gene symbols |
Upstream coordinate | The 3’ end of the predicted terminal exon’s upstream exon |
PolyAsite | IPA sites |
IPA usage | PAU estimate |
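
The output table can be filtered downstream with a few lines of Python. The sketch below assumes the file is tab-delimited with a header row whose names match the column names in the table above; the actual delimiter and header strings may differ, so adjust `usage_column` accordingly. The function name `read_ipa_usage` and the 0.1 threshold are illustrative choices, not part of InPACT.

```python
import csv


def read_ipa_usage(handle, min_usage=0.1, usage_column="IPA usage"):
    """Return rows whose usage value is at least `min_usage`.

    Assumes a tab-delimited table with a header row (an assumption
    about the file layout, not something confirmed by InPACT).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    fieldnames = reader.fieldnames or []
    if usage_column not in fieldnames:
        raise KeyError(f"column {usage_column!r} not found in {fieldnames}")

    kept = []
    for row in reader:
        try:
            usage = float(row[usage_column])
        except (TypeError, ValueError):
            # Skip rows with a missing or non-numeric usage value.
            continue
        if usage >= min_usage:
            kept.append(row)
    return kept


if __name__ == "__main__":
    with open("ipa_usage.txt", newline="") as handle:
        for row in read_ipa_usage(handle, min_usage=0.1):
            print(row["Gene"], row["PolyAsite"], row["IPA usage"])
```

Raising on a missing column, rather than silently returning nothing, makes a header mismatch obvious on the first run.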