Open iskandr opened 6 years ago
Use tximport to sum transcript-level quantifications into per-gene values.
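To make "adding up" concrete, here is a minimal sketch of the aggregation using awk. The transcript IDs, gene names, and TPM values are all made up, and `sum_by_gene` is a hypothetical helper, not part of tximport or kallisto. tximport does more than this: it also sums estimated counts and reports an abundance-weighted average transcript length per gene.

```shell
#!/bin/bash
# Sum transcript-level TPM into per-gene TPM.
# Input on stdin, tab-separated: transcript_id, gene_name, tpm (illustrative values only).
sum_by_gene() {
    awk -F'\t' '{ total[$2] += $3 }
                END { for (g in total) printf "%s\t%.1f\n", g, total[g] }' | sort
}

# Two transcripts of GENE_A and one of GENE_B.
printf 'tx1\tGENE_A\t10.0\ntx2\tGENE_A\t2.5\ntx3\tGENE_B\t4.0\n' | sum_by_gene
```

This prints one line per gene, with GENE_A at 12.5 and GENE_B at 4.0.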
Use something like this run script to get transcript quantification:
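#!/bin/bash
# Stop at the first failing command and echo each command as it runs.
set -e
set -x
mkdir -p kallisto
KALLISTO_INDEX=/mnt/md0/tim/Dropbox/sinai/big-data/references/Homo_sapiens.GRCh38.cdna.all.fa.gz.index
# Without --single, kallisto treats the FASTQ files as paired-end reads, listed pair by pair.
# Results (abundance.h5, abundance.tsv, run_info.json) go to ./kallisto.
kallisto quant --index="$KALLISTO_INDEX" --threads 4 --output-dir=kallisto vaccine-data/rna-fastq/*.fastq.gz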
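If the FASTQ directory holds more than one sample, a separate run per sample may be more useful than one combined run. The sketch below only prints the commands so they can be inspected first. It assumes paired files named `<sample>_R1.fastq.gz` and `<sample>_R2.fastq.gz`, which is a hypothetical naming convention and not something the script above requires, and `kallisto_commands` is an illustrative helper name.

```shell
#!/bin/bash
# Dry run: print one kallisto command per sample instead of executing it.
# Assumes paired files named <sample>_R1.fastq.gz / <sample>_R2.fastq.gz (hypothetical).
set -e

# Arguments: <index file> <fastq directory> <output root>
kallisto_commands() {
    local index=$1 fastq_dir=$2 out_root=$3
    local r1 r2 sample
    for r1 in "$fastq_dir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue                  # glob matched nothing
        r2="${r1%_R1.fastq.gz}_R2.fastq.gz"       # mate file for this sample
        sample=$(basename "$r1" _R1.fastq.gz)
        echo "kallisto quant --index=$index --threads 4 --output-dir=$out_root/$sample $r1 $r2"
    done
}
```

For example, `kallisto_commands "$KALLISTO_INDEX" vaccine-data/rna-fastq kallisto` would list one command per sample, each writing to its own subdirectory. The printed commands are not quoted, so paths containing spaces would need extra care before running them.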
R code I used to get gene-level quantifications for CIBERSORT:
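# biocLite() has been retired; BiocManager is the current Bioconductor installer.
# rhdf5 is needed by tximport to read kallisto's abundance.h5 files.
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install(c("tximport", "biomaRt", "rhdf5"))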
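mart <- biomaRt::useMart(
  biomart = "ENSEMBL_MART_ENSEMBL", dataset = "hsapiens_gene_ensembl", host = "useast.ensembl.org")
# Transcript-to-gene table from Ensembl.
t2g <- biomaRt::getBM(
  attributes = c("ensembl_transcript_id", "ensembl_gene_id", "external_gene_name"), mart = mart, verbose = TRUE)
# tximport expects two columns: transcript ID first, gene ID second.
mapping <- dplyr::select(t2g, c("ensembl_transcript_id", "external_gene_name"))
# ignoreTxVersion drops the version suffix from kallisto's transcript IDs before matching.
tx <- tximport::tximport("abundance.h5", type = "kallisto", tx2gene = mapping, ignoreTxVersion = TRUE)
# tx is a list (abundance, counts, length, ...); write.csv flattens it into one table.
write.csv(tx, "abundance.gene.common_names.csv")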
# Kallisto index generation
kallisto index -i Homo_sapiens.GRCh38.cdna.all.fa.gz.index Homo_sapiens.GRCh38.cdna.all.fa.gz
CIBERSORT web interface: https://cibersort.stanford.edu/