openvax / neoantigen-vaccine-pipeline

Bioinformatics pipeline for selecting patient-specific cancer neoantigen vaccines
Apache License 2.0

Prepare Kallisto quantification for CIBERSORT #105

Open iskandr opened 6 years ago

iskandr commented 6 years ago

CIBERSORT web interface: https://cibersort.stanford.edu/

iskandr commented 6 years ago

Use tximport to sum the transcript-level quantifications into per-gene values.
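
To make the summation concrete, here is a minimal shell sketch of the idea, not a replacement for tximport (which also aggregates counts and effective lengths). It assumes a hypothetical two-column, tab-separated mapping file (transcript ID, gene name, no header) and kallisto's plain-text `abundance.tsv`, whose fifth column is TPM. The function name `sum_tpm_by_gene` is made up for illustration.

```shell
# Sum transcript-level TPM into per-gene totals.
#   $1  hypothetical mapping TSV: transcript_id <TAB> gene_name (no header)
#   $2  kallisto abundance.tsv: target_id, length, eff_length, est_counts, tpm
sum_tpm_by_gene() {
    awk -F'\t' '
        NR == FNR { gene[$1] = $2; next }   # first file: load transcript -> gene map
        FNR == 1  { next }                  # second file: skip the header row
        {
            id = $1
            sub(/\.[0-9]+$/, "", id)        # drop the version suffix (ENST0001.2 -> ENST0001)
            if (id in gene) total[gene[id]] += $5   # unmapped transcripts are ignored
        }
        END { for (g in total) printf "%s\t%.4f\n", g, total[g] }
    ' "$1" "$2" | LC_ALL=C sort
}
```

Usage would look like `sum_tpm_by_gene tx2gene.tsv kallisto/abundance.tsv > gene_tpm.tsv`, where both file names are placeholders.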

julia326 commented 6 years ago

Use something like this run script to get transcript quantification:

#!/bin/bash

set -e
set -x

mkdir -p kallisto

KALLISTO_INDEX=/mnt/md0/tim/Dropbox/sinai/big-data/references/Homo_sapiens.GRCh38.cdna.all.fa.gz.index
# Without --single, kallisto treats the inputs as paired-end, so the glob must expand to mate pairs in order.
kallisto quant --index="$KALLISTO_INDEX" --threads=4 --output-dir=kallisto vaccine-data/rna-fastq/*.fastq.gz

timodonnell commented 6 years ago

R code I used to get gene-level quantifications for CIBERSORT:

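# biocLite() has been retired; BiocManager is the current Bioconductor installer.
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("tximport")
# Map Ensembl transcript IDs to gene symbols via biomaRt.
mart <- biomaRt::useMart(
    biomart = "ENSEMBL_MART_ENSEMBL", dataset = "hsapiens_gene_ensembl", host = "useast.ensembl.org")
t2g <- biomaRt::getBM(
    attributes = c("ensembl_transcript_id", "ensembl_gene_id", "external_gene_name"), mart = mart, verbose = TRUE)
# tximport expects a two-column mapping: transcript ID first, then the gene identifier (here the gene symbol).
mapping <- dplyr::select(t2g, c("ensembl_transcript_id", "external_gene_name"))
# Sum transcript-level estimates per gene; ignoreTxVersion drops the ".N" version suffix from the target IDs.
tx <- tximport::tximport("abundance.h5", type = "kallisto", tx2gene = mapping, ignoreTxVersion = TRUE)
# tx is a list (abundance, counts, length, countsFromAbundance); write.csv coerces it to a data frame.
write.csv(tx, "abundance.gene.common_names.csv")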

timodonnell commented 6 years ago

# Kallisto index generation
kallisto index -i Homo_sapiens.GRCh38.cdna.all.fa.gz.index Homo_sapiens.GRCh38.cdna.all.fa.gz
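
Since the index only needs to be built once, the command above can be wrapped so that a run script reuses an existing index. This is a sketch under stated assumptions: `ensure_kallisto_index` is a hypothetical helper, and the `<fasta>.index` naming simply mirrors the file names used in this thread.

```shell
# Build the kallisto index only when it is not already on disk, then print its path.
# ensure_kallisto_index is a hypothetical helper, not part of kallisto.
ensure_kallisto_index() {
    local fasta="$1"
    local index="${fasta}.index"           # same naming as the command above
    if [ ! -s "$index" ]; then             # missing or empty: (re)build it
        kallisto index -i "$index" "$fasta" >&2
    fi
    printf '%s\n' "$index"
}
```

A run script could then set `KALLISTO_INDEX=$(ensure_kallisto_index Homo_sapiens.GRCh38.cdna.all.fa.gz)` before calling `kallisto quant`.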