ababaian / LIONS

LIONS is a bioinformatic analysis pipeline which brings together a few pieces of software and some home-brewed scripts to annotate a paired-end RNA-seq library and detect TE-initiated transcripts
GNU General Public License v3.0

LIONS Output #7

Closed ababaian closed 5 years ago

ababaian commented 5 years ago

Show LIONS Output Example

Comments


ababaian commented 5 years ago

Response 1

We have included the LIONS output from both the Hodgkin Lymphoma and B-cell control data sets (Supplementary Table 1) and the ENCODE data sets (Supplementary Table 2). The ENCODE data can be cross-referenced with the included UCSC Genome Browser link.

Unfortunately, LIONS is not optimized to perform TE differential expression analysis per se. What can be, and is, quantified is the differential usage of TE-initiation across the sets of biological libraries given as input. LIONS does estimate TE expression, reported as RepeatRPKM and RepeatMax, but these values are used for TE-initiation classification. For differential expression of TEs, we recommend software specifically designed for this purpose, such as TEtranscripts.

This was a small error. What we meant was that “.lion” is a standardized output format, each column of which is defined in the user manual. It contains the data associated with each TE-exon interaction that was classified by LIONS. Other standard file formats don’t necessarily support this kind of data structure, but we have included a conversion script, “lion2bed.sh”, which converts TE-exon interactions into a format that can be easily visualized in a genome browser.

Response 2

An explanation of how the RT-PCR validation targets were selected, and the rationale for this approach, has been included as Supplementary Table 1, and additional data have been added to Supplementary Figure 3. Briefly, the objective of this experiment was to measure the fidelity of the LIONS classification procedure when compared to RT-PCR. Specifically, we were curious how well RNA-seq would compare to the more familiar RT-PCR, since we had access to the same batch of cells from which the RNA-seq was performed.

Supplementary Figure 3 {Updates}: Validation Set of Chimeric Transcripts in Hodgkin Lymphoma Cell Lines

From the output of LIONS, the top hits for Hodgkin Lymphoma recurrent and specific chimeric transcripts (see Supplementary Table 1) were validated by reverse transcriptase PCR (RT-PCR). LIONS candidates screened to be likely true-positive events show a high specificity. The concordance between the LIONS predictions and RT-PCR is shown as true positive (dark green), true negative (green), false positive (fuchsia), and false negative (pink) bars below the gel images.
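
The four concordance categories named in the figure legend can be tallied with a few lines of code. The following is only a minimal sketch, assuming each validation target is reduced to a pair of booleans (LIONS predicted a TE-initiated transcript; RT-PCR detected it). The function names and the input layout are hypothetical and are not part of LIONS.

```python
from collections import Counter


def classify_call(predicted: bool, observed: bool) -> str:
    """Label one target by comparing the prediction with the RT-PCR result."""
    if predicted and observed:
        return "TP"  # predicted and confirmed by RT-PCR
    if predicted and not observed:
        return "FP"  # predicted but not confirmed
    if not predicted and observed:
        return "FN"  # not predicted, yet detected by RT-PCR
    return "TN"      # not predicted and not detected


def concordance(calls):
    """Count TP/TN/FP/FN over an iterable of (predicted, observed) pairs."""
    counts = Counter(classify_call(p, o) for p, o in calls)
    return {label: counts.get(label, 0) for label in ("TP", "TN", "FP", "FN")}


def specificity(counts):
    """TN / (TN + FP); returns None when there are no negative RT-PCR results."""
    negatives = counts["TN"] + counts["FP"]
    return counts["TN"] / negatives if negatives else None
```

Each (cell line, target) lane on the gel would contribute one pair, so the counts map directly onto the colored bars in the figure.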

Supplementary Table 1

To assess the quality of applying LIONS for the identification of biologically pertinent TE-initiated transcripts specific to a biological group, a set of highly cancer-specific and recurrent transcripts was identified, and the existence of these transcripts was tested by reverse transcriptase PCR (RT-PCR).

A) LIONS output of Hodgkin Lymphoma cell line RNA-seq (n = 9) and Primary Mediastinal Large B-cell Lymphoma (n = 3), restricted to transcripts which are recurrent (present in >=2 libraries) and specific (absent from B-cell controls, n = 9).
B) Cell line name, accession and library identifiers.
C) The top 30 recurrent HL TE-initiation-classified transcripts which intersect a known protein-coding gene were manually inspected: 21 were categorized as likely being a true initiation event (green), 4 as unlikely being a true initiation event (red), and 5 as uncertain (yellow).
D) 11 of the events categorized as likely TE-initiation were chosen for RT-PCR validation, as well as a THE1B-initiated transcript of HBE1, an embryonic globin previously reported to be a cancer isoform (https://www.ncbi.nlm.nih.gov/pubmed/16620781).
E) RT-PCR primers designed to specifically identify the TE-initiated isoform of each respective target gene.
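
The "recurrent and specific" criteria in panel A amount to a simple set filter. The sketch below is illustrative only: it assumes the per-library calls have already been collected into a mapping from a transcript identifier to the set of libraries in which it was detected. The function name, the parameter names, and this input layout are hypothetical and do not describe how LIONS implements the step internally.

```python
def recurrent_and_specific(calls, case_libs, control_libs, min_recurrence=2):
    """Select transcripts seen in at least `min_recurrence` case libraries
    and in none of the control libraries.

    calls: dict mapping transcript id -> set of library ids where it was detected
    case_libs / control_libs: iterables of library ids for each group
    """
    case_libs = set(case_libs)
    control_libs = set(control_libs)
    selected = []
    for transcript, libs in calls.items():
        n_case = len(libs & case_libs)        # recurrence within the case group
        n_control = len(libs & control_libs)  # any hit here breaks specificity
        if n_case >= min_recurrence and n_control == 0:
            selected.append(transcript)
    return sorted(selected)
```

With `min_recurrence=2` and the B-cell libraries passed as `control_libs`, this mirrors the thresholds stated in panel A (present in >=2 libraries, absent from controls).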