uclahs-cds / package-moPepGen

Multi-Omics Peptide Generator
https://uclahs-cds.github.io/package-moPepGen/
GNU General Public License v2.0

`--backsplicing-only` added to only call backsplicing site spanning peptides #859

Closed zhuchcn closed 8 months ago

zhuchcn commented 8 months ago

Description

This PR adds the `--backsplicing-only` flag, which restricts the output to noncanonical circRNA peptides that span a backsplicing site.

Below is an example of the peptide table. Each peptide has at least one pair of adjacent segments in which the first segment has larger gene coordinates than the second, which means the peptide spans the backsplicing site.

#sequence             header                          subsequence       start  end  feature_type  feature_id          ref_start  ref_end  start_offset  end_offset  variant
MKPLVVDISER           CIRC-ENST00000614167.2-E2-E1|1  MKPL              0      4    gene          ENSG00000128408.9   393        405      0             0           CIRC-ENST00000614167.2-E2-E1
MKPLVVDISER           CIRC-ENST00000614167.2-E2-E1|1  VVDISER           4      11   gene          ENSG00000128408.9   0          21       0             0           CIRC-ENST00000614167.2-E2-E1
KPLVVDISER            CIRC-ENST00000614167.2-E2-E1|2  KPL               0      3    gene          ENSG00000128408.9   396        405      0             0           CIRC-ENST00000614167.2-E2-E1
KPLVVDISER            CIRC-ENST00000614167.2-E2-E1|2  VVDISER           3      10   gene          ENSG00000128408.9   0          21       0             0           CIRC-ENST00000614167.2-E2-E1
MKPLVVDISERAGASVPLR   CIRC-ENST00000614167.2-E2-E1|3  MKPL              0      4    gene          ENSG00000128408.9   393        405      0             0           CIRC-ENST00000614167.2-E2-E1
MKPLVVDISERAGASVPLR   CIRC-ENST00000614167.2-E2-E1|3  VVDISERAGASVPLR   4      19   gene          ENSG00000128408.9   0          45       0             0           CIRC-ENST00000614167.2-E2-E1
KPLVVDISERAGASVPLR    CIRC-ENST00000614167.2-E2-E1|4  KPL               0      3    gene          ENSG00000128408.9   396        405      0             0           CIRC-ENST00000614167.2-E2-E1
KPLVVDISERAGASVPLR    CIRC-ENST00000614167.2-E2-E1|4  VVDISERAGASVPLR   3      18   gene          ENSG00000128408.9   0          45       0             0           CIRC-ENST00000614167.2-E2-E1
MKPLVVDISERAGASVPLRR  CIRC-ENST00000614167.2-E2-E1|5  MKPL              0      4    gene          ENSG00000128408.9   393        405      0             0           CIRC-ENST00000614167.2-E2-E1
MKPLVVDISERAGASVPLRR  CIRC-ENST00000614167.2-E2-E1|5  VVDISERAGASVPLRR  4      20   gene          ENSG00000128408.9   0          48       0             0           CIRC-ENST00000614167.2-E2-E1
KPLVVDISERAGASVPLRR   CIRC-ENST00000614167.2-E2-E1|6  KPL               0      3    gene          ENSG00000128408.9   396        405      0             0           CIRC-ENST00000614167.2-E2-E1
KPLVVDISERAGASVPLRR   CIRC-ENST00000614167.2-E2-E1|6  VVDISERAGASVPLRR  3      19   gene          ENSG00000128408.9   0          48       0             0           CIRC-ENST00000614167.2-E2-E1
MPFRRSTSGPSK          CIRC-ENST00000642151.1-E1-E2|1  MPFR              0      4    gene          ENSG00000099949.21  130        142      0             0           CIRC-ENST00000642151.1-E1-E2
MPFRRSTSGPSK          CIRC-ENST00000642151.1-E1-E2|1  RRSTSGPSK         3      12   gene          ENSG00000099949.21  -1         26       1             0           CIRC-ENST00000642151.1-E1-E2
PFRRSTSGPSK           CIRC-ENST00000642151.1-E1-E2|2  PFR               0      3    gene          ENSG00000099949.21  133        142      0             0           CIRC-ENST00000642151.1-E1-E2
PFRRSTSGPSK           CIRC-ENST00000642151.1-E1-E2|2  RRSTSGPSK         2      11   gene          ENSG00000099949.21  -1         26       1             0           CIRC-ENST00000642151.1-E1-E2
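
The property described above can be checked by hand with a short script. This is only a sketch, not moPepGen code: the helper names (`spans_backsplice`, `group_by_header`, `check`) are hypothetical, the rows are a hand-copied subset of the table, and comparing `ref_start` values is an assumed reading of "larger gene coordinates".

```python
from collections import defaultdict

# A few rows copied from the table above: (header, start, ref_start, ref_end).
ROWS = [
    ("CIRC-ENST00000614167.2-E2-E1|1", 0, 393, 405),
    ("CIRC-ENST00000614167.2-E2-E1|1", 4, 0, 21),
    ("CIRC-ENST00000642151.1-E1-E2|1", 0, 130, 142),
    ("CIRC-ENST00000642151.1-E1-E2|1", 3, -1, 26),
]


def spans_backsplice(segments):
    """Return True if some adjacent pair of segments, ordered by position in
    the peptide, steps backwards in gene coordinates (assumed: ref_start)."""
    ordered = sorted(segments, key=lambda seg: seg[0])  # order by peptide start
    return any(
        first[1] > second[1]  # first ref_start larger than second ref_start
        for first, second in zip(ordered, ordered[1:])
    )


def group_by_header(rows):
    """Collect the segments that belong to each peptide header."""
    grouped = defaultdict(list)
    for header, start, ref_start, ref_end in rows:
        grouped[header].append((start, ref_start, ref_end))
    return dict(grouped)


def check(rows):
    """Map each peptide header to whether it spans the backsplicing site."""
    return {
        header: spans_backsplice(segments)
        for header, segments in group_by_header(rows).items()
    }


if __name__ == "__main__":
    for header, ok in check(ROWS).items():
        print(header, ok)
```

A peptide whose segments only increase in gene coordinates, for example `[(0, 0, 21), (7, 21, 45)]`, would not pass this check.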

Closes #858


lydiayliu commented 8 months ago

The peptide table showcases this really nicely! Now we can finally know what we are actually looking at in terms of circRNAs XD