charles-plessy / CAGEr

Mirror of Bioconductor's CAGEr package repository
https://bioconductor.org/packages/CAGEr

Give different names to plus and minus strand when exporting to BedGraph #56

Open Hami7407 opened 2 years ago

Hami7407 commented 2 years ago

Hello, I am trying to export to a BedGraph file.

However, I only get a tag cluster file that is not divided into the two strands.

This is what I did:

> trk <- exportToTrack(CTSSnormalizedTpmGR(humanS_cluster, "Adult_brain_1S"))
> humanS_cluster |> CTSSnormalizedTpmGR("all") |> exportToTrack(humanS_cluster, oneTrack = FALSE)
GRangesList object of length 2:
[[1]]
UCSC track 'Adult_brain_1S (TC)'
UCSCData object with 1310873 ranges and 5 metadata columns:
            seqnames    ranges strand |    genes annotation filteredCTSSidx     score     itemRgb
               <Rle> <IRanges>  <Rle> |    <Rle>      <Rle>           <Rle> <numeric> <character>
        [1]     chr1    564451      + | MTND1P23   promoter            TRUE   3.05356       black
        [2]     chr1    564455      + | MTND1P23   promoter            TRUE   1.56056       black
        [3]     chr1    564456      + | MTND1P23   promoter            TRUE   3.05356       black
        [4]     chr1    564463      + | MTND1P23   promoter            TRUE   5.97491       black
        [5]     chr1    564560      + | MTND1P23   promoter            TRUE   1.56056       black
        ...      ...       ...    ... .      ...        ...             ...       ...         ...
  [1310869]     chr1 249211204      - |             unknown            TRUE   1.56056       black
  [1310870]     chr1 249212177      - |             unknown            TRUE   1.56056       black
  [1310871]     chr1 249220686      - |             unknown            TRUE   1.56056       black
  [1310872]     chr1 249221821      - |             unknown            TRUE   1.56056       black
  [1310873]     chr1 249239784      - |             unknown            TRUE   1.56056       black
  -------
  seqinfo: 298 sequences (2 circular) from hg19 genome
[[2]]
UCSC track 'Adult_brain_1S (TC)'
UCSCData object with 502786 ranges and 5 metadata columns:
           seqnames    ranges strand |    genes annotation filteredCTSSidx     score     itemRgb
              <Rle> <IRanges>  <Rle> |    <Rle>      <Rle>           <Rle> <numeric> <character>
       [1]     chr1     82726      + |             unknown            TRUE   3.68027       black
       [2]     chr1    535277      + |             unknown            TRUE   3.68027       black
       [3]     chr1    540765      + |             unknown            TRUE   3.68027       black
       [4]     chr1    564575      + | MTND1P23   promoter            TRUE   3.68027       black
       [5]     chr1    564587      + | MTND1P23   promoter            TRUE  11.45750       black
       ...      ...       ...    ... .      ...        ...             ...       ...         ...
  [502782]     chr1 249200594      - |             unknown            TRUE   3.68027       black
  [502783]     chr1 249200611      - |             unknown            TRUE   3.68027       black
  [502784]     chr1 249200695      - |             unknown            TRUE   3.68027       black
  [502785]     chr1 249200790      - |             unknown            TRUE   3.68027       black
  [502786]     chr1 249201330      - |             unknown            TRUE   3.68027       black
  -------
  seqinfo: 298 sequences (2 circular) from hg19 genome

> trk <- split(trk, strand(trk), drop = TRUE)
> rtracklayer::export.bedGraph(trk, "Adult_brain_1S")
BiocFileList of length 2

The same thing happens when I use rtracklayer::export.bedGraph(trk, "Adult_brain_1S.bedGraph").

How can I get a bedGraph file with the two strands separated?

Thank you

charles-plessy commented 2 years ago

Dear Hami,

In my hands, with the example data from the vignette, the command works properly and produces one file containing two tracks. However, the two tracks have the same name; is that the cause of your problem?

Have a nice day,

-- Charles

Hami7407 commented 2 years ago


Hmm... I can only download a CAGE tag cluster file that is not divided into two strands, even though I export the data after splitting it in two. When I open the file, it contains only positive numbers.

I thought I didn't have to specify which strand I want to download if I use the export.bedGraph command. Could you help me with this? It all worked with the previous version of CAGEr.

Thank you,

Best

charles-plessy commented 2 years ago

Can you try something like:

trkBG <- split(trk, strand(trk), drop = TRUE)
trkBG[['+']]@trackLine@name <- paste0(trkBG[['+']]@trackLine@name, " +")
trkBG[['-']]@trackLine@name <- paste0(trkBG[['-']]@trackLine@name, " -")
trkBG[['+']]@trackLine@description <- paste0(trkBG[['+']]@trackLine@description, ", + strand")
trkBG[['-']]@trackLine@description <- paste0(trkBG[['-']]@trackLine@description, ", - strand")
rtracklayer::export.bedGraph(trkBG, "myBedGraphTrack.bedGraph")

If it works I will correct the documentation and try to add a function for such strand splitting.
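
A strand-splitting function of the kind mentioned above could look roughly like the sketch below. It only wraps the commands from this comment; the name `splitTrackByStrand` is hypothetical and not part of CAGEr or rtracklayer, and it assumes `trk` is a `UCSCData` object such as the one returned by `exportToTrack()`.

```r
library(GenomicRanges)  # strand() and split() for GRanges-like objects
library(rtracklayer)    # UCSCData class and export.bedGraph()

# Hypothetical helper (not part of CAGEr): split a UCSCData track by strand
# and append the strand to each resulting track's name and description,
# so that the tracks can be told apart in a genome browser.
splitTrackByStrand <- function(trk) {
  trkList <- split(trk, strand(trk), drop = TRUE)
  for (s in names(trkList)) {
    x <- trkList[[s]]
    x@trackLine@name        <- paste0(x@trackLine@name, " ", s)
    x@trackLine@description <- paste0(x@trackLine@description, ", ", s, " strand")
    trkList[[s]] <- x
  }
  trkList
}
```

Usage would then be along the lines of `rtracklayer::export.bedGraph(splitTrackByStrand(trk), "myBedGraphTrack.bedGraph")`.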

Hami7407 commented 2 years ago

Thank you for the suggestion! However, it only gives me the + strand... somehow I can't download the minus strand.

Sorry for the issues!

charles-plessy commented 2 years ago

Can you double-check that the minus-strand information is present in the data you send to the UCSC browser? On my computer, with CAGEr's example data I have:

$ grep track myBedGraphTrack.bedGraph 
track name="Zf.30p.dome (TC)+" description="Zf.30p.dome (CAGE Tag Clusters (TC))" visibility=full type=bedGraph
track name="Zf.30p.dome (TC)-" description="Zf.30p.dome (CAGE Tag Clusters (TC))" visibility=full type=bedGraph
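
The same check can be done from within R. The following is a minimal sketch using base R only; the function name `summariseTracks` and the demo file contents are made up for illustration. It lists each `track` header line together with the number of data lines that follow it, which shows whether a minus-strand track is present and non-empty.

```r
# Summarise the tracks found in a bedGraph file: one row per "track" header
# line, with the number of data lines that follow it.
summariseTracks <- function(path) {
  lines   <- readLines(path)
  isTrack <- grepl("^track", lines)
  trackId <- cumsum(isTrack)  # index of the track each line belongs to
  # Data lines: not a header, not a comment, not empty, and after a header.
  isData  <- !isTrack & trackId > 0 & nzchar(lines) & !grepl("^#", lines)
  data.frame(
    track    = lines[isTrack],
    n_ranges = tabulate(trackId[isData], nbins = sum(isTrack)),
    stringsAsFactors = FALSE
  )
}

# Demo on a tiny, made-up file with two tracks.
demo <- tempfile(fileext = ".bedGraph")
writeLines(c(
  'track name="demo +" type=bedGraph',
  "chr1\t10\t11\t1.5",
  "chr1\t20\t21\t3.0",
  'track name="demo -" type=bedGraph',
  "chr1\t30\t31\t2.0"
), demo)
print(summariseTracks(demo))
```

On a real export, `summariseTracks("myBedGraphTrack.bedGraph")` should return two rows if both strands were written.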

Also, if I remember correctly, the previous version of CAGEr exported the plus and minus strands to separate files. You can still do that with something like:

rtracklayer::export.bedGraph(trkBG[['+']], "myBedGraphTrackPlus.bedGraph")
rtracklayer::export.bedGraph(trkBG[['-']], "myBedGraphTrackMinus.bedGraph")
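
If there are many samples, the two calls above can be generalised with a small loop. This is only a sketch: `strandFileName` and `exportStrandsSeparately` are hypothetical names, and `trkList` is assumed to be a strand-split list such as `trkBG`, with elements named `+` and `-`.

```r
# Hypothetical helper: build a per-strand file name,
# e.g. prefix "myBedGraphTrack" and strand "+" give "myBedGraphTrackPlus.bedGraph".
strandFileName <- function(prefix, strand) {
  suffix <- c("+" = "Plus", "-" = "Minus", "*" = "Unstranded")[[strand]]
  paste0(prefix, suffix, ".bedGraph")
}

# Write every element of a strand-split track list to its own file.
# Requires the rtracklayer package to be installed.
exportStrandsSeparately <- function(trkList, prefix) {
  files <- vapply(names(trkList),
                  function(s) strandFileName(prefix, s),
                  character(1))
  for (s in names(trkList)) {
    rtracklayer::export.bedGraph(trkList[[s]], files[[s]])
  }
  invisible(files)  # named vector of the file paths that were written
}
```

For example, `exportStrandsSeparately(trkBG, "myBedGraphTrack")` would write the same two files as the commands above.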

Hami7407 commented 2 years ago

trkBG <- split(trk, strand(trk), drop = TRUE)
trkBG[['+']]@trackLine@name <- paste0(trkBG[['+']]@trackLine@name, " +")
trkBG[['-']]@trackLine@name <- paste0(trkBG[['-']]@trackLine@name, " -")
trkBG[['+']]@trackLine@description <- paste0(trkBG[['+']]@trackLine@description, ", + strand")
trkBG[['-']]@trackLine@description <- paste0(trkBG[['-']]@trackLine@description, ", - strand")
rtracklayer::export.bedGraph(trkBG, "myBedGraphTrack.bedGraph")

Sorry, I checked again with the UCSC browser and the minus strand is there! So the last code you gave me works!

Thank you