kidsneuro-lab / RNA_splice_tool

Error in `[.data.table`(all_splicing_events_sample, , ..ctrlscols) : column(s) not found: pct_ #6

Closed ChiaraF32 closed 1 year ago

ChiaraF32 commented 1 year ago

Hi Rhett,

I have run into an error whilst running cortar on some RNAseq from fibroblast samples.

Any chance you could help identify what the problem is?

samplesheet: FIBRO_samples.txt

Run Script:

cortar(
  file = "./FIBRO_samples.txt",
  mode = "default",
  assembly = "hg38",
  annotation = "UCSC",
  paired = TRUE,
  stranded = 2,
  output_dir = "./output",
  genelist = c("ABCD3")
)

Error log:

Error in `[.data.table`(all_splicing_events_sample, , ..ctrlscols) : 
  column(s) not found: pct_
In addition: Warning messages:
1: In .infer_intron_strand(unoriented_intron_motif) :
  For some junctions, the dinucleotides found at the intron boundaries don't
  match any of the natural intron motifs stored in predefined character vector
  'NATURAL_INTRON_MOTIFS'. For these junctions, the intron_motif and
  intron_strand metadata columns were set to NA and *, respectively.
2: In .infer_intron_strand(unoriented_intron_motif) :
  For some junctions, the dinucleotides found at the intron boundaries don't
  match any of the natural intron motifs stored in predefined character vector
  'NATURAL_INTRON_MOTIFS'. For these junctions, the intron_motif and
  intron_strand metadata columns were set to NA and *, respectively.

Full output:

Running cortar 
        file: ./FIBRO_samples.txt
        mode: default
    assembly: hg38
  annotation: UCSC
      paired: TRUE
    stranded: 2
      output: ./output

Selecting genes and transcripts...

Extracting and counting reads...
    D21_0076
[W::hts_idx_load2] The index file is older than the data file: /Users/00104561/Library/CloudStorage/OneDrive-TheUniversityofWesternAustralia/PERKINS/RNAseq/cortar/input/D21-0076.markdup.sorted.bam.bai
    PW_FB_20

Annotating and quantifying events...

Comparing samples...
    D21_0076
Error in `[.data.table`(all_splicing_events_sample, , ..ctrlscols) : 
  column(s) not found: pct_
In addition: Warning messages:
1: In .infer_intron_strand(unoriented_intron_motif) :
  For some junctions, the dinucleotides found at the intron boundaries don't
  match any of the natural intron motifs stored in predefined character vector
  'NATURAL_INTRON_MOTIFS'. For these junctions, the intron_motif and
  intron_strand metadata columns were set to NA and *, respectively.
2: In .infer_intron_strand(unoriented_intron_motif) :
  For some junctions, the dinucleotides found at the intron boundaries don't
  match any of the natural intron motifs stored in predefined character vector
  'NATURAL_INTRON_MOTIFS'. For these junctions, the intron_motif and
  intron_strand metadata columns were set to NA and *, respectively.

rhettmarchant commented 1 year ago

Hi Chiara,

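Yes, I can certainly help out. By default, cortar doesn't compare samples that are listed with the same gene name, so that a sample is never used as a control for a test sample with a defect in the same gene. If every other sample is excluded this way, no controls are left for the comparison, which is consistent with the empty "pct_" column name in your error.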
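
To illustrate how that exclusion could produce the "pct_" error, here is a minimal base R sketch. It is not cortar's actual code: the data frame, its column names (sampleID, genes) and the helper control_columns() are hypothetical, and it assumes both samples were listed with ABCD3 in the samplesheet.

```r
# Hypothetical in-memory samplesheet; the real file layout may differ.
# Assumption: both samples are listed with the same gene.
samples <- data.frame(
  sampleID = c("D21_0076", "PW_FB_20"),
  genes = c("ABCD3", "ABCD3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build the control column names for one test sample,
# skipping every sample that is listed with the same gene as the test sample.
control_columns <- function(samples, test_id) {
  test_gene <- samples$genes[samples$sampleID == test_id]
  ctrls <- samples$sampleID[samples$genes != test_gene]
  paste0("pct_", ctrls)
}

control_columns(samples, "D21_0076")
# [1] "pct_"
```

In base R, paste0() recycles a zero-length argument to "", so an empty control set collapses to the single name "pct_", which matches the column the error reports as missing.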

At the moment, you cannot leave the gene or transcript fields blank for non-test samples, but I will update this and the documentation next week.

For now, putting a different valid gene and transcript next to your control sample, instead of the test gene/transcript, should fix the problem.
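
A quick way to check a samplesheet before running is sketched below in base R. This is only an illustration under stated assumptions: the column names (sampleID, genes) and the helper samples_without_controls() are hypothetical, and "DMD" is just a placeholder for any other valid gene.

```r
# Hypothetical samplesheet after the workaround: the control sample is
# listed with a different gene ("DMD" is only a placeholder).
samples <- data.frame(
  sampleID = c("D21_0076", "PW_FB_20"),
  genes = c("ABCD3", "DMD"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the IDs of samples that have no other sample
# with a different gene, i.e. nothing they could be compared against.
samples_without_controls <- function(samples) {
  n_controls <- vapply(seq_len(nrow(samples)), function(i) {
    sum(samples$genes[-i] != samples$genes[i])
  }, integer(1))
  samples$sampleID[n_controls == 0]
}

samples_without_controls(samples)
# character(0)
```

An empty result means every sample has at least one eligible control. With both rows set to "ABCD3", the same call would return both sample IDs.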

Sorry about that! Let me know how it goes.

Cheers, Rhett

ChiaraF32 commented 1 year ago

Hi Rhett,

Thanks for letting me know. I re-ran it and it worked :)

Cheers, Chiara