Closed ChiaraF32 closed 1 year ago
Hi Chiara,
Sorry for not replying to this!
Can you confirm that the assembly, annotation, paired and stranded parameters are all correct for your samples?
I have seen this error before but I don't recall exactly how I fixed it, I'd have to do some digging.
Thanks, Rhett
Hi Rhett,
I used NCBI's GRCh38 FASTA and GTF files as genome references during alignment, available here, not UCSC or 1000 Genomes.
So I assume this is the issue. What would be my options for solving this without re-aligning?
Cheers, Chiara
Hi Chiara,
I can't say for sure that that is the issue here but I suspect it is. I've added a test branch to the repo which includes support for the NCBI_GRCh38 genome.
If you could run the following and let me know how it goes, that would be great.
remove.packages("cortar")
devtools::install_github("kidsneuro-lab/RNA_splice_tool@NCBI_GRCh38")
cortar(
file = "./sample.tsv",
mode = "research",
assembly = "hg38",
annotation = "NCBI",
paired = TRUE,
stranded = 2,
output_dir = "./output",
genelist = c("TRDN")
)
Hope this helps,
Rhett :)
Hi Chiara,
On second thoughts, the fix I have made is unlikely to work.
I'll work on fixing this problem today.
Thanks, Rhett
Hi Chiara,
Sorry for the delay,
Could you try the aforementioned code?
Thanks, Rhett
Hi Rhett,
I ran the code and got the following output and error:
Running cortar
file: ./sample.tsv
mode: research
assembly: hg38
annotation: NCBI
paired: TRUE
stranded: 2
output: ./output
Selecting genes and transcripts...
Extracting and counting reads...
TAM_M6932
[W::hts_idx_load2] The index file is older than the data file: /Users/00104561/Library/CloudStorage/OneDrive-TheUniversityofWesternAustralia/PERKINS/RNAseq/cortar/TAM-M6932.markdup.sorted.bam.bai
Error in .io_bam(.scan_bamfile, file, reverseComplement, yieldSize(file), :
seqlevels(param) not in BAM header:
seqlevels: ‘NC_000006.12’
file: /Users/00104561/Library/CloudStorage/OneDrive-TheUniversityofWesternAustralia/PERKINS/RNAseq/cortar/TAM-M6932.markdup.sorted.bam
index: /Users/00104561/Library/CloudStorage/OneDrive-TheUniversityofWesternAustralia/PERKINS/RNAseq/cortar/TAM-M6932.markdup.sorted.bam
Hi Chiara,
I think I've finally worked out the issue. I was able to recreate your error and have at least a temporary fix for this problem.
cortar
was unable to process when a bamfile had a read that wasn't assigned a strand ( instead of + or -). The fix should let your samples run as normal and flag any -strand reads in the final report output.
Could you run this and see if it fixes the issue?
remove.packages("cortar")
devtools::install_github("kidsneuro-lab/RNA_splice_tool@unknown-strand-error")
cortar(
file = "./sample.tsv",
mode = "research",
assembly = "hg38",
annotation = "UCSC",
paired = TRUE,
stranded = 2,
output_dir = "./output",
genelist = c("TRDN")
)
Also going forward, just use UCSC
as the annotation as it is only used to determine which chromosome name style to use for investigating the bamfile (e.g. "chr1" [UCSC/NCBI] vs "1" [1000genomes]). The coordinates should be consistent across NCBI_GRCh38 and UCSC_GRCh38 alignments.
Thanks for your patience, let me know how you go :)
Cheers, Rhett
Hi Rhett,
Your solution worked! Thanks so much for your help, the outputs look great!
Cheers, Chiara
Hi Rhett,
I managed to successfully install cortar by first installing R version 4.1.3 compiled for Intel (rather than Arm64, which wouldn't work - I have Apple Silicon). Just for reference, I had to use Biomanager to install the following dependencies, prior to running your installation script:
devtools::install_github("kidsneuro-lab/RNA_splice_tool")
This was to prevent an error message suggesting the dependencies were not available for cortar.I am now having issues with running cortar.
I made a sample annotation file, which I have uploaded for reference.
I then ran the following:
Which returned the following output and error message:
If you could help me solve this issue that would be great!
Thanks, Chiara :) samples.txt