fsnibs10 closed 3 months ago
Hi all,
With the help of my colleague, this bug is now fixed. When constructing the TxDb from the GFF file, I forgot to pass the chrominfo parameter, which supplies the chromosome lengths. These lengths are needed for genes on the minus strand. All in all, it is a powerful tool.
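For anyone hitting the same problem, here is a minimal sketch of building such a chrominfo table in base R and sanity-checking it before it is passed to getTxDb(). The chromosome names and lengths are the ones from the script in this thread. The helper vector `seq_lengths` is only an illustrative name, and the checks are an assumption about what is worth verifying, not something the package requires.

```r
# Named vector of chromosome lengths (values taken from the script in this thread).
# With a BSgenome object at hand, GenomeInfoDb::seqlengths(mybsgenome) could
# supply the same information instead of typing it by hand.
seq_lengths <- c(chrI = 5623850,
                 chrII = 4644548,
                 chrIII = 2508823,
                 chrrDNA_distal_contig1 = 16591,
                 chrrDNA_distal_contig2 = 8860,
                 chrMT = 19433)

# One row per chromosome; only the mitochondrial genome is marked as circular.
chrominfo <- data.frame(chrom = names(seq_lengths),
                        length = unname(seq_lengths),
                        is_circular = names(seq_lengths) == "chrMT",
                        stringsAsFactors = FALSE)

# Basic checks: no missing or non-positive lengths, no duplicated names.
stopifnot(!anyNA(chrominfo$length),
          all(chrominfo$length > 0),
          !anyDuplicated(chrominfo$chrom))

print(chrominfo)
```

The resulting data frame has the same three columns (chrom, length, is_circular) as the one passed to getTxDb() in the script.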
Hi all,
I also want to show my script for others to use.
> library(BSgenome.Spombe.DY47073.4CrisprVerse)
> mybsgenome <- BSgenome.Spombe.DY47073.4CrisprVerse
> data(SpCas9, package="crisprBase")
> gfffile <- "./DY47073.gff3"
> chrominfo <- data.frame(chrom=c("chrI","chrII","chrIII","chrrDNA_distal_contig1","chrrDNA_distal_contig2","chrMT"),
+ length=c(5623850,4644548,2508823,16591,8860,19433),
+ is_circular=c(FALSE,FALSE,FALSE,FALSE,FALSE,TRUE))
> pombe_txdb <- getTxDb(file=gfffile,organism=NA,chrominfo=chrominfo)
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
>
> pombe_grList <- TxDb2GRangesList(pombe_txdb)
'select()' returned 1:1 mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
> GenomeInfoDb::genome(pombe_grList) <- "Spombe"
> GenomeInfoDb::seqlengths(pombe_grList)
                  chrI                  chrII                 chrIII
               5623850                4644548                2508823
chrrDNA_distal_contig1 chrrDNA_distal_contig2                   chrM
                 16591                   8860                  19433
###### test gene on the minus strand
> pombe_gene_gr <- queryTxObject(txObject=pombe_txdb,
+ featureType="transcript",
+ queryColumn="gene_id",
+ queryValue="SPCC1322.14c")
'select()' returned 1:1 mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
> pombe_gene_guideSet <- findSpacers(pombe_gene_gr,
+ bsgenome=mybsgenome,
+ crisprNuclease=SpCas9)
Warning:
In .merge_two_Seqinfo_objects(x, y) :
Each of the 2 combined objects has sequence levels not in the other:
- in 'x': chrMT
- in 'y': chrM
Make sure to always combine/compare objects based on the same reference
genome (use suppressWarnings() to suppress this warning).
> pombe_gene_guideSet <- addGeneAnnotation(pombe_gene_guideSet,
+ txObject=pombe_txdb)
'select()' returned 1:1 mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
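The warning from findSpacers() above reports that one object names the mitochondrial sequence chrMT and the other names it chrM. A minimal base R sketch of how such a mismatch could be spotted up front is shown below. The two character vectors are assumptions that stand in for the sequence names of the two objects (with real objects they might come from GenomeInfoDb::seqlevels()), and `levels_x` / `levels_y` are illustrative names only.

```r
# Stand-ins for the sequence names of the two objects being combined.
# levels_x mirrors the chrominfo table; levels_y mirrors the names in the
# seqlengths() output above. Both are assumptions for illustration.
levels_x <- c("chrI", "chrII", "chrIII",
              "chrrDNA_distal_contig1", "chrrDNA_distal_contig2", "chrMT")
levels_y <- c("chrI", "chrII", "chrIII",
              "chrrDNA_distal_contig1", "chrrDNA_distal_contig2", "chrM")

# Names present in one set but not in the other.
only_in_x <- setdiff(levels_x, levels_y)
only_in_y <- setdiff(levels_y, levels_x)

if (length(only_in_x) > 0 || length(only_in_y) > 0) {
  message("Sequence names differ: only in x = ",
          paste(only_in_x, collapse = ", "),
          "; only in y = ",
          paste(only_in_y, collapse = ", "))
}
```

If the two sets differ only in the mitochondrial name, using the same name in both the chrominfo table and the genome package would presumably avoid the warning.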
Hi @fsnibs10, glad you could figure it out!
Hi authors,
The organism I study is Schizosaccharomyces pombe. The latest GFF annotation file comes from the PomBase website, not from Ensembl, so I constructed the TxDb object from the GFF file. The BSgenome was also built by me with the BSgenome package. I want to find spacer sequences targeting a specific gene, following the Design_CRISPRko_Cas9 website.
It is strange that the program works very well for all genes on the plus strand, but throws an error for genes on the minus strand ('start' or 'end' cannot contain NAs).
The detailed procedure is shown below.
Below are excerpts for two genes from the GFF file.
I don't know how to debug this. I am looking forward to your reply.