crisprVerse / crisprDesign

Comprehensive design of CRISPR gRNAs for nucleases and base editors
MIT License

addSpacerAlignments Error in .new_IRanges_from_start_width(start, width) : 'start' or 'width' cannot contain NAs #10

Closed MatthewPace98 closed 1 year ago

MatthewPace98 commented 1 year ago

From this call:

addSpacerAlignments(guideSet,
                    aligner="bowtie",
                    aligner_index=canFam3,
                    bsgenome=BSgenome.Cfamiliaris.UCSC.canFam3,
                    n_mismatches=3,
                    txObject=txdb_canine)

I get the following output, where IRanges complains:

[runCrisprBowtie] Using BSgenome.Cfamiliaris.UCSC.canFam3
[runCrisprBowtie] Searching for SpCas9 protospacers
# reads processed: 10
# reads with at least one alignment: 10 (100.00%)
# reads that failed to align: 0 (0.00%)
Reported 1808 alignments

Error in .new_IRanges_from_start_width(start, width) : 'start' or 'width' cannot contain NAs

The error is not terribly informative. Could somebody kindly help troubleshoot this? The same call works with the human data in the tutorial.

Jfortin1 commented 1 year ago

Hi @MatthewPace98, thanks for reporting the error, I'll help.

Could you please provide

MatthewPace98 commented 1 year ago

I downloaded the canFam3.fa.gz reference genome from here and generated txdb_canine using this GTF file. The code used is as follows:

txdb_canine <- getTxDb(file = 'canFam3.ncbiRefSeq.gtf')
gr <- queryTxObject(txObject=txdb_canine,
                    featureType="cds",
                    queryColumn="gene_id",
                    queryValue="PARD6G")
guideSet <- findSpacers(gr,
                        bsgenome=bsgenome,
                        crisprNuclease=SpCas9)
guideSet <- addSequenceFeatures(guideSet)

Appreciate your help @Jfortin1.

Jfortin1 commented 1 year ago

Hi @MatthewPace98, thanks again for sharing your reproducible example. Tagging @ltHobbes who helped fix this.

There were a couple of bugs:

Let us know if you can run your code with crisprDesign v.1.1.17 and crisprBowtie v.1.3.3.
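Before re-running the pipeline, it may help to confirm that the installed packages are at least the versions named above. The following is a minimal sketch using only base R; `meets_minimum` is a hypothetical helper written for illustration and is not part of crisprDesign or crisprBowtie.

```r
# Minimum versions named in this thread.
required <- c(crisprDesign = "1.1.17", crisprBowtie = "1.3.3")

# Hypothetical helper (not part of crisprDesign or crisprBowtie).
# Returns TRUE if the installed version is at least `min_version`,
# FALSE if it is older, and NA if the package is not installed.
meets_minimum <- function(pkg, min_version) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(NA)
  }
  utils::packageVersion(pkg) >= package_version(min_version)
}

# Check each package; the result is a named logical vector.
status <- vapply(names(required),
                 function(pkg) meets_minimum(pkg, required[[pkg]]),
                 logical(1))
print(status)
```

A `FALSE` or `NA` entry would suggest that the corresponding package still needs to be updated or installed before the fix can take effect.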

MatthewPace98 commented 1 year ago

Thank you @ltHobbes and @Jfortin1 for this; it seems to have done the trick for the canine data. However, it now breaks with the human data used in the tutorial:

data(SpCas9, package="crisprBase")
data(txdb_human, package="crisprDesignData")

# Query GRanges object to extract the exons of the required gene
gr <- queryTxObject(txObject=txdb_human,
                    featureType="cds",
                    queryColumn="gene_symbol",
                    queryValue="KRAS")

guideSet <- addSpacerAlignments(guideSet,
                                aligner="bowtie",
                                aligner_index=bowtie_index,
                                bsgenome=bsgenome,
                                n_mismatches=2,
                                txObject=txdb_human)

Outputs:

[runCrisprBowtie] Using BSgenome.Hsapiens.UCSC.hg38
[runCrisprBowtie] Searching for SpCas9 protospacers
# reads processed: 108
# reads with at least one alignment: 108 (100.00%)
# reads that failed to align: 0 (0.00%)
Reported 3076 alignments
Error: subscript is a logical vector with out-of-bounds TRUE values

I get a similar result if I generate the txdb object myself from the GTF file.

Jfortin1 commented 1 year ago

@MatthewPace98 Thanks again for reporting this -- the bug was introduced by an unrelated change over the weekend (changing the default argument standard_chr_only to FALSE). Just pushed a fix to v.1.1.21.

MatthewPace98 commented 1 year ago

Yup, both human and canine pipelines are functional. Thanks for your help!