vincentdebakker / CRISPRi-seq

Design and efficiency evaluation of genome-wide sgRNA libraries for CRISPRi-seq
GNU General Public License v3.0

Error: subscript contains NAs #3

Closed taosheng19 closed 2 years ago

taosheng19 commented 2 years ago

Hi Vincent, when I run the R script sgRNA_library_design_cmd.R, I always get the following error:

....
2022-06-25 23:55:31: Scoring sgRNA - binding site combinations...
2022-06-25 23:55:32: Computing scores for all candidate sgRNAs...
Error: subscript contains NAs
Execution halted

It seems something goes wrong in the "Computing scores for all candidate sgRNAs..." step. Is there a solution? Many thanks.

vincentdebakker commented 2 years ago

Hi,

What are the exact input parameters that you used?

Vincent

taosheng19 commented 2 years ago

Rscript ~/opts/crispri-seq/sgRNA_library_design_cmd.R -g GCF_002076835.1 -d ./ncbi_genomes/ -o ~/opts/crispri-seq/test/ -t ~/opts/crispri-seq/ -P AGAAG --output_full_list --output_target_fasta --keep_TINDRi_matches --output_all_candidates

And the output:

2022-06-28 00:08:55: design pipeline started
Using refseq data base. (Please use GCA accession for genbank and GCF for refseq.)
2022-06-28 00:08:55: Loading packages (and installing if needed)...
2022-06-28 00:09:13: Loading genome, features and sequences...
2022-06-28 00:09:19: Identifying all candidate sgRNAs...
2022-06-28 00:09:26: Identifying all sgRNA binding sites...
2022-06-28 00:09:27: Matching sgRNA candidates to binding sites...
Starting at: Tue Jun 28 00:09:27 2022
Alignment matrix generation ended on: Tue Jun 28 00:09:35 2022
Ended on: Tue Jun 28 00:09:35 2022
2022-06-28 00:09:35: Scoring sgRNA - binding site combinations...
2022-06-28 00:09:37: Computing scores for all candidate sgRNAs...
Error: subscript contains NAs
Execution halted

vincentdebakker commented 2 years ago

Hi,

I managed to reproduce the error and patched it. Could you confirm it works correctly now?

(Unrelated: I recommend using a local GenBank file (.gb, .gbk, .gbf or .gbff), so that you do not depend on downloading through the pipeline. You can find yours here, for instance: https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/002/076/835/GCF_002076835.1_ASM207683v1/)
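For readers following the local-file recommendation, a minimal sketch of fetching the GenBank flat file from the directory linked above might look like this. The file name is an assumption based on NCBI's usual `<directory name>_genomic.gbff.gz` convention, and the `./genbank` folder, the `DOWNLOAD` switch and the `gbff_url` helper are hypothetical names chosen for illustration. How the local file is then passed to the pipeline is not shown in this thread, so check the script's own help for that.

```shell
#!/bin/sh
# Sketch: fetch the GenBank flat file for the assembly discussed in this thread.
set -e

# Directory URL taken from the comment above.
BASE_URL="https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/002/076/835/GCF_002076835.1_ASM207683v1"
# Assumption: NCBI's usual naming, <directory name>_genomic.gbff.gz
FILE_NAME="$(basename "$BASE_URL")_genomic.gbff.gz"
# 0 = only print the URL (default), 1 = download and unpack (hypothetical switch)
DOWNLOAD="${DOWNLOAD:-0}"

# Print the full URL of the compressed GenBank file (hypothetical helper).
gbff_url() {
  echo "$BASE_URL/$FILE_NAME"
}

if [ "$DOWNLOAD" = "1" ]; then
  mkdir -p ./genbank
  curl -fL -o "./genbank/$FILE_NAME" "$(gbff_url)"
  gunzip -f "./genbank/$FILE_NAME"   # leaves ./genbank/..._genomic.gbff
else
  gbff_url
fi
```

By default the script only prints the URL; running it with `DOWNLOAD=1` performs the download and decompression.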

taosheng19 commented 2 years ago

Now it works fine. Thank you so much for the quick patch. This tool is absolutely fantastic.

taosheng19 commented 2 years ago

Hi Vincent,

Is it possible for the R script to accept multiple PAMs as input? Thanks

vincentdebakker commented 2 years ago

Hey,

Great to hear it works and that you're happy using it. The pipeline cannot currently handle multiple PAMs as input.

I'll close this issue now. Good luck! Cheers
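Since the pipeline takes a single PAM per run, one possible workaround is to invoke the script once per PAM and write each run to its own output folder. The sketch below is an assumption, not something the author confirms: it reuses only the flags from the command posted earlier in this thread (`-g`, `-d`, `-o`, `-t`, `-P`), while the `design_<PAM>` folder names, the `DRY_RUN` switch, the helper functions and the second PAM are hypothetical. Each run is scored independently, so the resulting libraries would still need to be compared or merged by hand.

```shell
#!/bin/sh
# Sketch: run the design script once per PAM, each into its own output folder.
# Flags (-g, -d, -o, -t, -P) are copied from the command posted in this thread.
set -e

SCRIPT="${SCRIPT:-$HOME/opts/crispri-seq/sgRNA_library_design_cmd.R}"
TOOL_DIR="$(dirname "$SCRIPT")"
GENOME="GCF_002076835.1"
# 1 = only print the commands (default), 0 = execute them (hypothetical switch)
DRY_RUN="${DRY_RUN:-1}"

# Print or run the pipeline for a single PAM (hypothetical helper).
run_one() {
  pam="$1"
  out="./design_${pam}/"
  if [ "$DRY_RUN" = "1" ]; then
    echo "Rscript $SCRIPT -g $GENOME -d ./ncbi_genomes/ -o $out -t $TOOL_DIR/ -P $pam"
  else
    mkdir -p "$out"
    Rscript "$SCRIPT" -g "$GENOME" -d ./ncbi_genomes/ -o "$out" -t "$TOOL_DIR/" -P "$pam"
  fi
}

# Loop over every PAM given as an argument (hypothetical helper).
run_all() {
  for p in "$@"; do
    run_one "$p"
  done
}

# AGAAG is the PAM from this thread; AGAAA is only a placeholder second PAM.
run_all AGAAG AGAAA
```

With the default `DRY_RUN=1` the script only prints the two command lines, which makes it easy to check them before setting `DRY_RUN=0`.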