dylanbeeber / crispRdesignR

Software used to design guide RNA sequences for CRISPR/Cas9 genome editing

Error in .replace_seqlevels_style #14

Open El-Castor opened 10 months ago

El-Castor commented 10 months ago

Hello,

First of all, thank you for your tool; it is very useful. I am trying to use it with a custom genome. I built the necessary BSgenome package and installed my genome successfully.

I also managed to install and launch the crispRdesignR Shiny GUI and to start the sgRNA design and off-target assessment. Unfortunately, at the end of the computation the app turns grey and hangs without producing the off-target output file.

The error shown in the terminal is below:

> crispRdesignRUI()
Loading required package: shiny

Listening on http://127.0.0.1:3591

Warning: Error in .replace_seqlevels_style: found no sequence renaming map compatible with seqname style "UCSC" for this object
  2: shiny::runApp
  1: crispRdesignRUI

Do you have any idea what I am doing wrong? Let me know if you need more specific information.

Thanks in advance for your help

vmurigneu commented 10 months ago

Hello,

I am having the same issue when trying to run crispRdesignR from the command line on a custom genome available on VectorBase. The BSgenome is installed successfully. The error message showed up at the "annotating off-targets" step. There is no error when the off-target annotation step is skipped using the parameter "annotateoffs = FALSE". Please see below the command line and the error message:

alldata <- sgRNA_design(testseq, usergenome, gtfname, calloffs = TRUE, annotateoffs = TRUE)
............................
Checking for Off-Targets in supercont2.307
Checking for Off-Targets in supercont2.308
Checking for Off-Targets in supercont2.309
Checking for Off-Targets in supercont2.310
Annotating off-targets
Error in .replace_seqlevels_style(x_seqlevels, value) : 
  found no sequence renaming map compatible with seqname style "UCSC" for this object

I tried to create the BSgenome package following the documentation, in two alternative ways:

forgeBSgenomeDataPkgFromNCBI(assembly_accession="GCA_000473445.2",pkg_maintainer=NA,organism="Anopheles farauti",destdir="BSgenomeForge/",circ_seqs=character(0))

The BSgenome sequence names produced by forgeBSgenomeDataPkgFromNCBI differ from the ones used in VectorBase, so I renamed the chromosome names in the GTF file to match the BSgenome sequence names.
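A quick way to confirm that the renaming described above worked is to list any GTF sequence names that are missing from the genome. The following is only a minimal base-R sketch: the helper `gtf_names_missing_from_genome`, the second GTF row, and the hard-coded `genome_names` vector are made up for illustration. In a real session the genome names would presumably come from the installed BSgenome object, for example via `GenomeInfoDb::seqnames(usergenome)`.

```r
# Hypothetical helper (not part of crispRdesignR): return the sequence names
# that appear in column 1 of a GTF file but not in the genome.
gtf_names_missing_from_genome <- function(gtf_path, genome_names) {
  # GTF files are tab-separated, have no column header row, and may contain
  # comment lines starting with "#".
  gtf <- read.delim(gtf_path, header = FALSE, comment.char = "#",
                    quote = "", stringsAsFactors = FALSE)
  setdiff(unique(gtf[[1]]), genome_names)
}

# Toy GTF written to a temporary file. The first row is modelled on the GTF
# excerpt in this thread; the second row is invented for the example.
gtf_lines <- c(
  "KI915047\tVEuPathDB\texon\t6416\t8953\t.\t-\t.\tgene_id \"AFAF011431\";",
  "KI915048\tVEuPathDB\texon\t100\t200\t.\t+\t.\tgene_id \"GENE2\";"
)
tmp <- tempfile(fileext = ".gtf")
writeLines(gtf_lines, tmp)

# Assumption: hard-coded stand-in for the sequence names of the BSgenome object.
genome_names <- c("KI915047", "supercont2.307")

missing_names <- gtf_names_missing_from_genome(tmp, genome_names)
print(missing_names)
```

An empty result would mean every sequence name in the GTF also exists in the genome. It says nothing about whether the names can be mapped to the UCSC style.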

Thank you for your help

El-Castor commented 10 months ago

Hello @vmurigneu,

In my case, the problem doesn't come from the custom BSgenome package created for this purpose. It comes directly from the GTF annotation file that I upload to crispRdesignR. When I checked the code of the function raising the error, I saw that it comes from this command:

GenomeInfoDb::seqlevelsStyle(gtf) <- "UCSC"

The problem is that the GTF is not in UCSC format, and I haven't managed to reformat it with the tools available so far. In my case the assembly is at contig level, so the sequence names don't follow the UCSC conventions.

Is the genome you are using available in a database, or did you create the BSgenome package directly from your own genome without going through a database?

If I have understood correctly, renaming the sequences did not fix your problem. Have you also checked that the sequence coordinates agree? They may follow different conventions and therefore differ by 1 base pair.

Could you please post the head of your GTF annotation file and of your chromosome size file, for example, so that I can see the sequence names in your genome?

Please let me know if you manage to fix this issue or have any suggestions.

Thank you both
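One possible direction, sketched here as an assumption and not as the maintainer's fix, is to attempt the style conversion and fall back to the original sequence names when GenomeInfoDb has no renaming map. The wrapper name `set_ucsc_style_if_possible` and the toy `GRanges` object are hypothetical. Whether the rest of the annotation code in crispRdesignR works with non-UCSC names is not established in this thread.

```r
library(GenomicRanges)  # also attaches IRanges and GenomeInfoDb

# Toy annotation object using the contig-level name from this thread.
gtf <- GRanges(
  seqnames = c("KI915047", "KI915047"),
  ranges   = IRanges(start = c(6416, 9122), end = c(8953, 9262)),
  strand   = "-"
)

# Hypothetical wrapper: try to switch to UCSC-style names, and keep the
# object unchanged if the assignment raises an error.
set_ucsc_style_if_possible <- function(gr) {
  tryCatch({
    GenomeInfoDb::seqlevelsStyle(gr) <- "UCSC"
    gr
  }, error = function(e) {
    message("Keeping original sequence names: ", conditionMessage(e))
    gr
  })
}

gtf <- set_ucsc_style_if_possible(gtf)
print(seqlevels(gtf))
```

With this approach the GTF and the BSgenome package would still need to use the same sequence names, because no renaming happens when the conversion is skipped.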

vmurigneu commented 9 months ago

Hi @El-Castor

Yes I agree with you, the error comes from this line of code. I did create the BSgenome package (using two different methods, see my previous post) as the species was not already available.

The gtf looks like this:

KI915047    VEuPathDB   transcript  6416    30695   .   -   .   transcript_id "AFAF011431-RA"; gene_id "AFAF011431"
KI915047    VEuPathDB   exon    6416    8953    .   -   .   transcript_id "AFAF011431-RA"; gene_id "AFAF011431";
KI915047    VEuPathDB   exon    9122    9262    .   -   .   transcript_id "AFAF011431-RA"; gene_id "AFAF011431";
KI915047    VEuPathDB   exon    9576    9730    .   -   .   transcript_id "AFAF011431-RA"; gene_id "AFAF011431";
KI915047    VEuPathDB   exon    9811    9973    .   -   .   transcript_id "AFAF011431-RA"; gene_id "AFAF011431";
KI915047    VEuPathDB   exon    11716   11907   .   -   .   transcript_id "AFAF011431-RA"; gene_id "AFAF011431";
KI915047    VEuPathDB   exon    14033   14169   .   -   .   transcript_id "AFAF011431-RA"; gene_id "AFAF011431";
KI915047    VEuPathDB   exon    18283   18330   .   -   .   transcript_id "AFAF011431-RA"; gene_id "AFAF011431";
KI915047    VEuPathDB   exon    26202   26576   .   -   .   transcript_id "AFAF011431-RA"; gene_id "AFAF011431";
KI915047    VEuPathDB   exon    30609   30695   .   -   .   transcript_id "AFAF011431-RA"; gene_id "AFAF011431";

It was converted from GFF. Does your GTF file contain a header? The header was removed during the GFF-to-GTF conversion.

Thank you

maltesemike commented 7 months ago

I am having the exact same problem with my custom genome. I built the BSgenome successfully and get the same error when I look for off-targets. It's rather pointless without the ability to use this option.

Has there been a solution to the problem? Thanks!

vmurigneu commented 7 months ago

Hi @maltesemike

No, I suspect solving this will require modifying the R code. My understanding from reading a bit about the GenomeInfoDb package is that custom genomes are not supported when setting the seqlevelsStyle to UCSC.

maltesemike commented 7 months ago

Thanks @vmurigneu

This is a real shame; it makes the tool completely unusable for non-model genomes, which is what it was purported to support.

Any chance of a quick fix, @dylanbeeber?