Open almita opened 2 months ago
Hi,
Would you be able to share the data that are causing the error? If you can share a sample that has a problem I will take a look and see if I can reproduce the issue.
Thanks!
I tried running it again on a smaller scale (the first 1,000 spacers and a database of the first 100 contigs) and there was no error. I was running it originally with 31,959 spacers and a database of 4040 contigs, so I'm not sure if the amount of data is related to the error?
Hmmm... It certainly could have something to do with the number of spacers and size of database. I tried testing on the largest database I could put together locally, which is 14,000 spacers and 1350 assemblies (each around 6Mbp). I ran with the same command you described, but didn't see any error. That returned 3,306,444 protospacers, so a pretty big dataset.
Without being able to reproduce the issue there's not much I can do. Perhaps you can try splitting your dataset into batches of spacers to see if you can identify a subset of spacers that cause the issue? If it is a spacer-related issue then I will be happy to look into it more.
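In case it helps with the batching, here is a minimal Python sketch for cutting a spacer FASTA file into fixed-size batches that can each be run through spacerblast separately. The function names, the batch size, and the output file naming are all hypothetical and not part of cctk. It assumes a plain FASTA file with ">" header lines.

```python
import sys
from typing import Iterable, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(lines: Iterable[str]) -> List[Record]:
    """Parse FASTA lines into (header, sequence) tuples."""
    records: List[Record] = []
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(seq)))
            header, seq = line[1:], []
        elif header is not None:
            seq.append(line)  # multi-line sequences are joined
    if header is not None:
        records.append((header, "".join(seq)))
    return records


def split_batches(records: List[Record], batch_size: int) -> List[List[Record]]:
    """Cut the record list into consecutive batches of at most batch_size."""
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]


def write_fasta(records: List[Record], path: str) -> None:
    """Write records back out, one sequence line per record."""
    with open(path, "w") as fh:
        for header, seq in records:
            fh.write(f">{header}\n{seq}\n")


if __name__ == "__main__":
    # Hypothetical usage: python split_spacers.py CRISPR_spacers.fna 2000
    in_path, batch_size = sys.argv[1], int(sys.argv[2])
    with open(in_path) as fh:
        spacers = read_fasta(fh)
    for n, batch in enumerate(split_batches(spacers, batch_size), start=1):
        write_fasta(batch, f"spacers_batch_{n}.fna")
```

Each batch file can then be passed to the -s option in place of the full spacer file. If one batch fails, running the script again on that batch with a smaller batch size narrows it down further.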
I split the spacers in 2 and the first half (15,979 spacers) ran fine, the second half (15,980 spacers) gave me an error. I split that second half into 2 and both halves (7,990 spacers each) gave me the error. I used the full database of 4040 assemblies for all runs.
Interesting! If you can split it down to a manageable size that you are able to share with me then I am happy to track down the issue. Would you be willing to share some data with me?
Would you mind splitting the assembly db into quarters (or further if you're willing to)? That should be more manageable. We can use this service to share the data (again, if you are willing): https://www.swisstransfer.com/en-us
Sure, I split it down to 1500 spacers and 1010 assemblies, is that manageable?
That should be good. Thank you!
Here are the files. It's the FASTA file with all the assemblies and another FASTA file for the spacers: https://www.swisstransfer.com/d/6dccf627-1701-4b6d-8e74-28a6cb300df3
Hi, I'm running cctk from this container using the 1.0.2--pyhdfd78af_0 version. I ran minced without issue and I'm trying to blast the spacer sequences against a set of contigs (I've already run the makeblastdb command on them successfully). However, when I run spacerblast in the container:
cctk spacerblast -d blastdb -s CRISPR_spacers.fna -o spacerblast.txt -t 32
I get the following error:
I tried specifying -p 90 because I thought maybe that was missing, but I get the same error.