moineaulab / CRISPRStudio

GNU General Public License v3.0

Contig name length limit #7

Open sabrinadiemert opened 5 years ago

sabrinadiemert commented 5 years ago

Aloha @moineaulab,

Nifty tool, thanks! As an FYI, I found a minor issue when running SPAdes-generated draft assemblies through CRISPRDetect and then feeding the resulting .gff output into CRISPR_Studio (installed via conda):

Aligning the spacers with fasta36 aligner
Clustering the spacers
Traceback (most recent call last):
  File "/home/sabrina/.conda/envs/bioinfo/bin/CRISPR_Studio", line 855, in <module>
    main()
  File "/home/sabrina/.conda/envs/bioinfo/bin/CRISPR_Studio", line 808, in main
    clDict = attributeClsColor(outFasta+'_fasta36.spacermatch.mcl', spacerDict)
  File "/home/sabrina/.conda/envs/bioinfo/bin/CRISPR_Studio", line 248, in attributeClsColor
    spacerDict[it].extend([str(colBk), str(colFr), clName])
KeyError: 'NODE_19_length_88379_cov_18.297296||CRISPR1_SPACER1_5446_547'

Shortening the contig names (e.g. to NODE_19) prevents this dictionary lookup failure. I didn't read the code closely, but the key in the KeyError shows that the spacer ID (CRISPR1_SPACER1_5446_547) is missing the last digit of its coordinate range.
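
In case it helps anyone hitting the same error, here is a minimal sketch of the renaming workaround. It assumes the contig names follow the SPAdes pattern seen in the traceback (NODE_<n>_length_<len>_cov_<cov>) and that it is run on the assembly FASTA before CRISPRDetect, so the .gff inherits the short names. The function names are made up for illustration and are not part of CRISPR_Studio.

```python
import re

# Matches SPAdes-style contig names such as NODE_19_length_88379_cov_18.297296
# and captures only the NODE_<n> prefix. The pattern is an assumption based on
# the name shown in the KeyError above.
SPADES_NAME = re.compile(r"(NODE_\d+)_length_\d+_cov_\d+(?:\.\d+)?")


def shorten_contig_names(text):
    """Replace every SPAdes-style contig name in `text` with its NODE_<n> prefix."""
    return SPADES_NAME.sub(r"\1", text)


def shorten_file(src_path, dst_path):
    """Copy a text file (FASTA or GFF), shortening contig names line by line."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(shorten_contig_names(line))
```

For example, shorten_file("contigs.fasta", "contigs.short.fasta") would turn the header >NODE_19_length_88379_cov_18.297296 into >NODE_19 and leave sequence lines untouched. The file names here are placeholders.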