moineaulab / CRISPRStudio

GNU General Public License v3.0

Error message unclear #10

Open jzrapp opened 1 year ago

jzrapp commented 1 year ago

Hi @moineaulab and @plpla

I'm running CRISPRStudio on a large dataset and received this error after the spacermatch.mcl file had been created:

Clustering the spacers
Traceback (most recent call last):
  File "/home/jorap2/.conda/envs/crisprStudio/bin/CRISPR_Studio", line 855, in <module>
    main()
  File "/home/jorap2/.conda/envs/crisprStudio/bin/CRISPR_Studio", line 808, in main
    clDict = attributeClsColor(outFasta+'_fasta36.spacermatch.mcl', spacerDict)
  File "/home/jorap2/.conda/envs/crisprStudio/bin/CRISPR_Studio", line 248, in attributeClsColor
    spacerDict[it].extend([str(colBk), str(colFr), clName])
KeyError: 'SI3U_Ga0307490_1000089_Altererythrobacter||CRISPR1_SPACER1_1'

Could you help me understand the issue? Does it have to do with the long headers that I gave the sequences?

Thanks a lot! Josephine

plpla commented 1 year ago

Hello @jzrapp, I have not looked at this code in a while. Your header is quite long. Can you try making it shorter and maybe removing the "||"? If that does not solve the issue, let me know and I'll investigate! :)
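
If shortening the headers is worth a try, a minimal sketch could look like the following. The function name `shorten_fasta_headers` and the `seq` prefix are made up for illustration and are not part of CRISPRStudio; the idea is only to swap each header for a short ID and keep a mapping so the original names can be restored afterwards.

```python
def shorten_fasta_headers(lines, prefix="seq"):
    """Replace each FASTA header with a short ID.

    Returns the rewritten lines and a dict mapping short ID -> original header.
    Hypothetical helper, not part of CRISPRStudio.
    """
    mapping = {}
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Number headers in order of appearance: seq1, seq2, ...
            short_id = f"{prefix}{len(mapping) + 1}"
            mapping[short_id] = line[1:]
            out.append(">" + short_id)
        else:
            # Sequence lines are passed through unchanged.
            out.append(line)
    return out, mapping


# Example with the header from this issue and a placeholder sequence.
example = [">SI3U_Ga0307490_1000089_Altererythrobacter", "ACGTACGT"]
renamed, id_map = shorten_fasta_headers(example)
```

Keeping `id_map` around makes it possible to translate the short IDs back to the original names once the run has finished.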

jzrapp commented 1 year ago

Hi @plpla, the "||" is being introduced by your software. The original headers look like this: "SI3U_Ga0307490_1000089_Altererythrobacter".
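
One way to narrow this down might be to list which IDs in the clustering output have no matching entry in the spacer dictionary, since a `KeyError` just means the looked-up key is absent. This is a rough sketch, not CRISPRStudio code: it assumes the .mcl file holds one cluster per line with whitespace-separated labels, and `find_missing_ids` and the example IDs are hypothetical.

```python
def find_missing_ids(cluster_lines, known_ids):
    """Return every label in the cluster lines that is not in known_ids.

    Assumption: one cluster per line, labels separated by whitespace.
    Hypothetical helper, not part of CRISPRStudio.
    """
    missing = []
    for line in cluster_lines:
        for label in line.split():
            if label not in known_ids:
                missing.append(label)
    return missing


# Invented IDs for illustration only.
known = {"sampleA||CRISPR1_SPACER1"}
clusters = ["sampleA||CRISPR1_SPACER1\tsampleA||CRISPR1_SPACER1_1"]
missing = find_missing_ids(clusters, known)
```

Comparing the labels that come back against the dictionary keys should show whether the two sides name the same spacer differently.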