Benjamin-Lee / CodonAdaptationIndex

Python Implementation of Codon Adaptation Index
https://cai.readthedocs.io
MIT License

Bug with sequence length #7

Open celiosantosjr opened 5 years ago

celiosantosjr commented 5 years ago

I am testing the CAI package following your instructions, and I hit a bug when processing short sequences. My sequence is:

>testseq
ATGAAATTAATATTGAAACTCGTGGAACGGAAAAAACTGATCAAGGAGTTAAAAGAAGATATTGAAGTAATTTAA

Then, when I execute the program:

>>> from Bio import SeqIO
>>> from CAI import CAI
>>> reference = [seq.seq for seq in SeqIO.parse("../database/annotation/1036673.PRJNA67335/1036673.PRJNA67335.ffn", "fasta")]
>>> CAI(sequence, reference=reference)

I receive this message:

  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/dist-packages/CAI/CAI.py", line 220, in CAI
    weights = relative_adaptiveness(sequences=reference, genetic_code=genetic_code)
  File "/usr/local/lib/python3.7/dist-packages/CAI/CAI.py", line 149, in relative_adaptiveness
    RSCUs = RSCU(sequences, genetic_code=genetic_code)
  File "/usr/local/lib/python3.7/dist-packages/CAI/CAI.py", line 75, in RSCU
    raise ValueError("Input sequence not divisible by three")
ValueError: Input sequence not divisible by three

The problem is that this is a coding sequence we have worked with for a long time, and it divides evenly into codons (it even includes the start and stop codons). So I do not know what the package is misinterpreting. I look forward to hearing from you.

HarryMWinters commented 4 years ago

Have you tried stripping whitespace from the sequence, in case some is getting in there?

reference = [seq.seq for seq in SeqIO.parse("../database/annotation/1036673.PRJNA67335/1036673.PRJNA67335.ffn", "fasta")]
CAI(sequence.strip(), reference=reference)
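
It may also be worth checking the reference set itself. In the traceback above, the ValueError is raised inside relative_adaptiveness, which is called on reference, so the length check could be failing on one of the reference records rather than on the query. Below is a minimal sketch for finding such records. Note that split_by_frame is a hypothetical helper and not part of the CAI package, and toy_reference is made-up data standing in for the records parsed from the .ffn file. It assumes each record converts to its bases with str().

```python
def split_by_frame(sequences):
    """Separate sequences whose length is a multiple of three from the rest.

    Hypothetical helper, not part of the CAI package.
    """
    in_frame, out_of_frame = [], []
    for index, seq in enumerate(sequences):
        text = str(seq).strip()  # drop stray whitespace before measuring
        if len(text) % 3 == 0:
            in_frame.append(text)
        else:
            # remember where the offending record is and how long it is
            out_of_frame.append((index, len(text)))
    return in_frame, out_of_frame


# The query sequence from the report above.
query = "ATGAAATTAATATTGAAACTCGTGGAACGGAAAAAACTGATCAAGGAGTTAAAAGAAGATATTGAAGTAATTTAA"

# Made-up stand-in for the reference list; the last entry is one base short.
toy_reference = [query, "ATGAAATAA", "ATGAAATA"]

in_frame, out_of_frame = split_by_frame(toy_reference)
print("query length % 3:", len(query) % 3)
print("out-of-frame records (index, length):", out_of_frame)
```

If out_of_frame is non-empty for the real reference list, passing in_frame as reference instead would be one way to see whether the error goes away. Whether dropping those records is acceptable depends on the annotation.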