Closed (jacordova closed 1 year ago)
This sounds more like a bug in the summarise_annotations.py script, which we should fix. I will try to get round to that in the coming weeks.
Thanks, I appreciate it!
@jacordova I am just looking into this and I'm not seeing why either of these two inputs would trigger this error. Could you please attach the offending file here so I can try to reproduce the error?
@johnlees Sure, the files are attached below.
This file is the input to annotate_hits_pyseer
significant_anaerobic_kmers_1E-05.txt
This file is the input to summarise_annotations.py
sig_kmer_anaerobic_annotation_1E-05.txt
The line this is failing on is:
TTTTCCCTCGATCTCTCCCTGGTCAGCCAGCAGTTTCTCTCCCTTGAATTTGCGCAGGTGCTGGACAAAATGGCCTACCTGACGATATGGCAGGGCGATG\t4.90E-01\t5.05E-02\t4.37E-06\t2.72E-03\t5.59E-04\t4.45E-01\tDEC10Auw,DEC10Buw,DEC10Cuw,DEC10Duw,DEC10Euw,DEC10Fuw,DEC11Auw,DEC11Buw,DEC11Cuw,DEC11Duw,DEC11Euw,DEC12Auw,DEC12Buw,DEC12Cuw,DEC12Duw,DEC12Euw,DEC13Auw,DEC13Buw,DEC13Cuw,DEC13Duw,DEC13Euw,DEC14Auw,DEC14Cuw,DEC14Duw,DEC14Euw,DEC15Auw,DEC15Buw,DEC15Cuw,DEC15Duw,DEC15Euw,DEC6Cuw,DEC6Duw,DEC6Euw,DEC7Auw,DEC7Buw,DEC7Cuw,DEC7Duw,DEC7Euw,DEC8Auw,DEC8Buw,DEC8Cuw,DEC8Duw,DEC8Euw,DEC9Auw,DEC9Buw,DEC9Cuw,DEC9Duw,DEC9Euw\tCFT073uw,DEC14Buw,DEC1Auw,DEC1Buw,DEC1Cuw,DEC1Duw,DEC1Euw,DEC2Auw,DEC2Buw,DEC2Cuw,DEC2Duw,DEC2Euw,DEC3Auw,DEC3Buw,DEC3Cuw,DEC3Duw,DEC3ELuw,DEC3ESuw,DEC3Fuw,DEC4Auw,DEC4Buw,DEC4Cuw,DEC4Duw,DEC4Euw,DEC4Fuw,DEC5Auw,DEC5Buw,DEC5Cuw,DEC5Duw,DEC6Auw,DEC6Buw,EDL933uw,MG1655uw,O157H7_82uw,O157H7_83uw,O157H7_84uw,O157H7_85uw,O157H7_86uw,O157H7_87uw,O157H7_88uw,O157H7_89uw,O157H7_90uw,O157H7_91uw,O157H7_92uw,O157H7_93uw,O157H7_94uw,O157H7_95uw,SAKAIuw,UTI89uw,W3110uw\tEco_Sakai_Chromosome;:675190-675289;;;
The problem is the chromosome name Eco_Sakai_Chromosome; which ends in a semi-colon. Is there a semi-colon where these chromosome names are defined, e.g. in the fasta header?
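A quick way to check for this is to scan the reference fasta headers for semi-colons. The following is only a minimal sketch: the function name `headers_with_semicolons` and the example header are hypothetical, and it assumes the sequence name is the first whitespace-delimited word after `>`.

```python
import io


def headers_with_semicolons(fasta_lines):
    """Return sequence names (first word of each '>' header) containing ';'."""
    bad = []
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            # Assumption: the sequence name is the first word of the header
            name = fields[0] if fields else ""
            if ";" in name:
                bad.append(name)
    return bad


# Hypothetical headers for illustration; the real fasta file is not shown in this thread
example = io.StringIO(
    ">Eco_Sakai_Chromosome; example description\nACGT\n"
    ">CFT073_Chromosome\nACGT\n"
)
print(headers_with_semicolons(example))  # ['Eco_Sakai_Chromosome;']
```

To check a real file, pass an open file handle instead of the `io.StringIO` example. Renaming any reported sequences so they no longer contain a semi-colon, then re-running the annotation step, should avoid the unpacking error.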
Agh looks like that was it! It runs great now. Thank you
Hello,
Thanks for this great piece of software!
I've run into an issue where the annotate_hits_pyseer script adds multiple annotations to some kmers. From a quick glance, it seems that the multiple annotations are from the same reference. See examples below:
TTTTCTTTTATCA [...] CFT073_Chromosome:2459877-2459889;ABR-0078744;;ABR-0078745,CFT073_Chromosome:1488731-1488743;ABR-0077763;;ABR-0077765
TTACTCAATAAT [...] CFT073_Chromosome:2886042-2886053;ABR-0079164;ABR-0079164;ABR-0079164,CFT073_Chromosome:4541738-4541749;ABR-0080970;ABR-0080970;ABR-0080970,CFT073_Chromosome:4234290-4234301;ABR-0080626;ABR-0080626;ABR-0080626
As a result, this causes the summarise_annotations.py script to error out:
File "/jcordova2/Pyseer/pyseer-master/scripts/summarise_annotations.py", line 70, in <module>
    (position, down, inside, up) = annotation.split(";")
ValueError: too many values to unpack (expected 4)
After I remove the additional annotations manually, the summarise_annotations.py script is able to proceed. Do you have any recommendations on limiting to a single annotation? Or potentially forcing the multiple annotations to move to a new line?
Thanks!
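For anyone hitting the same ValueError, here is a rough sketch of a more tolerant way to parse the annotation field. It is not the code in summarise_annotations.py; `parse_annotations` is a hypothetical helper. It assumes, based on the examples in this thread, that multiple annotations are comma-separated and that each one has four semi-colon-separated parts (position, down, inside, up). Splitting from the right keeps any extra semi-colon inside the position part instead of failing.

```python
def parse_annotations(field):
    """Split an annotation field into (position, down, inside, up) tuples.

    Assumes annotations are separated by ',' and each has four parts
    separated by ';'. A stray ';' in the sequence name is tolerated.
    """
    parsed = []
    for annotation in field.split(","):
        # rsplit from the right: at most 3 splits, so any additional ';'
        # (e.g. in the chromosome name) stays in the position part
        position, down, inside, up = annotation.rsplit(";", 3)
        parsed.append((position, down, inside, up))
    return parsed


# Annotation from the failing line: the chromosome name ends in ';'
print(parse_annotations("Eco_Sakai_Chromosome;:675190-675289;;;"))
# [('Eco_Sakai_Chromosome;:675190-675289', '', '', '')]

# A kmer with two comma-separated annotations from the examples above
hits = parse_annotations(
    "CFT073_Chromosome:2459877-2459889;ABR-0078744;;ABR-0078745,"
    "CFT073_Chromosome:1488731-1488743;ABR-0077763;;ABR-0077765"
)
print(len(hits))  # 2
```

This still raises a ValueError when an annotation has fewer than three semi-colons, and it would split incorrectly if a gene name itself contained a comma, so removing the semi-colon from the sequence name at the source remains the cleaner fix.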