JoseBlanca / seq_crumbs

Little sequence file utilities meant to work within Unix pipelines

Problems with command #5

Open andrea-campisano opened 10 years ago

andrea-campisano commented 10 years ago

First things first: I am an absolute Unix noob. Second: I am on Biolinux 7.

IEEGDJQ01.fasta was generated by sff_extract:

andrea@andrea-laptop[August 2013] ls [11:51AM]
IEEGDJQ01.sff_SCAPHOIDEUS IEEGDJQ03.sff legenda.txt IEEGDJQ02.sff IEEGDJQ04.sff sff_extract_0_3_0
andrea@andrea-laptop[August 2013] cd IEEGDJQ01.sff_SCAPHOIDEUS [12:15PM]
andrea@andrea-laptop[IEEGDJQ01.sff_SCAPHOIDEUS] ls [12:16PM]
IEEGDJQ01.fasta IEEGDJQ01.fasta.qual IEEGDJQ01.sff IEEGDJQ01.xml
andrea@andrea-laptop[IEEGDJQ01.sff_SCAPHOIDEUS] sff_extract IEEGDJQ01.sff
Working on 'IEEGDJQ01.sff':
Converting 'IEEGDJQ01.sff' ... done.
Converted 127860 reads into 127860 sequences.


WARNING: weird sequences in file IEEGDJQ01.sff

After applying left clips, 85069 sequences (=67%) start with these bases: A

This does not look sane.

Countermeasures you probably must take:
1) Make your sequence provider aware of that problem and ask whether this can be corrected in the SFF.
2) If you decide that this is not normal and your sequence provider does not react, use the --min_left_clip of sff_extract. (Probably '--min_left_clip=6' but you should cross-check that)
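The warning asks for a cross-check, so here is a minimal sketch of one way to do it: count which base each FASTA record starts with. The 67% in the warning is 85069 of the 127860 converted reads. The function name `first_base_counts` and the demo records are made up for illustration, and whether a FASTA file reproduces the figure depends on how the clipping was applied when the file was written.

```python
from collections import Counter
import io


def first_base_counts(handle):
    """Count the first base of every record in a FASTA text handle."""
    counts = Counter()
    expect_sequence = False
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # Header line: the next non-blank line starts the sequence.
            expect_sequence = True
        elif line and expect_sequence:
            counts[line[0].upper()] += 1
            expect_sequence = False
    return counts


# Hypothetical demo records; replace with open("IEEGDJQ01.fasta") for real data.
demo = io.StringIO(">r1\nACGT\n>r2\nAGGT\nTTTT\n>r3\ncCGT\n")
counts = first_base_counts(demo)
total = sum(counts.values())
for base, n in counts.most_common():
    print(base, n, "%.0f%%" % (100.0 * n / total))
```

If one base dominates in the real file the way "A" does in the warning, that supports looking at --min_left_clip as the message suggests.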


THE PROBLEM IS LATER WHEN I TRIM

andrea@andrea-laptop[IEEGDJQ01.sff_SCAPHOIDEUS] trim_by_case IEEGDJQ01.fasta [12:22PM]
Traceback (most recent call last):
  File "/usr/local/bin/trim_by_case", line 25, in <module>
    from crumbs.trim import TrimLowercasedLetters, TrimOrMask, seq_to_trim_packets
  File "/usr/local/lib/python2.7/dist-packages/crumbs/trim.py", line 32, in <module>
  File "/usr/local/lib/python2.7/dist-packages/crumbs/pairs.py", line 18, in <module>
ImportError: cannot import name _index

I do not know how to handle the ImportError...

JoseBlanca commented 10 years ago

Hi Andrea,

You shouldn't have to deal with the ImportError; we should. I've run the current version of trim_by_case and haven't been able to reproduce the problem, so in order to fix it I need some more information. Could you send us the error file that trim_by_case has generated? Could you take a look at the head of IEEGDJQ01.fasta and check that it is OK? By the way, are you sure that you're dealing with a fasta file? sff_extract will create fastq files and not fasta files.

Best,

Jose Blanca
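For anyone following the thread, checking the head of the file and telling fasta from fastq could be sketched roughly as below. This is only an illustration and not part of seq_crumbs: the helper names `sniff_sequence_format` and `head` are made up, and the check relies solely on the convention that fasta records start with ">" and fastq records start with "@".

```python
import io


def sniff_sequence_format(handle):
    """Guess 'fasta' or 'fastq' from the first non-blank line of a text handle."""
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip leading blank lines
        if line.startswith(">"):
            return "fasta"
        if line.startswith("@"):
            return "fastq"
        return "unknown"
    return "empty"


def head(handle, n=4):
    """Return the first n lines of a text handle without trailing newlines."""
    lines = []
    for line in handle:
        lines.append(line.rstrip("\n"))
        if len(lines) == n:
            break
    return lines


# Hypothetical in-memory examples; use open("IEEGDJQ01.fasta") for the real file.
fasta_demo = ">read1\nACGTACGT\n>read2\nTTGCA\n"
fastq_demo = "@read1\nACGT\n+\nIIII\n"
print(sniff_sequence_format(io.StringIO(fasta_demo)))
print(sniff_sequence_format(io.StringIO(fastq_demo)))
print(head(io.StringIO(fasta_demo), n=2))
```

A result of "unknown" or "empty" would itself be useful information to post back here.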