cmagnabosco / ea-utils

Automatically exported from code.google.com/p/ea-utils

fastq-mcf not working correctly on ubuntu 14.04 #35

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. fastq-mcf trim on Ubuntu 12.04:
kathryn@sol:~$ fastq-mcf -o SRR652093.fastq.clip -x 0 -k 0 -l 20 -L 40 -q 2 --qual-mean 20 adaptors.fa SRR652093.fastq
Command Line: -o SRR652093.fastq.clip -x 0 -k 0 -l 20 -L 40 -q 2 --qual-mean 20 adaptors.fa SRR652093.fastq
Scale used: 2.2
Phred: 33
Threshold used: 107 out of 42564
Adapter 5'_rev (TTTCAGGTGCCTACGATCATGCTGATGGCGCGAGGGAGGC): counted 38101 at the 'end' of 'SRR652093.fastq', clip set to 1
Adapter 3'_for (CATGATTGATGGTGCCTACAG): counted 41709 at the 'start' of 'SRR652093.fastq', clip set to 1
Files: 1
Total reads: 42564
Too short after clip: 6070
Filtered on quality: 980
Clipped 'start' reads: Count: 35988, Mean: 20.97, Sd: 0.86
Clipped 'end' reads: Count: 27073, Mean: 36.90, Sd: 7.98

2. Output from the same command, same adaptor file and same fastq file run on Ubuntu 14.04:
kathryn@kathryn-linux:/local/work$ fastq-mcf -o SRR652093.fastq.clip -x 0 -k 0 -l 20 -L 40 -q 2 --qual-mean 20 adaptors.fa SRR652093.fastq
Command Line: -o SRR652093.fastq.clip -x 0 -k 0 -l 20 -L 40 -q 2 --qual-mean 20 adaptors.fa SRR652093.fastq
Scale used: 2.2
Phred: 33
Threshold used: 107 out of 42564
Adapter 5'_rev (TTTCAGGTGCCTACGATCATGCTGATGGCGCGAGGGAGGC): counted 38101 at the 'end' of 'SRR652093.fastq', clip set to 1
Files: 1
Total reads: 42564
Too short after clip: 1
Filtered on quality: 38
Clipped 'end' reads: Count: 32928, Mean: 36.75, Sd: 8.03

3. Check that the missing adaptor is in the fastq file:
kathryn@kathryn-linux:/local/work$ grep -c '^CATGATTGATGGTGCCTACAG' SRR652093.fastq
41068
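
The grep check above counts every line that begins with the adaptor sequence, whatever kind of line it is. A stricter variant is sketched below: it looks only at the sequence line of each record, assuming plain four-line FASTQ records (no wrapped sequences). The function name `count_reads_with_prefix` and the two sample records are made up for illustration and are not part of ea-utils.

```python
import io


def count_reads_with_prefix(handle, prefix):
    """Count FASTQ records whose sequence line starts with `prefix`.

    Assumes four lines per record: header, sequence, '+', quality.
    """
    count = 0
    for i, line in enumerate(handle):
        # The sequence is the second line (index 1) of every 4-line record.
        if i % 4 == 1 and line.startswith(prefix):
            count += 1
    return count


# Two made-up records: only the first one starts with the adaptor.
sample = io.StringIO(
    "@1\nCATGATTGATGGTGCCTACAGATCAG\n+\nhhhhhhhhhhhhhhhhhhhhhhhhhh\n"
    "@2\nATCAGCATGATTGATGGTGCCTACAG\n+\nhhhhhhhhhhhhhhhhhhhhhhhhhh\n"
)
adaptor = "CATGATTGATGGTGCCTACAG"
n_start = count_reads_with_prefix(sample, adaptor)
print(n_start)
```

For a real file, passing `open("SRR652093.fastq")` instead of the in-memory sample should give a count comparable to the grep result.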

What is the expected output? What do you see instead?
I would expect the outputs on the two systems to be the same.  However, on the 
more recent version of Ubuntu the adaptor at the 'start' of the reads is not 
found or trimmed.
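
One way to make this comparison mechanical is to pull the "Adapter ... counted N at the 'start'/'end'" lines out of each run's summary and diff them. The sketch below assumes the summary format stays as shown in the two transcripts above; the regular expression and the names `ADAPTER_RE` and `detected_adapters` are illustrative and not part of fastq-mcf.

```python
import re

# Matches lines such as:
#   Adapter 3'_for (CATGATTGATGGTGCCTACAG): counted 41709 at the 'start' of ...
ADAPTER_RE = re.compile(
    r"Adapter\s+(?P<name>\S+)\s+\((?P<seq>[ACGTN]+)\):\s+"
    r"counted\s+(?P<count>\d+)\s+at\s+the\s+'(?P<end>start|end)'"
)


def detected_adapters(log_text):
    """Map (adapter name, read end) -> count for one fastq-mcf summary."""
    # Collapse all whitespace so summaries wrapped across lines still match.
    flat = " ".join(log_text.split())
    return {(m["name"], m["end"]): int(m["count"])
            for m in ADAPTER_RE.finditer(flat)}


# Adapter lines copied from the two runs reported in this issue.
log_1204 = """
Adapter 5'_rev (TTTCAGGTGCCTACGATCATGCTGATGGCGCGAGGGAGGC): counted 38101 at the
'end' of 'SRR652093.fastq', clip set to 1
Adapter 3'_for (CATGATTGATGGTGCCTACAG): counted 41709 at the 'start' of
'SRR652093.fastq', clip set to 1
"""
log_1404 = """
Adapter 5'_rev (TTTCAGGTGCCTACGATCATGCTGATGGCGCGAGGGAGGC): counted 38101 at the
'end' of 'SRR652093.fastq', clip set to 1
"""

found_1204 = detected_adapters(log_1204)
found_1404 = detected_adapters(log_1404)

# Adapters detected on 12.04 but absent from the 14.04 run.
missing = set(found_1204) - set(found_1404)
print(sorted(missing))
```

Applied to the two runs above, the difference is the 3'_for adaptor at the 'start' of the reads, which matches the symptom described here.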

What version of the product are you using? On what operating system?
I have tested 1.1.2-686 and 1.1.2-780.  Both work as expected on Ubuntu 10.04 
and 12.04 but not 14.04.

Please provide any additional information below.
I have tried this with other data sets and it is consistently the adaptor at 
the 'start' of the sequence that is not recognised or clipped correctly on 
14.04.

I compiled these packages on all 3 versions of Ubuntu myself as follows:
download tarball
tar xzvf <package>
cd <package dir>
make all

For 1.1.2-780 only before the make all:
cd sparsehash-2.0.2
./configure
sudo make install
cd <package dir>

If there are any other instructions for compiling on 14.04 that I missed, I 
apologise!

Original issue reported on code.google.com by crouc...@gmail.com on 3 Sep 2014 at 4:19

GoogleCodeExporter commented 9 years ago
need to attach some data or else i can't debug/fix this.  when i try to 
simulate the issue, everything works fine.

Original comment by earone...@gmail.com on 3 Sep 2014 at 8:53

GoogleCodeExporter commented 9 years ago
(probably like 10 or so reads + the adaptors.fa you are using)

Original comment by earone...@gmail.com on 3 Sep 2014 at 8:53

GoogleCodeExporter commented 9 years ago
Here's what I just tried on Ubuntu 14.04 ... worked fine.   So it's not a 
trivial case.

Scale used: 2.2
Phred: 64
Threshold used: 1 out of 4
Adapter 3p_for_test (CATGATTGATGGTGCCTACAG): counted 4 at the 'start' of 'test5.fq', clip set to 1
Files: 1

---cut---
@1
CATGATTGATGGTGCCTACAGATCAGCTAGGCATCGATATATCGATCGGCTAGAGATATACGATCGAT
+
hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
@2
CATGATTGATGGTGCCTACAGATCAGCTAGGCATCGATATATCGATCGGCTAGAGATATACGATCGAG
+
hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
@3
CATGATTGATGGTGCCTACAGATCAGCTAGGCATCGATATATCGATCGGCTAGAGATATACGATCGAG
+
hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
@4
CATGATTGATGGTGCCTACAGATCAGCTAGGCATCGATATATCGATCGGCTAGAGATATACGATCGAC
+
hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh

Original comment by earone...@gmail.com on 3 Sep 2014 at 9:18

GoogleCodeExporter commented 9 years ago
also, i fixed the makefile so you don't need to make sparsehash first

Original comment by earone...@gmail.com on 3 Sep 2014 at 9:19

GoogleCodeExporter commented 9 years ago
Apologies - please find attached the adaptors.fa and the first 10 reads of the 
fastq file that I am using.  I verified that I am still seeing the same problem 
with the cut down file.

If you want to play with the whole file, it is available in the NCBI sequence 
read archive - the accession number is the file name.

I have seen this problem with every file I have tried, including test examples 
similar to yours above so I am wondering if it is something screwy with my 
installation of 14.04 - I can't think what though as it is a standard 
installation and up to date, and I didn't see any unusual warnings or anything 
when compiling on 14.04.  If it makes any difference, all the installations of 
Ubuntu I am testing here have 64-bit architecture.

Original comment by crouc...@gmail.com on 4 Sep 2014 at 9:16

GoogleCodeExporter commented 9 years ago
ok! i can reproduce this on a 64-bit virtualbox... i had to update my bios, 
enable hardware acceleration, and update the number of cores to get it to 
break.   i added a new test for it, which nicely passes under other 
versions/instances of ubuntu

Original comment by earone...@gmail.com on 4 Sep 2014 at 3:09

GoogleCodeExporter commented 9 years ago
Ha ha!  Yes, the 14.04 machine is less than a year old, and has 8 cores.  Maybe 
I should just trim on my laptop instead! Glad you have reproduced the problem - 
I was beginning to think that I was just doing something stupid :-)

Original comment by crouc...@gmail.com on 4 Sep 2014 at 3:21

GoogleCodeExporter commented 9 years ago
This has been fixed.   I deprecated the 780 release, and added a new release 
806, which has a) the fix and b) the test for the fix.

Original comment by earone...@gmail.com on 4 Sep 2014 at 3:47

GoogleCodeExporter commented 9 years ago
Great, thanks!

Original comment by crouc...@gmail.com on 4 Sep 2014 at 4:06