dimikalen / ea-utils

Automatically exported from code.google.com/p/ea-utils

Inconsistent fastq-mcf between servers #14

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Hi,

If I run the following command on two different servers (m1 and m2):

fastq-mcf -q 30 --qual-mean 25 /ngs/transcript/adap_mint_Illum_3.fas \
  PRI_ALZH_CrIT_ACAGTC_L002_R1_005.fastq.gz \
  PRI_ALZH_CrIT_ACAGTC_L002_R2_005.fastq.gz \
  -o PRI_ALZH_CrIT_ACAGTC_L002_R1_005_trim.fastq \
  -o PRI_ALZH_CrIT_ACAGTC_L002_R2_005_trim.fastq

I get two different results: m1 gives the expected output, while m2 completely 
misses some adapters. I compiled the same ea-utils version 
(ea-utils.1.1.2-537) on both machines, and I am using exactly the same files.

One example for the following adapter:

>MINTadapterThreeprimeA
AAGCAGTGGTATCAACGCAGAGTACTTTTT

On m1:

[tristan@cabecou test_trim] zgrep -c AAGCAGTGGTATCAACGCAGAGTACTTTTT PRI_ALZH_CrIT_ACAGTC_L002_R1_005_trim.fastq.gz
0

On m2:

[tristan@umr5023-proasellus Sample_PRI_ALZH_CrIT] grep -c AAGCAGTGGTATCAACGCAGAGTACTTTTT PRI_ALZH_CrIT_ACAGTC_L002_R2_005_trim.fastq
79601
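
The same check can be sketched in Python so that it behaves identically on plain and gzipped FASTQ files and looks only at sequence lines. This is a minimal illustration rather than part of ea-utils: the function names are hypothetical, and it assumes strict four-line FASTQ records.

```python
import gzip

# MINTadapterThreeprimeA, as quoted in the report above.
ADAPTER = "AAGCAGTGGTATCAACGCAGAGTACTTTTT"


def open_fastq(path):
    """Open a FASTQ file as text, handling a .gz suffix transparently."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_adapter_reads(path, adapter=ADAPTER):
    """Return (total_reads, reads_containing_adapter).

    Assumes four lines per record; only the second line of each
    record (the sequence) is searched, unlike grep -c, which
    searches every line.
    """
    total = hits = 0
    with open_fastq(path) as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:
                total += 1
                if adapter in line:
                    hits += 1
    return total, hits
```

Note that the two counts above come from different output files (R1 on m1, R2 on m2). Running a count like this on both trimmed files on each machine would compare like with like.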

I guess it is linked to software version differences between the two 
machines. Regarding GCC:

On m1:

[tristan@cabecou test_trim] gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.7/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 
4.7.2-2ubuntu1' --with-bugurl=file:///usr/share/doc/gcc-4.7/README.Bugs 
--enable-languages=c,c++,go,fortran,objc,obj-c++ --prefix=/usr 
--program-suffix=-4.7 --enable-shared --enable-linker-build-id 
--with-system-zlib --libexecdir=/usr/lib --without-included-gettext 
--enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.7 
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu 
--enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object 
--enable-plugin --enable-objc-gc --disable-werror --with-arch-32=i686 
--with-tune=generic --enable-checking=release --build=x86_64-linux-gnu 
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.7.2 (Ubuntu/Linaro 4.7.2-2ubuntu1) 

On m2:

[tristan@umr5023-proasellus Sample_PRI_ALZH_CrIT] gcc -v
Utilisation des specs internes.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.6/lto-wrapper
Target: x86_64-linux-gnu
Configuré avec: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 
4.6.3-1ubuntu5' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs 
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr 
--program-suffix=-4.6 --enable-shared --enable-linker-build-id 
--with-system-zlib --libexecdir=/usr/lib --without-included-gettext 
--enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6 
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu 
--enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object 
--enable-plugin --enable-objc-gc --disable-werror --with-arch-32=i686 
--with-tune=generic --enable-checking=release --build=x86_64-linux-gnu 
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
Modèle de thread: posix
gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) 

Any help would be much appreciated: m2 is our production machine, so our 
pipeline is currently stopped.

Thanks,
--
Tristan Lefebure

Original issue reported on code.google.com by tristan....@gmail.com on 25 Mar 2013 at 9:23

GoogleCodeExporter commented 9 years ago
I have just tried the following: I copied the binary from m1 to m2 and ran 
m1's fastq-mcf on m2, and it works. So something specific to m2 goes wrong 
during compilation (gcc does not produce any error message when compiling 
fastq-mcf).

Original comment by tristan....@gmail.com on 25 Mar 2013 at 10:21

GoogleCodeExporter commented 9 years ago
Also, version ea-utils.1.1.2-484 does not have this problem on m2.

Original comment by tristan....@gmail.com on 25 Mar 2013 at 10:54

GoogleCodeExporter commented 9 years ago
Is it possible for you to post the adapter file and at least a subset of the 
FASTQ files that show the incorrect behavior? I'm running those same Ubuntu 
gcc versions on several boxes.

Original comment by earone...@gmail.com on 28 May 2013 at 5:11

GoogleCodeExporter commented 9 years ago
Have you tried just getting the latest version (r600+)?

Original comment by earone...@gmail.com on 3 Jun 2013 at 1:23

GoogleCodeExporter commented 9 years ago
Hi,
I just tested the latest SVN version and, using the same example, got:

zgrep -c AAGCAGTGGTATCAACGCAGAGTACTTTTT test1.fastq.gz
817

(the expected count is 0, and the raw data gives 92403)

So that's much better. I still wonder why ~800 of the 92403 adapters still 
find their way into the output...
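
For scale, the residual rate implied by these two counts can be worked out directly. The snippet below is only arithmetic on the numbers quoted above; the function name is hypothetical.

```python
def residual_fraction(remaining_hits, raw_hits):
    """Fraction of adapter-containing lines that survive trimming."""
    return remaining_hits / raw_hits


raw_hits = 92403      # adapter matches in the raw data
remaining_hits = 817  # adapter matches after trimming with the SVN version

fraction = residual_fraction(remaining_hits, raw_hits)
print(f"remaining: {fraction:.2%}")      # remaining: 0.88%
print(f"removed:   {1 - fraction:.2%}")  # removed:   99.12%
```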

My adapter file is the following:

>MINTadapterFiveprime
AAGCAGTGGTATCAACGCAGAGTACGGGGG
>MINTadapterThreeprimeA
AAGCAGTGGTATCAACGCAGAGTACTTTTT
>PE1adapter
AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAG
>PE2adapter
ACACTCTTTCCCTACACGACGCTCTTCCGATCT
>PE1PCRprimer
AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
>PE2PCRprimer
CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT
>Illumina_Multiplexing_PCR_Primer_2.01
GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
>C_Illumina_Multiplexing_PCR_Primer_2.01
AGATCGGAAGAGCACACGTCTGAACTCCAGTC
>adapThreep_TAGATG
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGATGATCTCGTATGCCGTCTTCTGCTTG
>adapThreep_CTTGTA
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCTTGTAATCTCGTATGCCGTCTTCTGCTTG
>adapThreep_GGCTAC
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGGCTACATCTCGTATGCCGTCTTCTGCTTG
>adapThreep_ATCACG
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG
>Mint2-Threep-CDS-4M adapter
AAGCAGTGGTATCAACGCAGAGTGGCCGAGGCGGCCTTTTGTTTTTTCTTTTTTTTTTTTT
>C_adapFivep
AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT
>C_MINT2-Fivep-PlugOligo-3M-adapter
CCCCCGGCCGTAATGGCCACTCTGCGTTGATACCACTGCTT
>C_MINT2-Fivep-PlugOligo-1-adapter
CCCGCGTACTCTGCGTTGATACCACTGCTT
>C_MINT1adapterFiveprime
CCCCCGTACTCTGCGTTGATACCACTGCT

Best,

Original comment by tristan....@gmail.com on 3 Jun 2013 at 2:29

GoogleCodeExporter commented 9 years ago
I'm guessing that those adapters are either a) not at the end or b) masked by 
too many mismatches?
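
One way to probe guess a) is to tally where the exact adapter copy sits in each remaining read. The sketch below is a hypothetical helper, not fastq-mcf's own matching logic: it only considers exact matches to the full adapter, so it says nothing about guess b), which would depend on the mismatch settings fastq-mcf actually applies.

```python
from collections import Counter

# MINTadapterThreeprimeA, as quoted in the report above.
ADAPTER = "AAGCAGTGGTATCAACGCAGAGTACTTTTT"


def locate_adapter(sequence, adapter=ADAPTER):
    """Return (start, trailing_bases) for the first exact hit, or None."""
    start = sequence.find(adapter)
    if start < 0:
        return None
    trailing = len(sequence) - (start + len(adapter))
    return start, trailing


def classify_reads(sequences, adapter=ADAPTER):
    """Tally reads by the position of the exact adapter copy."""
    tally = Counter()
    for sequence in sequences:
        hit = locate_adapter(sequence, adapter)
        if hit is None:
            tally["no_exact_match"] += 1
        elif hit[1] == 0:
            # Nothing follows the adapter: it ends the read.
            tally["at_3prime_end"] += 1
        elif hit[0] == 0:
            # Adapter opens the read and is followed by more bases.
            tally["at_5prime_start"] += 1
        else:
            tally["internal"] += 1
    return tally
```

A large "internal" or "at_5prime_start" share among the ~800 remaining reads would be consistent with guess a).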

Original comment by earone...@gmail.com on 3 Jun 2013 at 5:47

GoogleCodeExporter commented 9 years ago
... but that's just a guess. I'd need to see the stderr output as well, and 
would really like to be able to run in debug mode on the original file to 
try to isolate the problematic reads, etc.
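
Isolating the problematic reads into a small file that can be shared might look like the sketch below. It is an assumption-laden illustration: the function names and the `limit` parameter are hypothetical, it expects four-line FASTQ records, and it handles a single file, so a paired-end subset would still need the matching mates pulled from the other file.

```python
import gzip
from itertools import islice

# MINTadapterThreeprimeA, as quoted in the report above.
ADAPTER = "AAGCAGTGGTATCAACGCAGAGTACTTTTT"


def read_records(handle):
    """Yield FASTQ records as lists of four lines each."""
    while True:
        record = list(islice(handle, 4))
        if len(record) < 4:
            return
        yield record


def extract_adapter_records(in_path, out_path, adapter=ADAPTER, limit=1000):
    """Copy up to `limit` records whose sequence contains the adapter.

    Returns the number of records written to out_path.
    """
    opener = gzip.open if str(in_path).endswith(".gz") else open
    written = 0
    with opener(in_path, "rt") as source, open(out_path, "w") as sink:
        for record in read_records(source):
            if adapter in record[1]:
                sink.writelines(record)
                written += 1
                if written >= limit:
                    break
    return written
```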

Original comment by earone...@gmail.com on 3 Jun 2013 at 5:50

GoogleCodeExporter commented 9 years ago
Can't reproduce. No response in 2 months, closing.

Original comment by earone...@gmail.com on 12 Aug 2013 at 1:24