Problems during sequencing

An anonymous user added a new forum post "most transcripts and many genes
have no reads" in thread "most transcripts and many genes have no reads" at
http://fluxcapacitor.wikidot.com/forum/t-409308#post-1301733

Hi everyone,

I have a problem with read simulation. I have defined a .pro file where all
transcripts have very similar abundance.

1:95302304-95320982C    gene_123_iso_1  CDS 697 0.003577    114
1:95308664-95320982C    gene_123_iso_2  CDS 670 0.003138    100
1:203830743-203839678W  gene_1180_iso_1 CDS 942 0.003326    106
1:203832753-203839179W  gene_1180_iso_2 CDS 355 0.003138    100
1:203830731-203839209W  gene_1180_iso_3 CDS 967 0.003169    101
1:203830731-203839212W  gene_1180_iso_4 CDS 532 0.003514    112
1:203830737-203839205W  gene_1180_iso_5 CDS 478 0.002699    86
1:203830978-203834247W  gene_1180_iso_6 CDS 248 0.002730    87
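
For anyone who wants to check such a file programmatically, here is a minimal sketch of reading the six-column rows shown above. The field names (`length`, `fraction`, `molecules`) are my own labels, guessed from the values in this excerpt, and are not taken from the Flux Simulator documentation.

```python
from typing import NamedTuple


class ProRow(NamedTuple):
    """One row of the six-column .pro excerpt (field names are assumptions)."""
    locus: str
    transcript_id: str
    coding: str
    length: int
    fraction: float
    molecules: int


def parse_pro_line(line: str) -> ProRow:
    # Split on any whitespace so tabs and runs of spaces both work.
    f = line.split()
    return ProRow(f[0], f[1], f[2], int(f[3]), float(f[4]), int(f[5]))


# Two rows copied from the excerpt above.
sample = """\
1:95302304-95320982C    gene_123_iso_1  CDS 697 0.003577    114
1:95308664-95320982C    gene_123_iso_2  CDS 670 0.003138    100
"""

rows = [parse_pro_line(line) for line in sample.splitlines() if line.strip()]
for row in rows:
    print(row.transcript_id, row.length, row.fraction, row.molecules)
```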

After library preparation I get read numbers that make me perfectly happy:

1:95302304-95320982C    gene_123_iso_1  CDS 697 0.003577    114 0.00209560367060517 1747
1:95308664-95320982C    gene_123_iso_2  CDS 670 0.003138    100 0.0017033527259641336   1420
1:203830743-203839678W  gene_1180_iso_1 CDS 942 0.003326    106 0.002295927547531938    1914
1:203832753-203839179W  gene_1180_iso_2 CDS 355 0.003138    100 0.001194745996521322    996
1:203830731-203839209W  gene_1180_iso_3 CDS 967 0.003169    101 0.0022443471480837283   1871
1:203830731-203839212W  gene_1180_iso_4 CDS 532 0.003514    112 0.001631380075571283    1360
1:203830737-203839205W  gene_1180_iso_5 CDS 478 0.002699    86  0.0012043423499070354   1004
1:203830978-203834247W  gene_1180_iso_6 CDS 248 0.002730    87  8.348827445570684E-4    696

But after sequencing I get this:

1:95302304-95320982C    gene_123_iso_1  CDS 697 0.003577    114 0.00209560367060517 0.011101079589527092    1746
1:95308664-95320982C    gene_123_iso_2  CDS 670 0.003138    100 0.0017033527259641336   0.0 0
1:203830743-203839678W  gene_1180_iso_1 CDS 942 0.003326    106 0.002295927547531938    0.0 0
1:203832753-203839179W  gene_1180_iso_2 CDS 355 0.003138    100 0.001194745996521322    0.0 0
1:203830731-203839209W  gene_1180_iso_3 CDS 967 0.003169    101 0.0022443471480837283   0.0 0
1:203830731-203839212W  gene_1180_iso_4 CDS 532 0.003514    112 0.001631380075571283    0.0 0
1:203830737-203839205W  gene_1180_iso_5 CDS 478 0.002699    86  0.0012043423499070354   0.0 0
1:203830978-203834247W  gene_1180_iso_6 CDS 248 0.002730    87  8.348827445570684E-4    0.0 0

Most of the transcripts have no reads assigned.
What am I doing wrong?
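
To quantify the problem across the whole file, a small sketch like the following could list every transcript that ends up with zero reads. It assumes each record sits on a single whitespace-separated line and that the last column is the read count after sequencing, as the post-sequencing excerpt above suggests; the function name `transcripts_without_reads` is hypothetical.

```python
def transcripts_without_reads(lines):
    """Return IDs of transcripts whose last column (assumed read count) is 0."""
    missing = []
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            # Skip blank lines and rows that carry no appended columns yet.
            continue
        if int(fields[-1]) == 0:
            missing.append(fields[1])  # second column is the transcript ID
    return missing


# Three records taken from the post-sequencing excerpt above.
sample = [
    "1:95302304-95320982C gene_123_iso_1 CDS 697 0.003577 114 "
    "0.00209560367060517 0.011101079589527092 1746",
    "1:95308664-95320982C gene_123_iso_2 CDS 670 0.003138 100 "
    "0.0017033527259641336 0.0 0",
    "1:203830743-203839678W gene_1180_iso_1 CDS 942 0.003326 106 "
    "0.002295927547531938 0.0 0",
]

missing = transcripts_without_reads(sample)
print(f"{len(missing)} of {len(sample)} transcripts have no reads: {missing}")
```

Running the same function over the complete .pro file would show whether the zero-read transcripts cluster by chromosome, length, or position in the file.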

This is my parameter file:

REF_FILE_NAME   genes.gtf
PRO_FILE_NAME   genes_expr_weak_bias1.pro
LIB_FILE_NAME   genes_expr_weak_bias1.lib
SEQ_FILE_NAME   genes_expr_weak_bias1.bed
GEN_DIR hg19/
NB_MOLECULES    20000000
EXPRESSION_K    -0.6
EXPRESSION_X0   5.0E7
EXPRESSION_X1   9500.0
RT_MIN  10
RT_MAX  10000
FRAGMENTATION   YES
LOAD_CODING YES
LOAD_NONCODING  YES
FILTERING   NO
READ_NUMBER 10000000
READ_LENGTH 75
PAIRED_END  YES
TMP_DIR /tmp/global2/data_sim/
POLYA_SHAPE 2
POLYA_SCALE 300
ERR_FILE_NAME   genes_expr_weak_bias1.err
RT_PRIMER   RANDOM
FRAG_B4_RT  YES
FRAG_MODE   CHEMICAL
FRAG_LAMBDA 500.0
FASTQ   NO
QTHOLD  0.0
FRAG_SIGMA  5.000000e-02
FRAG_THRESHOLD  1.000000e-01
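
In case it helps with comparing settings between runs, here is a minimal sketch that loads a parameter file of this shape into a dictionary. It assumes one `KEY value` pair per line separated by whitespace, as in the listing above; skipping lines that start with `#` is my own assumption and not something the listing shows.

```python
def parse_params(text):
    """Parse 'KEY value' lines into a dict of strings."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # comment handling is assumed
            continue
        parts = line.split(None, 1)  # split on the first whitespace run only
        params[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return params


# A few lines copied from the parameter file above.
sample = """\
NB_MOLECULES    20000000
READ_NUMBER 10000000
READ_LENGTH 75
PAIRED_END  YES
FRAG_SIGMA  5.000000e-02
"""

params = parse_params(sample)
print(int(params["NB_MOLECULES"]), int(params["READ_NUMBER"]))
print(params["PAIRED_END"], float(params["FRAG_SIGMA"]))
```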

I am using an older version of the Flux Simulator (build 20101223), because
the newer versions (4 and 5) died during library generation with a
NullPointerException:

[LIBRARY] Configuration
               Rounds: 15
               Mean: 0.5
               Standard Deviation: 0.1

       Processing Fragments * FAILED
[ERROR] Error while fragmenting : null
java.lang.NullPointerException
       at fbi.genome.sequencing.rnaseq.simulation.fragmentation.Amplification.getGCcontent(Amplification.java:135)
       at fbi.genome.sequencing.rnaseq.simulation.fragmentation.Amplification.process(Amplification.java:100)
       at fbi.genome.sequencing.rnaseq.simulation.fragmentation.Fragmenter.process(Fragmenter.java:545)
       at fbi.genome.sequencing.rnaseq.simulation.fragmentation.Fragmenter.call(Fragmenter.java:245)
       at fbi.genome.sequencing.rnaseq.simulation.SimulationPipeline.call(SimulationPipeline.java:339)
       at fbi.genome.sequencing.rnaseq.simulation.SimulationPipeline.call(SimulationPipeline.java:32)
       at fbi.commons.flux.Flux.main(Flux.java:182)

Any hints on that would also be helpful.

Thanks in advance,
Jonas
Original issue reported on code.google.com by gmicha@gmail.com on 9 Nov 2011 at 8:34