Open hagarelsayed opened 4 years ago
cat counts.txt | cut -f 1,7-12 > simple_counts.txt
Simple counts produced simple_counts.txt
All values in the table are zero, possibly because of the low alignment rate against the small genome.
When running the differential expression analysis with DESeq:
cat simple_counts.txt | Rscript deseq1.r 3x3 > results_deseq1.tsv
The following error came up:
Error in parametricDispersionFit(means, disps) :
Parametric dispersion fit failed. Try a local fit and/or a pooled estimation.
The error may be because the values in the count matrix are zero, which was a result of aligning to only a small portion of the reference genome. The next step is to go back to the alignment step and index another genome.
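To test the all-zero hypothesis before rerunning DESeq, a quick check of the count table can help. The sketch below is only an illustration: the helper name `count_nonzero_rows` is hypothetical, and it assumes the file is tab-separated with the gene ID in column 1, an optional `#` comment line, and a header row starting with `Geneid` (the usual featureCounts layout).

```shell
# Hypothetical helper: print how many genes have at least one non-zero count.
# Assumes tab-separated input: column 1 = gene ID, remaining columns = counts.
count_nonzero_rows() {
    awk -F'\t' '
        /^#/ || $1 == "Geneid" { next }          # skip comment and header lines
        {
            for (i = 2; i <= NF; i++)
                if ($i + 0 > 0) { n++; break }   # count each gene at most once
        }
        END { print n + 0 }                      # print 0 when nothing matched
    ' "$1"
}

# Usage (file name taken from the steps above):
# count_nonzero_rows simple_counts.txt
```

If this prints 0 or only a handful of genes, a failed dispersion fit is a plausible consequence, since there is too little signal to fit a trend.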
Data Subsetting to an even smaller size
Preparing Environment
Subset a small number of reads (for testing only):
for file in ./*.fastq.gz; do echo "$file"; seqtk sample -s100 "$file" 500 > "${file%.fastq.gz}.fastq"; done
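Because the same seed (`-s100`) is used for every file, the two mates of each pair stay in sync. As a sanity check on the subsampling, the read count of each output file can be verified. This is a minimal sketch; `count_fastq_reads` is a hypothetical helper and it assumes plain four-line FASTQ records with no wrapped sequence lines.

```shell
# Hypothetical helper: number of reads in an uncompressed FASTQ file,
# assuming exactly four lines per record.
count_fastq_reads() {
    awk 'END { print NR / 4 }' "$1"
}

# Usage: each subsampled file should report 500.
# for f in ./*.fastq; do echo "$f: $(count_fastq_reads "$f")"; done
```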
Alignment
Choose ERCC to work on
Results of the first set:
500 reads; of these:
  500 (100.00%) were paired; of these:
    478 (95.60%) aligned concordantly 0 times
    22 (4.40%) aligned concordantly exactly 1 time
    0 (0.00%) aligned concordantly >1 times
4.90% overall alignment rate

500 reads; of these:
  500 (100.00%) were paired; of these:
    481 (96.20%) aligned concordantly 0 times
    19 (3.80%) aligned concordantly exactly 1 time
    0 (0.00%) aligned concordantly >1 times
4.30% overall alignment rate

500 reads; of these:
  500 (100.00%) were paired; of these:
    487 (97.40%) aligned concordantly 0 times
    13 (2.60%) aligned concordantly exactly 1 time
    0 (0.00%) aligned concordantly >1 times
2.90% overall alignment rate
Code for second set:
Results of the second set:
500 reads; of these:
  500 (100.00%) were paired; of these:
    487 (97.40%) aligned concordantly 0 times
    12 (2.40%) aligned concordantly exactly 1 time
    1 (0.20%) aligned concordantly >1 times
2.90% overall alignment rate

500 reads; of these:
  500 (100.00%) were paired; of these:
    486 (97.20%) aligned concordantly 0 times
    14 (2.80%) aligned concordantly exactly 1 time
    0 (0.00%) aligned concordantly >1 times
3.30% overall alignment rate

500 reads; of these:
  500 (100.00%) were paired; of these:
    485 (97.00%) aligned concordantly 0 times
    14 (2.80%) aligned concordantly exactly 1 time
    1 (0.20%) aligned concordantly >1 times
3.20% overall alignment rate
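To compare samples at a glance, the overall rate can be pulled out of each alignment summary. The sketch below assumes each summary was saved to its own log file (the `*.log` names are hypothetical, since the alignment commands are not shown here) and that the rate appears on a line of the form `N% overall alignment rate`.

```shell
# Hypothetical helper: print the overall alignment rate (without the % sign)
# from an aligner summary saved to a log file.
overall_rate() {
    awk '/overall alignment rate/ { sub(/%/, "", $1); print $1 }' "$1"
}

# Usage (log file names are an assumption):
# for log in ./*.log; do printf '%s\t%s\n' "$log" "$(overall_rate "$log")"; done
```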
Quantification
The following error came up:
Failed to open the annotation file /home/ngs/workdir/diff_exp/ref/ERCC92.gtf, or its format is incorrect, or it contains no 'exon' features
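The message names three possible causes: the file cannot be opened, its format is wrong, or it has no `exon` features. A rough way to narrow this down is sketched below. `check_gtf` is a hypothetical helper, and it assumes a standard tab-separated GTF where column 3 holds the feature type. Which of the three causes applied here is not confirmed.

```shell
# Hypothetical helper: report "unreadable" if the file cannot be read,
# otherwise print the number of 'exon' features (GTF column 3).
check_gtf() {
    gtf="$1"
    if [ ! -r "$gtf" ]; then
        echo "unreadable"
        return 1
    fi
    awk -F'\t' '!/^#/ && $3 == "exon" { n++ } END { print n + 0 }' "$gtf"
}

# Usage (path copied from the error message above):
# check_gtf /home/ngs/workdir/diff_exp/ref/ERCC92.gtf
```

A result of 0 would point to missing `exon` features or a non-tab-separated file, while "unreadable" would point to a wrong path or permissions.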
The reference annotation was then changed to:
GTF=~/workdir/sample_data/gencode.v29.annotation.gtf
featureCounts then ran smoothly and produced the output results.
The results could not be uploaded to git but can be found at this link:
Results of Quantification